Software Engineer Interview Experiences

Background

So… last Friday (March 24th) was sadly my last day with the Boston Public Library. I start at Akamai tomorrow (March 27th) as a Senior Software Engineer. I will deeply miss the Boston Public Library and the work my colleague and I completed building the Digital Commonwealth system.  But unfortunately all things must end and I felt it was time to move on to a new opportunity. The following are some interview experiences that I believe might be interesting from my recent search.

The New HR Phone Screen Question

A 30-minute initial call from an HR representative has been standard practice for quite some time now. What I did find new was how aggressively one is required to give one’s salary requirements at this stage. Nearly every place I applied to required me to state the salary I was looking for and refused to let me weasel out of giving a number.

From the HR point of view, this is a smart move. Since the interviewee has not interacted with the team yet, the interviewee loses the leverage of being the “first choice” in such a negotiation. It doesn’t matter how amazing a match one is for the job or how much unexpected value above and beyond the job posting one might bring, since those aren’t the HR representative’s primary concern. They are focused on filling X position for Y salary, and they can make the cold calculation to eliminate the candidate up front by forcing this simple question.

I’m quite unsure of how best to handle the move of the salary negotiation from the end of the interview process to the forefront. A talk on negotiating salary was given recently at Code4Lib but lacked any advice on this situation. For example, does one give a low number to stay in the running and then surprise them with a higher requirement at the end? After all, I had a few HR representatives from potential positions thank me for my time once my number was outside of their range, including one job that they likely won’t ever find a better match for. In these cases, they mostly didn’t even bother to ask if I might be willing to work for less than the number I gave. Or does one start off with a high number on the expectation they could offer slightly below that in the end, and just give up on those that one will be eliminated from immediately? Beyond just doing one’s research and being prepared to give a number, I can’t really give further advice on how to handle this.

Assembly Line Programming

Let’s say one had a questionnaire with responses about where people live. If one wanted a report for all those who responded from Massachusetts, one would want to include those that responded with “Massachusetts” obviously. But, due to hierarchy, one would also want to include those that were more specific and answered “Boston” or “Cambridge”. That makes sense, right?
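
To make the location example concrete, here is a minimal sketch in Ruby. The parent_of table, the sample responses, and the method names are all made up for illustration; a real system would presumably load its hierarchy from a gazetteer or controlled vocabulary rather than a hand-written hash.

```ruby
# True when place is the target itself or sits anywhere beneath it.
# parent_of maps each place to its parent (hypothetical data structure).
def within?(parent_of, place, target)
  until place.nil?
    return true if place == target
    place = parent_of[place]
  end
  false
end

# Keep only the responses that fall under the requested location.
def report_for(parent_of, responses, target)
  responses.select { |place| within?(parent_of, place, target) }
end

# Illustrative data only.
parent_of = {
  'Boston'        => 'Massachusetts',
  'Cambridge'     => 'Massachusetts',
  'Massachusetts' => 'United States',
  'Providence'    => 'Rhode Island',
  'Rhode Island'  => 'United States'
}

responses = ['Boston', 'Massachusetts', 'Providence', 'Cambridge']
massachusetts_report = report_for(parent_of, responses, 'Massachusetts')
```

Here massachusetts_report keeps “Boston”, “Massachusetts”, and “Cambridge” and leaves out “Providence”, which is the behavior the report should have.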

So when I had to do a code sample for a medical crowdsourcing site, I quickly saw what I wanted to do. Their site had little knowledge of medical hierarchy, which looked to be an obvious flaw. Those that responded with a symptom of “Myocardial Infarction” (commonly known as Heart Attack) were the only ones being included on a report for that symptom. Those that reported more specific versions of “Heart Attack” like “Non-ST Elevation Myocardial Infarction” would not show up. Similar to the location example above, this seemed to be a problem. Beyond coding up a sample of how to fix this, I came armed to talk about it during the in-person interview.

Yet throughout the in-person interview process, I was the only one who considered this an issue. The argument against it generally boiled down to the fact that they got their requirements from a group of medical scientists. I seriously cannot figure out a counter-argument beyond that, since they could fix their system with about a week of a single developer’s time and make the data much more usable. When I asked further, the conferences that developers there generally attended had nothing to do with the medical field. They were hired to implement checkboxes from a requirements document, handling just the translation from pages of paper to digital ones and zeroes. The ability to understand what it is they are implementing and to take action to make the system work well? Not a desired skill.

Of course, perhaps I’m incorrect in my assessment of how medical hierarchy can be used and there is some valid reason that makes it different from my initial location analogy. I cannot come up with one, and it frustrates me to this day that it makes so much sense to me yet bombed so badly when I talked about it with people at that organization. (There were other things wrong with that place as well, such as being told that making their data shareable wasn’t a priority since others can just deal with their custom format, and that they didn’t like to collaborate with others in the medical world because they viewed themselves as doing things better than everyone else.)

This wasn’t just a one-off situation but rather the norm. Another position I interviewed for would have dealt with anti-virus protection. When I asked the team lead interviewing me what conferences he attended, he seemed confused. I repeated the question and clarified that I meant security conferences to keep up on the latest trends, and he responded that he didn’t attend any. Developers there just don’t go to those. This lack of specialist knowledge really showed in their answer to how their product differed from the competition. Both he and another person I interviewed with touted a single killer feature that they claimed no other product on the market had. Only that feature had already been added to other security products recently, including one suite my girlfriend sells. I didn’t bother to correct their lack of domain knowledge.

This trend towards an assembly line where developers are neither expected nor desired to understand the product is disturbing to me. I will never understand how one can create anything truly exceptional from such a philosophy. Of course, part of that is fueled by my own preferences that bias my opinion, but valid improvements can come from more than just product managers or researchers understanding the domain.

Government Employees Being Viewed As Lazy

In one phone screen, I was asked if I thought I was up to working at a startup. That I would have to “actually do work” and “really get things done” which is different than working for the government.

At another job, when I asked what advice they had for working at their company, an interviewer asked me to remind him where I was currently employed. After I responded that I was employed by the Boston Public Library, he went into a spiel about how it is easy to just “blend in without doing much work” at such a place. In a similar theme to the above, he stated how one “is responsible for making things work since no one else will for you”.

Sitting through both of those talks was nothing short of infuriating. It didn’t help that I could have presented my background better by linking it to terms they knew, a flaw my girlfriend pointed out. She advised that I should have pointed out whenever possible that my position was much like an early stage startup and that my colleague and I had to wear many hats and put in a great deal of work to get things off the ground. Regardless, it seems that the war on the competence of government employees has hit even the liberal bastion of Boston, Massachusetts: the idea that a Software Engineer who chooses to do public service must have landed the only job possible with limited skills, or must just be lazy. By remaining in the public sector, I’m viewed as less competent by private sector peers, which impacts how one will be judged during an interview and will make taking the next step in one’s career more difficult.

Hired.com

Beyond normal interview sites, I did give hired.com a try. I ended up getting four interview offers from the site that I would not have found on my own. This is just a quick note that it seems to work well as an option to try, even though it didn’t end up leading to a job offer for me. (The job I ended up accepting was referred to me by a friend of a friend.)

Accessing Excel Spreadsheet Files for Batch Uploads of Digital Objects:

Since I was asked how we handle reading content from Excel, this is a very quick blog post that goes over what we do for that. First you will need to add the following to your Gemfile and then run bundle install:

gem 'roo', :git => 'https://github.com/roo-rb/roo'
gem 'roo-xls', :git => 'https://github.com/roo-rb/roo-xls.git'

Now that we have the library that will allow us to read the spreadsheet, we can go ahead and set up a variable to hold the content of the spreadsheet. This assumes you have a “sheet_location” variable set that indicates where the file you are trying to read lives (be it uploaded or not) and assigns the content to @worksheet:

# Pick the Roo reader based on the file extension (case-insensitive).
case File.extname(sheet_location).downcase
when '.xlsx'
  @worksheet = Roo::Excelx.new(sheet_location)
when '.xls'
  @worksheet = Roo::Excel.new(sheet_location)
when '.csv'
  @worksheet = Roo::CSV.new(sheet_location)
when '.ods'
  @worksheet = Roo::OpenOffice.new(sheet_location)
else
  # Fail early instead of leaving @worksheet as nil.
  raise ArgumentError, "Unsupported spreadsheet type: #{sheet_location}"
end

@worksheet.default_sheet = @worksheet.sheets.first #Sets to the first sheet in the workbook

The next thing we want to do is grab the header row that has our column headers. At the BPL, this is the third row in the spreadsheet (previous rows are for notes). As such, with the library using “1” as its first row, we would get the header row via the following:

header_row_index = 3
@header_row = @worksheet.row(header_row_index)

From here, I loop through each data row in my spreadsheet (which starts at index 5 for us) and pass that row value along with the header row to a method to process that row. It looks something like:

data_start_row_index = 5
data_start_row_index.upto(@worksheet.last_row) do |index|
  row = @worksheet.row(index)
  if row.present? && @header_row.present?
    begin
      process_a_row(@header_row, row)
    rescue StandardError => e
      # Exception handling for when we encounter bad data...
    end
  end
end
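
One way to fill in the rescue branch is to record which row failed and keep going, so a single bad row does not abort the whole batch. The sketch below does not use roo at all: rows is a plain array of arrays standing in for the worksheet, and process_rows, the sample data, and the handler block are all invented for this example. It also skips rows whose cells are all empty, which is my assumption about what a “blank” row should mean.

```ruby
# Walk the data rows, hand each one to the given block, and collect failures
# instead of aborting. Row numbers are 1-based to match the spreadsheet.
def process_rows(rows, header_row, data_start_row_index)
  errors = []
  data_start_row_index.upto(rows.length) do |index|
    row = rows[index - 1]
    # Skip rows that are missing or contain only empty cells.
    next if row.nil? || row.all? { |cell| cell.nil? || cell.to_s.strip.empty? }
    begin
      yield header_row, row
    rescue StandardError => e
      errors << { row: index, message: e.message }
    end
  end
  errors
end

header = ['title_primary', 'date']
rows = [
  ['notes'], ['more notes'], header, ['(placeholder row)'],
  ['A title', '1999'],   # row 5: fine
  [nil, nil],            # row 6: blank, skipped
  ['', '2001']           # row 7: missing title, recorded as an error
]

processed = []
errors = process_rows(rows, header, 5) do |_header_row, row|
  raise ArgumentError, 'title is required' if row[0].to_s.empty?
  processed << row[0]
end
```

After the run, processed holds the one good title and errors holds a single entry pointing at row 7, which is usually enough to tell whoever supplied the spreadsheet what to fix.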

Now we have each row in our spreadsheet being processed! But… how do we access each individual cell? In our case, our spreadsheet template has over 150 possible headers and having a spreadsheet with every header becomes unwieldy. As such, each one has some combination of potential headers and the order of those headers in the spreadsheet is not guaranteed. So we end up with something like the following to get the value of “title” out of our spreadsheet:

def process_a_row(header_row, row_value)
  # ...
  title = find_in_row(header_row, row_value, 'title_primary')
  # ...
end

Essentially this is calling a method called “find_in_row” from within the “process_a_row” method, passing as a third argument the column header identifier we are using to find that data element. The “find_in_row” method then looks like:

def find_in_row(header_row, row_value, column_identifier)
  # Position of the first header cell equal to the identifier, or nil if absent
  row_pos = header_row.index(column_identifier)
  return nil if row_pos.nil?

  strip_value(row_value[row_pos])
end
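
Because find_in_row searches the header row on every call, a template with 150 possible columns repeats the same search for every field of every row. A possible alternative, sketched below, builds the header-to-position lookup once per spreadsheet. build_header_index and fetch_cell are hypothetical names, and the sketch returns the raw cell rather than passing it through strip_value.

```ruby
# Map each header name to the position of its first occurrence.
def build_header_index(header_row)
  index = {}
  header_row.each_with_index do |header, pos|
    next if header.nil?
    index[header] ||= pos
  end
  index
end

# Return the raw cell under the given header, or nil if the column is absent.
def fetch_cell(header_index, row_value, column_identifier)
  pos = header_index[column_identifier]
  pos.nil? ? nil : row_value[pos]
end

# Illustrative data: one empty header cell and one duplicated header.
header_row = ['title_primary', nil, 'date_created', 'title_primary']
header_index = build_header_index(header_row)
row = ['Map of Boston', 'ignored', '1852', 'duplicate column']

title = fetch_cell(header_index, row, 'title_primary')
```

Like find_in_row, this takes the first matching column when a header appears twice. Whether the extra hash is worth it depends on how large the batches are; for small spreadsheets the simple scan is perfectly fine.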

This has another new method: strip_value. The plan is to move this function into “Bpl_Enrich” in the near future but essentially this is to return our data elements as UTF-8 strings. The code for this looks like:

def strip_value(value)
  return nil if value.nil?

  if value.is_a?(Float)
    value = value.to_s
    # Escaped dot so only a literal trailing ".0" is removed
    value = value.gsub(/\.0\z/, '') #FIXME: Temporary. Bugged as see: https://github.com/roo-rb/roo/issues/86 , https://github.com/roo-rb/roo/issues/133 , https://github.com/zdavatz/spreadsheet/issues/41
  elsif value.is_a?(Integer) # Fixnum was merged into Integer in Ruby 2.4
    value = value.to_i.to_s #FIXME: to_i as otherwise non-existent values cause problems
  end

  # Make sure it is all UTF-8 and not character encodings or HTML tags and remove any carriage returns
  return utf8Encode(value)
end

def utf8Encode(value)
  return HTMLEntities.new.decode(ActionView::Base.full_sanitizer.sanitize(value.to_s.gsub(/\r?\n?\t/, ' ').gsub(/\r?\n/, ' '))).strip
end
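
The float handling exists because spreadsheet libraries often hand back a whole number such as a year as 1999.0. As an alternative to trimming the string with a regular expression, the sketch below checks whether the float is a whole number before converting it. cell_to_string is a hypothetical helper using only core Ruby; it leaves out the HTML tag and entity cleanup that utf8Encode performs, and it collapses runs of whitespace characters into one space, which differs slightly from the gsub calls above.

```ruby
# Convert a spreadsheet cell to a trimmed string, or nil for an empty cell.
def cell_to_string(value)
  return nil if value.nil?

  if value.is_a?(Float) && value.finite? && value == value.to_i
    # Whole-number floats (1999.0) are written without the decimal part.
    value.to_i.to_s
  else
    # Collapse runs of tabs and line breaks into a single space, then trim.
    value.to_s.gsub(/[\r\n\t]+/, ' ').strip
  end
end

year = cell_to_string(1999.0)
ratio = cell_to_string(3.5)
text = cell_to_string(" line one\r\n\tline two ")
```

The finite? check matters because calling to_i on Infinity or NaN raises an error in Ruby; those values simply fall through to the string branch.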

We can now repeat the calls to “find_in_row” within the “process_a_row” method for each of our data elements and insert them into our system as needed. But what do we do for multi-valued fields? We use a delimiter of a double pipe (“||”) to separate values in those cases. For example, if our title was allowed to be multi-valued, we could have the above return “title1||title2”. There is then another helper function to convert that into an array as follows:

def split_with_nils(value)
  if(value == nil)
    return ""
  else
    split_value = value.split("||")
    0.upto split_value.length-1 do |pos|
      split_value[pos] = strip_value(split_value[pos])
    end

    return split_value
  end
end
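
It helps to know how Ruby’s String#split treats empty segments, since that is what keeps the positions lined up between related columns. Empty values in the middle are preserved, but trailing empty values are dropped unless a negative limit is passed. The snippet below only exercises the built-in split; the variable names and sample strings are illustrative.

```ruby
# An empty value between two delimiters survives as an empty string.
middle_gap = 'primary||||alternative'.split('||')
# => ["primary", "", "alternative"]

# A trailing empty value is dropped by default...
trailing_default = 'primary||alternative||'.split('||')
# => ["primary", "alternative"]

# ...but is kept when a negative limit is given.
trailing_kept = 'primary||alternative||'.split('||', -1)
# => ["primary", "alternative", ""]
```

The dropped trailing value is harmless when the list is later read by position, because indexing past the end of a Ruby array returns nil rather than raising.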

Why do I return “” in the nil case? To make processing easier for related pairs of multivalued column fields when indexing, without an extra logic check in the inserts of “process_a_row”. As a full example, assume we have titles of “title1||title2||title3” and title types of “primary||||alternative” (i.e. the second title lacks a type in this made-up example of bad MODS data). You would do something like:

title = find_in_row(header_row, row_value, 'title')
if title.present?
  title_list = split_with_nils(title)
  title_type_list = split_with_nils(find_in_row(header_row, row_value, 'title_type'))
  0.upto title_list.length-1 do |pos|
    @digital_object.descMetadata.insert_title(title_list[pos],title_type_list[pos])
  end
end

Of course, in the insert_title method, you would need to check for blank values so that the empty title type for the second title in our list is not inserted. If no “title_type” column was specified at all because it is an optional field, our loop would still work: indexing into the empty string returned by split_with_nils just gives a blank (nil) value for every title_type as we step through the titles.
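
The pairing logic can be tried without a repository object. In the sketch below, pair_titles is a hypothetical stand-in for the loop above: it returns title and type pairs rather than calling insert_title, and it normalizes a missing or empty type to nil, which is the check the insert method would otherwise need to make.

```ruby
# Pair each title with the type at the same position, using nil when the
# type is empty or the type column was omitted entirely.
def pair_titles(titles, title_types)
  title_list = titles.to_s.split('||').map(&:strip)
  # nil.to_s is "", which splits to an empty list.
  type_list = title_types.to_s.split('||').map(&:strip)

  title_list.each_with_index.map do |title, pos|
    type = type_list[pos]
    [title, (type.nil? || type.empty?) ? nil : type]
  end
end

pairs = pair_titles('title1||title2||title3', 'primary||||alternative')
untyped = pair_titles('title1||title2', nil)
```

With the made-up data from the example, the second title comes back with a nil type, and when no types are supplied every title does.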

I hope this was somewhat useful, not a completely crazy approach, and made some sense. Take care!