Evaluating Developers

The differences among software developers are far more subtle and detailed than a categorization into poor, good, and great, or even a rating on a 1-10 scale. There are plenty of capable developers, and each one has a unique combination of skills and abilities that you will probably never see again. Understanding a developer’s skills at a useful level of detail is very difficult, and usually goes well beyond what can be gleaned by someone who does not work closely with him or her.

Evaluating developers is a particularly important problem when you’re hiring, staffing or augmenting a project, or back-filling roles. Common tools for evaluation, such as resumes, degrees, and certifications, are not great at capturing detail. I have worked with plenty of people who have great resumes, are well educated, and hold lists of certifications, yet are not capable of doing the jobs they are employed to do. The reference check has the potential to be a good tool, but only if it is treated as more than a formality, and covers more than a developer’s behaviour or cultural fit. Technical tests can be useful if they are designed well, but they can easily become pedantic and detailed without testing what is important.

So just how do you evaluate developers? There’s certainly not enough time in life to work with everyone closely enough to give a confident assessment of their capabilities. If you are hiring, the risk of making a mistake is significant. You are probably better off not hiring at all than hiring someone who is not up to the job.

My advice in one word is: “ask”. Ask the developer. Ask the developer’s colleagues, managers and references. Receiving a resume from an applicant is not you asking. It’s them telling. You need to ask. That’s easy enough for me to say, but how do you do that? How do you get useful data?

One of my current tools for evaluating technical ability is to define a set of meaningful general categories such as ‘database’, ‘web’ and ‘middleware’. These categories will vary with your needs. Break each category into a set of specific items that are relevant to you; in the case of database, some of these items might be ‘SQL’, ‘performance tuning’ and ‘NoSQL’. With a couple dozen or so items, you can now apply ratings. The scale doesn’t matter; use whatever you like. You can rate the developer based on their own opinion or the opinion of others, or even on your own opinion based on time spent working with him or her.
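To make this concrete, here is a minimal sketch of what such a rating sheet might look like in code. The categories, items, and the 1-5 scale are illustrative assumptions, not a prescription:

```python
# A hypothetical skill-rating sheet: categories broken into specific items,
# each rated on an arbitrary scale (1-5 here, but the scale doesn't matter).
ratings = {
    "database": {"SQL": 4, "performance tuning": 2, "NoSQL": 3},
    "web": {"HTML/CSS": 3, "REST APIs": 4, "security": 2},
    "middleware": {"message queues": 2, "caching": 3},
}

# Summarize each category so strengths and gaps stand out at a glance.
for category, items in ratings.items():
    average = sum(items.values()) / len(items)
    weakest = min(items, key=items.get)
    print(f"{category}: average {average:.1f}, weakest item: {weakest}")
```

A spreadsheet works just as well; the point is the structure, not the tooling.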

This categorization tool doesn’t work well in an interview. The session is often too short, and there are too many factors on which you might want to evaluate someone. In interviews, I use other types of tools to get an impression of something very specific. One tool I use is the ‘code smell test’. In this test, I show a potential hire some horribly written code and ask him or her to point out the flaws. The beauty of this test is that I can set different expectations for different experience levels. A new graduate might only find 20% of the problems, while someone with years of experience might find 50%. Each of these outcomes is acceptable. These numbers might seem low, but keep in mind that this is a technical interview, under short time constraints and considerable stress. If a new graduate finds 5% or an experienced developer finds 15%, then you need to worry.

The problems that I inject into the code include glaring runtime exceptions, design problems, inefficiencies, poor style, and incorrect logic. There is always some ambiguity in a test like this: some problems are objectively wrong, while others can be argued. In some cases, interview candidates have pointed out flaws that I did not even intend. I don’t quibble with an interviewee over details.
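Here is a small snippet in the spirit of such a test, not my actual material, with the planted flaws annotated for the reader. In an interview, the code would of course be shown without the comments:

```python
# Hypothetical 'code smell test' material with several planted flaws.

def average(scores):
    total = 0
    for i in range(len(scores)):      # style: should iterate the list directly
        total = total + scores[i]
    return total / len(scores)       # runtime: ZeroDivisionError on an empty list

def find_max(scores):
    best = 0                         # logic: wrong answer for all-negative input
    for s in scores:
        if s > best:
            best = s
    return best

def contains(scores, target):
    return target in sorted(scores)  # inefficiency: sorts just to test membership
```

A strong candidate will also raise arguable points, such as whether `average` should raise or return `None` for empty input, and that discussion is often more revealing than the checklist itself.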

Another tool I use is the programming test. You may have heard of FizzBuzz. I don’t use that one, since it can easily be Googled. It’s safer to create an original test, and it’s a good idea to run it by some developers, or even current employees, to make sure it’s not too hard or too easy.
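Purely as an illustration of the kind of original variant I mean (this particular exercise is invented for this post, not one I actually use), something like this works:

```python
# A made-up FizzBuzz-style exercise: print 1..n, replacing perfect squares
# with "square" and numbers whose digit sum is a multiple of 7 with "lucky".
import math

def digit_sum(n):
    return sum(int(d) for d in str(n))

def quiz(n):
    for i in range(1, n + 1):
        if math.isqrt(i) ** 2 == i:
            print("square")
        elif digit_sum(i) % 7 == 0:
            print("lucky")
        else:
            print(i)

quiz(20)
```

The exercise itself is trivial; what you learn comes from watching how the candidate handles edge cases and the precedence between the two rules.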

A final tool I use in interviews is the ‘pick a question’ tool. I show a list of questions to a candidate and ask him or her to choose one or more to answer or comment on. You can learn a lot from both the responses and the selections the candidate makes. I typically reserve this one for more senior jobs, but there’s no reason it couldn’t be used, or adapted, for entry-level or junior positions. Sometimes you will be surprised: someone will give you an answer you don’t expect, or will answer a question ‘correctly’ even though it is not their particular area of expertise.

If multiple sources give significantly different results, investigate further. If the discrepancy comes from a reference check, perhaps the reference based their rating on an old collaboration. Or, more worryingly, the candidate may be misrepresenting his or her abilities. The more important a particular ability is to the reason you are evaluating the person, the more effort you should put into determining which result is accurate.
