Before we begin this week, a quick programming note. For the next month or two at least, this newsletter will go out every other week rather than landing in your inbox every Wednesday.
At some point I’d like to write a summary of what I’ve learned. I haven’t quite gotten this newsletter to the place I’d like it to be, and I still plan to use it as a mechanism to write for an audience on a regular basis.
But for reasons I’m not quite ready to share with the whole world, I need to give myself a little more time between issues to write something that passes my internal bar for interestingness.
***
I tend to squirm a bit when someone asks me what my “superpower” is.
But I think one of them is finding great people. It’s always rewarding to see how far people can go, and where my former colleagues and team members wind up. There’s something special about getting an email from a former intern saying they’ve landed an amazing job on the ML research team at a marquee tech company, or at a cool new startup.
People seem to recognize this, and so I often get asked how I do it. While I think there’s some intuition at play, I do think there are certain steps that can help, however much or little experience you have scouting for great people.
Some of these I’ve written about before. For example, I think consistent, structured interviews are vital for running a process that’s both effective and more equitable. Today, I want to contemplate a certain genre of question that I think needs to be retired: the “tell me about a time…” format that’s a staple of interviews, technical and otherwise.
For my money, these questions screen more for luck and polish than for actual competencies or experience. At best, they’re an indirect way of getting at the information you’re actually looking for.
First and most critically, asking these kinds of questions introduces an unnecessary element of luck into the process. You can wind up testing whether the candidate happens to pick an example that lines up with what you’re screening for, rather than evaluating what you hope to evaluate.
Consider something like “Tell me about a time you had a bug to fix.” From the perspective of the hiring team, there are so many different skills that question could be evaluating. Are you looking for how someone prioritizes their work? How quickly they’re able to evaluate someone else’s code and spot a potential issue? Their ability to cope with pressure? How they communicate under pressure? How they deal with the less glamorous parts of the job, like fixing minor bugs?
If you ask this question and they pick the wrong example, they could be a great candidate who looks bad because you didn’t get the right case. Or you could have a mediocre person who gets lucky, happens to pick an incredibly relevant example, and sails through as a false positive.
Imagine a tiny bug that causes a major service outage. Maybe someone accidentally deleted a zero in a configuration, so you have 10 copies of a service instead of the 100 you need to serve your traffic. If 90% of your traffic is suddenly failing and customers are flooding your inbox with complaints, you shouldn’t be spending time digging through logs and monitoring dashboards to figure out exactly how many people are affected before you triage the issue.
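To make the hypothetical concrete, here’s a minimal sketch of that kind of typo. The service name and numbers are invented for illustration, not drawn from any real incident:

```python
# Hypothetical service config, invented purely for illustration.
intended = {"service": "login-api", "replicas": 100}
deployed = {"service": "login-api", "replicas": 10}  # someone dropped a zero

# With a tenth of the intended capacity, roughly 90% of requests
# have nowhere to go once the surviving instances are saturated.
lost = 1 - deployed["replicas"] / intended["replicas"]
print(f"~{lost:.0%} of traffic failing")  # prints: ~90% of traffic failing
```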
You’re probably going to get a lot more commentary about the struggle to track down where the issue was introduced, or about dealing with justifiably upset users. It might not be immediately obvious that the problem is in your infrastructure (rather than, say, a piece of mangled logic in your source code), even if you know it’s a critical issue. That’s perfect if you’re screening for the ability to rapidly find issues. It’s not so useful if you’re screening for bug prioritization skills. The candidate who happens to pick this case has failed through bad luck, not because they don’t know what they’re doing.
Why not ask in a way that gets more directly at what you care about?
Second, these kinds of questions are easy to game, and so become more of a way to test someone’s polish (as sociologists would describe it) rather than someone’s actual temperament and skills.
The classic example here is a question of the genre, “Tell me about a time you failed and how you handled it.” Only a very clueless candidate is going to pick a spectacular failure they still haven’t really figured out. If you’re savvy, you’ll have cherry-picked an example that shows you made a vanilla mistake and coped with it well. You’ll have rehearsed a polished answer that looks great to an HR team trying to screen for something like a growth mindset or humility (naïvely, in my opinion, if this is how they’re looking for it).
Putting on my hiring manager hat, this isn’t especially useful information for evaluating a person’s skills. It’s a test of their interviewing skills and polish first, and maybe a weak proxy for their ability to learn and reflect second.
Last, these questions evaluate circumstance more than skills and judgement.
Imagine someone is applying for a job at an early-stage startup after spending 10 years at Google. You’re trying to figure out if this amazing machine learning engineer has any aptitude for deploying her own models.
Unless she’s incredibly senior, our hypothetical ML engineer had near-zero ability to influence how Google’s model deployment pipeline worked while she was there. If you ask her, “Tell me how you deployed models,” and don’t like the deployment pattern she describes, you’re judging Google’s ML infrastructure team, not this candidate’s ability to design such a system.
How do we do better?
For once, I’m going to advocate we steal an idea from the world of management consulting and suggest using case exercises. Or, less grandly and on a smaller scale, scenario-based questions.
By way of example, let’s return to our bug triage question from earlier. How might we translate that into a better, scenario-based question?
Most importantly, we have to start by making sure we know what this question is trying to assess. Let’s suppose what we really care about is this person’s ability to triage and communicate prioritization to the rest of the team. You might ask something like this:
Imagine you and your team shipped a bunch of new features for your login system yesterday. Then today, a user got in touch with customer support with a login issue, and you got asked to investigate and help prioritize. What are the first two steps you’d take? Why?
This overcomes the problem of relevance. You’ve steered them into a case where the prioritization is ambiguous. It’s one person complaining. There aren’t any alarm bells going off. Judgement is required. Unless someone gives a truly terrible answer (“let’s go fix the bug immediately,” say), you avoid going through experience or steps that aren’t relevant to what you’re actually trying to evaluate.
The format also makes it more difficult for a savvy candidate to coast on polish. By interrogating priority (not steps in general, but the two most critical first steps) and digging into their justification (the “why”), you get at critical thinking and real understanding.
And it gets past the opportunity problem. You’re not judging the quality of a bug triage system someone else set up and that this person probably had to follow. You’re assessing how they might translate what they’ve learned into practice, free from the framing that steers them to recapitulate what they’re used to.
I sometimes get pushback that questions in this style discount experience. I don’t buy it. More experienced candidates who (critically) have actually internalized and understood what they’ve learned will translate that experience into better answers to these scenario-based questions.
Moreover, if you’re explicitly screening for experience, I’d argue the best way to evaluate experience is to look at it directly. And if you don’t believe someone, a reference check is a far more effective way of evaluating truthfulness than trying to condense months or years of someone’s life into a handful of contrived questions.
If you were hiring a carpenter, you’d get the best sense of her experience by looking at the last house or piece of furniture she built. In terms of evaluating experience, I don’t know what you gain by asking her, “Tell me about a time you built a set of kitchen cabinets.” If you’re asking to assess how well she can read a set of architectural drawings and translate them into kitchen cabinets, I’d suggest you’re better off showing her a set of drawings and asking directly.
I won’t pretend that building a great recruiting process is as easy as nixing questions in the “tell me about a time…” format. Frankly, I think they can make sense sometimes. And asking a good scenario- or case-based question matters as much as switching formats. As with so much else, it’s the synthesis of a lot of other pieces, each built and executed the right way.
But I do think getting into this frame of mind can help a lot. By switching to better questions, you’re forced to figure out what you want and assess it more directly. It’s less a question of luck. Instead of gambling that the candidate chooses the right example, give them a situation that elicits what you care about. It’s less a question of polish and opportunity. Instead of judging a situation they have no control over, give them the space to explore the ideas you care about. It’s a subtle change that I think can make a big difference.
Enjoy this? Have an idea for something you’d like a perspective on? Drop me a line: I’d love to hear from you.