Increasing Your Accuracy
Evaluate Each Piece Separately
A cornerstone of our hiring philosophy is the Target: a document with a discrete list of Results Expected and Competencies.
Taking the time to create this document has many benefits that we’ve discussed before: getting internal alignment, informing your external communications to candidates, creating a divide-and-conquer plan for your interviewing team, and giving people separate, specific categories to rate the candidate on.
It’s that last point that I want to focus on today. In their excellent book Noise, Kahneman, Sibony, and Sunstein discuss why it’s so important to create a list of what you are looking for when hiring.
Their main point is that if you don’t, you are largely at the mercy of the idiosyncratic views of individual interviewers, each of whom is expected to render a holistic judgment on a candidate in the span of 45 minutes or so of interviewing. This is one of the main reasons the typical interview process is almost worthless.
By dividing the Target and coordinating your data gathering as a team, you can go much deeper in your data elicitation. But there is a subtle corollary here: you must review, evaluate, and discuss each of the Facets separately, prior to rendering your final decision.
Perhaps this sounds obvious (“of course that’s how I would do it”), but that’s not how humans make decisions with their “Type 1” thinking: those judgments are quick, intuitive, and holistic. Essentially, it is very easy to jump to a “yes” or “no” about the candidate as a whole and then fit your ratings across the various Facets to justify that pre-ordained decision.
What you need to do instead is to review your data Facet by Facet and assign a rating according to an agreed-upon calibration scale that you and your teammates are bought into. Then back that rating primarily with objective data, not your subjective impressions of the candidate.
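If your team happens to track scorecards in software (a shared doc or spreadsheet works just as well), here is a minimal sketch, in Python, of what “one rating per Facet, on an agreed scale, backed by data” can look like. The scale anchors and field names are illustrative assumptions, not a prescribed format:

```python
from dataclasses import dataclass

# Illustrative 1-4 calibration scale; these anchors are placeholders,
# not Talgo's actual scale -- substitute whatever your team has agreed on.
CALIBRATION_SCALE = {
    1: "Clearly below the bar",
    2: "Slightly below the bar",
    3: "Meets the bar, with supporting evidence",
    4: "Clearly exceeds the bar, with strong supporting evidence",
}

@dataclass
class FacetRating:
    """One interviewer's rating of one Facet, backed by objective data."""
    facet: str            # a Result Expected or Competency from the Target
    interviewer: str
    score: int            # must be a point on the agreed calibration scale
    evidence: list[str]   # quotes, work samples, reference data -- not impressions

    def __post_init__(self):
        if self.score not in CALIBRATION_SCALE:
            raise ValueError(f"{self.score} is not on the calibration scale")
        if not self.evidence:
            raise ValueError(f"No objective evidence recorded for '{self.facet}'")
```

Requiring a non-empty evidence list is simply the code version of the rule above: back the rating with objective data, not impressions.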
If you don’t discipline yourself (and your team) to do this, you will fall victim to the “halo” effect: you will be tempted to rate candidates high (or low) across the board, either because of bias (you like them) or path dependence (they answered the first question well, you scored them highly on their first Facet, the first interviewer liked them, and so on).
The primary psychological reason for this is that people love to minimize their cognitive dissonance. It “hurts” a little to like a candidate and yet rate them lower on a given Facet. It’s likewise difficult to admit that a candidate you didn’t particularly get along with deserves a high rating in a certain aspect of the role. And there is some internal resistance to having a hodgepodge collection of ratings (some high, some low) for a given candidate. Because it would be much simpler if candidates were just “good” or “bad,” it’s very tempting to start scoring your Facets all together instead of truly separately. This eases your cognitive burden but results in lower accuracy.
In the candidate roundtable discussion (where you ultimately collaborate as a team to reach a decision), it’s crucial to structure the conversation around the individual ratings, one Facet at a time. Otherwise the conversation will immediately drift towards a holistic “yes” or “no” lens, and interviewers with dissenting data (especially junior interviewers) will start soft-pedaling their stance so as not to “rock the boat.” Going down the list one rating at a time helps minimize this threat of groupthink.
(Obviously this is most critical on candidates that have some chance of being hired—if someone scored 1’s and 2’s across the entire Target, no discussion is necessary.)
Here’s the checklist to make sure you and your team don’t fall into that extremely common trap (a short scorecard sketch follows the list):
Do you have a Target that the team is aligned on and that everyone understands?
Does each interviewer know their specific role and the key Facets they are focused on?
When reviewing the notes, does each interviewer score each of their individual Facets separately from one another?
Do they have to back up their ratings with objective data and score them against a calibration scale?
Are teammates empowered to call them out if they don’t?
Do you ensure that people cannot see each other’s ratings until all ratings have been submitted?
During candidate roundtables, do you structure the conversation to discuss each individual Facet before moving to the larger group discussion/vote about whether or not to hire them?
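If it helps to see the last three checklist items in one place, here is a rough continuation of the earlier sketch (again an illustration, not a prescribed tool): ratings stay hidden until every assigned interviewer has submitted, and the roundtable review then walks the Target one Facet at a time.

```python
from collections import defaultdict

class Scorecard:
    """Hypothetical scorecard for one candidate; 'rating' below is a
    FacetRating from the earlier sketch."""

    def __init__(self, assignments: dict[str, set[str]]):
        # Divide-and-conquer plan: each Facet mapped to the interviewers covering it.
        self.assignments = assignments
        self._ratings = defaultdict(dict)   # facet -> {interviewer: rating}

    def submit(self, rating) -> None:
        if rating.interviewer not in self.assignments.get(rating.facet, set()):
            raise ValueError(f"{rating.interviewer} is not assigned to '{rating.facet}'")
        self._ratings[rating.facet][rating.interviewer] = rating

    def all_submitted(self) -> bool:
        return all(
            set(self._ratings.get(facet, {})) == owners
            for facet, owners in self.assignments.items()
        )

    def roundtable(self):
        # Ratings stay hidden until everyone assigned has submitted, then the
        # discussion goes Facet by Facet -- never straight to a holistic yes/no.
        if not self.all_submitted():
            raise RuntimeError("Hidden until all assigned interviewers have submitted")
        for facet in self.assignments:
            yield facet, self._ratings[facet]
```

The guard in roundtable() serves the same purpose as the checklist: nobody sees, or gets anchored by, anyone else’s numbers before their own are in.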