During a recent company review cycle, you notice that most feedback for each person is pretty uniform -- and, based on their self-reviews, individuals seem pretty self-aware. But as you look more closely, you discover that the actual scores recorded for each review varied greatly based on who was giving them.
You realize that the scores are so different because individual reviewers interpreted the scoring system differently.
In fact, one reviewer (call them Manager B) gave lower scores to their direct report, not as an act of harshness, but out of the very well-intentioned sense that every individual -- not to mention the company itself -- has room to grow. This is one of the potential cons of ratings in performance reviews, and part of the reason that HBR cautions that performance evaluations are biased.
This difference in interpretation makes sense, but now that you’ve identified it, it’s time to figure out how to iron it out. Do you alter your performance review questions, change your scoring system, or do something else?
The simple answer is review calibration. In review calibration, people teams train managers to apply a company-wide standard when reviewing their direct reports’ performance at work. Calibration sessions are meetings held after a review cycle to put that standard into practice.
If scores are part of your performance review, then you might want to set your company up for review calibration, both in the product you’re using and in how you train managers. Here's how to do it.
People teams should give managers a framework for how to score their direct reports. That means three things: picking a scale, explaining what each number means, and figuring out what the distribution of scores may be.
First, you can pick a scale from 1 to 5, 1 to 7, 1 to 10 — whatever works for your company.
Second, define what each point means. It’s likely variations on low, average, and high performance, but it’s important that everyone in the company have the same definitions. For example, all managers should know what a “5” means on the scale you provide -- whether it’s mid-level performance, solid with potential for growth, exceeds expectations, or very high performance.
A note on defining terms: For mid-level performers, be careful using terms like “average” or “medium,” as these terms can influence managers’ scoring styles. If you call the middle number “average,” it can seem like a harsher judgment than it is. Terms like “solid,” “standard,” “potential reached,” and “meets expectations” get to that meaning without the potential negative connotations.
Lastly, ask managers and leadership what they expect the distribution of scores will be -- how many high performers and how many low performers might they expect? When scores are applied consistently, the typical distribution should be some high performers, a few low performers, and mostly mid-level performers.
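To make the framework concrete, here is a minimal sketch in Python of a scale with agreed definitions and an expected distribution. Everything in it is an assumption for illustration: the 1-to-5 scale, the labels, the low/mid/high cutoffs, and the expected shares are hypothetical placeholders that your own leadership would replace.

```python
from collections import Counter

# Hypothetical 1-to-5 scale; the labels are illustrative, not a standard.
SCALE = {
    1: "needs significant improvement",
    2: "partially meets expectations",
    3: "meets expectations",
    4: "exceeds expectations",
    5: "greatly exceeds expectations",
}

# Hypothetical expected share of employees in each band (sums to 1.0):
# a few low performers, mostly mid-level, some high performers.
EXPECTED_SHARE = {"low": 0.10, "mid": 0.70, "high": 0.20}


def band(score):
    """Map a score on the scale to a low/mid/high band (assumed cutoffs)."""
    if score not in SCALE:
        raise ValueError(f"score {score!r} is not on the scale")
    if score <= 2:
        return "low"
    if score == 3:
        return "mid"
    return "high"


def distribution(scores):
    """Return the observed share of scores in each band."""
    if not scores:
        raise ValueError("no scores to summarize")
    counts = Counter(band(s) for s in scores)
    return {name: counts[name] / len(scores) for name in EXPECTED_SHARE}


def distribution_gaps(scores):
    """Observed share minus expected share for each band."""
    observed = distribution(scores)
    return {
        name: round(observed[name] - EXPECTED_SHARE[name], 2)
        for name in EXPECTED_SHARE
    }
```

A large positive gap in the "high" band, for example, does not prove that scores are inflated, but it is a useful prompt for the calibration conversation.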
Make sure to emphasize that managers will have to explain their scoring in calibration sessions after the review.
There are two main ways to calibrate scores after the fact, and they depend on the size of your organization.
In the first approach, people teams meet with all managers to understand the reasoning behind their scores. By gathering a wide range of reasons across departments, they’ll be able to tell whether a manager leans too far toward lenient or strict scoring of their direct reports, or whether they’re usually “just right.” You also want scores adjusted by the people who have the most information on how the company is doing generally, so they have a strong sense of the company’s overall performance distribution.
People teams can also get more information on managers’ scoring styles by discussing these scores with managers’ managers, who typically have a clearer perspective on how that manager will approach scores.
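Before those meetings, a people team could get a rough first read on which managers lean lenient or strict by comparing each manager's average score with the company-wide average. The sketch below assumes scores are already grouped by manager; the function names and the 0.5-point threshold are hypothetical choices, not an established rule.

```python
from statistics import mean


def manager_offsets(scores_by_manager):
    """Return each manager's mean score minus the company-wide mean."""
    all_scores = [s for scores in scores_by_manager.values() for s in scores]
    company_mean = mean(all_scores)
    return {
        manager: mean(scores) - company_mean
        for manager, scores in scores_by_manager.items()
    }


def flag_scoring_styles(scores_by_manager, threshold=0.5):
    """Label each manager "lenient", "strict", or "just right".

    The threshold (in scale points) is an assumed cutoff to tune per company.
    """
    labels = {}
    for manager, offset in manager_offsets(scores_by_manager).items():
        if offset > threshold:
            labels[manager] = "lenient"
        elif offset < -threshold:
            labels[manager] = "strict"
        else:
            labels[manager] = "just right"
    return labels


# Made-up scores for three managers' direct reports.
example = {
    "Manager A": [4, 5, 4],
    "Manager B": [2, 3, 2],
    "Manager C": [3, 4, 3],
}
styles = flag_scoring_styles(example)
```

A label here is only a starting point for discussion: a team can score higher than average because it genuinely performs better, which is exactly what the conversations with managers and their managers are meant to uncover.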
The way to know whether the above works for you is to answer a very simple question: how many meetings do you think your people team can handle?
Of course, the more data you can gather on managers’ scoring styles, the better -- but there is a point at which it no longer makes sense for the people team to hold all of those meetings. Instead, it’s better to form calibration committees, composed of leadership and managers’ managers, who know managers’ managing styles well enough to know how their scores might need to be adjusted.
Also, research has shown that calibration committees have a lasting effect on how managers provide ratings. Having seen how their scores were adjusted, managers tend to be more careful and thoughtful in the next review cycle.
Based on these discussions, and armed with the knowledge that each score has been vetted, the people team can adjust scores accordingly.
The final step of review calibration is using the information gathered both pre- and post-review to change scores accordingly. With an employee performance management system like Lattice, that can be as simple as downloading, tweaking, and uploading these calibrated scores into the system.
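The download, tweak, and upload step might look something like the following sketch. It assumes the scores were exported as a CSV with `employee`, `manager`, and `score` columns; that layout, the `calibrated_score` column, and the function name are hypothetical and do not describe Lattice's actual export format or API.

```python
import csv
import io


def apply_calibration(exported_csv, calibrated_scores, scale=(1, 5)):
    """Add a calibrated_score column to an exported score sheet.

    exported_csv: CSV text with hypothetical columns employee, manager, score.
    calibrated_scores: {employee: new_score} agreed in calibration sessions.
    Employees without an entry keep their original score.
    """
    low, high = scale
    reader = csv.DictReader(io.StringIO(exported_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=[*reader.fieldnames, "calibrated_score"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        new_score = calibrated_scores.get(row["employee"], int(row["score"]))
        if not low <= new_score <= high:
            raise ValueError(f"{new_score} is outside the {low}-{high} scale")
        # Keep the original score alongside the calibrated one for reference.
        row["calibrated_score"] = new_score
        writer.writerow(row)
    return out.getvalue()


exported = "employee,manager,score\nAna,Manager B,2\nBen,Manager A,4\n"
calibrated = apply_calibration(exported, {"Ana": 3})
```

Keeping the original score next to the calibrated one is a design choice in this sketch: it lets the people team show managers how their scores were adjusted.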
Review calibration is an essential step when holding performance review cycles with scoring. It makes managers’ jobs easier, makes employees’ performance evaluations fairer and more honest, and accomplishes the people team’s task of making sure that all performance review scores are based on the same standard of performance.