You’ve seen it before. The facilitator says “reveal,” everyone flips their cards, and it’s a wall of 5s. Sometimes 8s. Occasionally a lone dissenting 3 that gets quickly outvoted. The team nods, moves on, and you burn your sprint because the task was actually a 13.
This isn’t incompetence. It’s basic human psychology operating exactly as intended — just in the wrong context.
The Anchoring Effect in Estimation
The anchoring effect is one of the most robust findings in behavioral economics. When people make numerical judgments under uncertainty, the first number they see or hear disproportionately influences their final answer.
In planning poker, this plays out in a specific way. Even though everyone writes their number “secretly,” there are almost always implicit anchors:
- The previous ticket: If your last story was a 5, the current one gets pulled toward 5.
- Who speaks first: Whoever describes the ticket first often embeds an implicit number (“this should be quick” vs. “this touches a lot of systems”).
- Status in the room: Junior devs unconsciously adjust toward what they think the senior engineers will pick.
The act of simultaneously revealing cards is specifically designed to prevent anchoring — but it only solves half the problem. The other half is that people still discuss after the reveal, and discussion under social pressure produces consensus, not accuracy.
Groupthink Isn’t a Bug, It’s a Feature (in the wrong place)
Teams that work well together tend to develop strong cohesion norms. These are useful: shared mental models speed up delivery, reduce conflict, and create psychological safety. But the same mechanism that makes your team cohesive makes them terrible at estimation.
When a dissenting voice raises a 13 in a room full of 5s, the social cost of defending that position is real. The dissenter either has to explain themselves convincingly or quietly revise downward. Most people revise. Not because they changed their technical assessment, but because being the outlier is uncomfortable.
The result: your estimates reflect your team’s social dynamics more than the actual complexity of the work.
What’s Actually Hard to Estimate
There’s a second problem layered on top of the psychological one: story points measure the wrong thing.
Traditional planning poker produces a single number that supposedly represents “complexity” or “effort.” But a ticket can be small-effort, high-risk. Or straightforward technically but uncertain in requirements. A 5-point story that has a 40% chance of blowing up is fundamentally different from a 5-point story you’ve done ten times before.
When you collapse all of this into a single number, you lose information. And when you lose information, estimates converge toward the middle of the deck, because nobody can tell which of those qualities the number is supposed to capture.
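To make that loss of information concrete, here is a minimal sketch comparing two stories that both carry a 5. The 40% figure comes from the example above; the assumption that a blow-up turns the story into a 13, and the helper name `expected_points`, are purely illustrative.

```python
def expected_points(nominal, blowup_probability, blowup_points):
    """Probability-weighted size of a story that may blow up."""
    return (1 - blowup_probability) * nominal + blowup_probability * blowup_points


# Routine story: done ten times before, so assume no realistic blow-up.
routine = expected_points(5, 0.0, 13)

# Risky story: 40% chance it turns into a 13 (the 13 is an assumed figure).
risky = expected_points(5, 0.4, 13)

print(round(routine, 1), round(risky, 1))  # 5.0 8.2
```

Both stories would be logged as a 5, yet under these assumptions the risky one is closer to an 8 on average. The single number hides that gap.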
How Structured Multi-Dimensional Estimation Helps
One approach that addresses both problems is breaking estimation into explicit dimensions before arriving at a final number. Instead of “how big is this ticket?”, you ask four separate questions:
Effort: How much time and resource does this require? Is it three lines or three days?
Risk: What external factors could derail this? Dependencies, third-party APIs, infrastructure changes?
Complexity: How hard is this to understand? Technical complexity and domain complexity are both real.
Uncertainty: How many unknowns are there? Has the team touched this code before? Is the requirement clear?
When team members vote on each dimension separately, something interesting happens. The discussion becomes specific. Instead of “I think it’s a 5,” you get “I put Risk as high because we’re touching the payment service, but Effort is low since it’s just a config change.” That disagreement is useful information. It exposes exactly what different team members are worried about.
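As a rough sketch of how per-dimension votes expose that disagreement, the snippet below summarizes each dimension by its median and its spread (highest vote minus lowest). The 1 to 5 scale, the voter names, and the function names are assumptions made for illustration, not part of any particular tool.

```python
from statistics import median

DIMENSIONS = ("effort", "risk", "complexity", "uncertainty")


def dimension_spread(votes):
    """Return the median and high-low spread of the votes for each dimension."""
    summary = {}
    for dim in DIMENSIONS:
        scores = [ballot[dim] for ballot in votes.values()]
        summary[dim] = {
            "median": median(scores),
            "spread": max(scores) - min(scores),
        }
    return summary


def most_contested(votes):
    """Name the dimension where the team disagrees the most."""
    summary = dimension_spread(votes)
    return max(DIMENSIONS, key=lambda dim: summary[dim]["spread"])


# Hypothetical ballots on an assumed 1 (low) to 5 (high) scale.
votes = {
    "ana": {"effort": 1, "risk": 5, "complexity": 2, "uncertainty": 3},
    "ben": {"effort": 2, "risk": 2, "complexity": 2, "uncertainty": 3},
    "chen": {"effort": 1, "risk": 1, "complexity": 3, "uncertainty": 2},
}

print(most_contested(votes))  # risk
```

Here the team broadly agrees on Effort, Complexity, and Uncertainty, so the conversation can go straight to why one person rated Risk so much higher than the others.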
Making Blind Voting Actually Work
Simultaneous reveal matters, but it’s not enough on its own. A few practices that compound its effectiveness:
Don’t discuss before the first vote. Let everyone form their independent judgment before any discussion. Even a comment like “this seems related to the auth refactor” is an anchor.
Make disagreement the goal, not the problem. When cards differ, that’s the system working. The spread tells you what’s uncertain, what’s risky, and where the team has different mental models of the codebase.
Vote on dimensions before the final number. If your team scores Effort, Risk, Complexity, and Uncertainty independently, the final story point value becomes an aggregate of specific assessments rather than the outcome of a social negotiation.
Don’t run estimation meetings when the team is tired. Fatigue amplifies groupthink: when people are depleted, they default to social consensus faster.
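The practice of deriving the final number from per-dimension votes could look something like the sketch below. Everything specific in it is an assumption: the 1 to 5 voting scale, the equal default weights, the Fibonacci-style deck, and the linear mapping from average score to card. A team would tune all of these to its own history.

```python
from statistics import median

DIMENSIONS = ("effort", "risk", "complexity", "uncertainty")
CARDS = (1, 2, 3, 5, 8, 13, 21)  # assumed Fibonacci-style deck


def aggregate_story_points(ballots, weights=None):
    """Combine per-dimension median votes (1-5 scale) into one card value."""
    weights = weights or {dim: 1.0 for dim in DIMENSIONS}
    medians = {dim: median(b[dim] for b in ballots) for dim in DIMENSIONS}

    # Weighted average of the medians, still on the 1-5 scale.
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    score = sum(weights[dim] * medians[dim] for dim in DIMENSIONS) / total_weight

    # Map 1..5 linearly onto the deck: 1 -> lowest card, 5 -> highest card.
    position = (score - 1) / 4 * (len(CARDS) - 1)
    return CARDS[round(position)]


# Hypothetical blind ballots for one ticket.
ballots = [
    {"effort": 2, "risk": 4, "complexity": 3, "uncertainty": 3},
    {"effort": 2, "risk": 5, "complexity": 3, "uncertainty": 2},
    {"effort": 3, "risk": 4, "complexity": 2, "uncertainty": 3},
]

print(aggregate_story_points(ballots))  # 5
```

Using medians rather than means keeps a single extreme vote from dragging the result, while the per-dimension ballots remain available for the discussion that the final number alone would hide.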
The Uncomfortable Truth About Estimation
Perfect estimates don’t exist. The goal of planning poker isn’t accuracy — it’s calibrated uncertainty. You want to know not just what you think a ticket will take, but how confident you are in that assessment.
A team that says “this is a 3, and we’re very sure” is giving you useful information. A team that says “this is a 3, but we have no idea about the backend implications” is giving you even more useful information. The second team is better at estimation even if they’re wrong more often, because they’re flagging where the risk is.
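One way to check whether stated confidence means anything is to record it next to each estimate and later compare it with how often the estimate held up. The sketch below assumes a hypothetical history of (confidence label, was the estimate accurate) pairs; the labels, the data, and the function name are invented for illustration.

```python
from collections import defaultdict


def calibration_report(history):
    """Return the share of accurate estimates for each stated confidence label."""
    totals = defaultdict(lambda: [0, 0])  # label -> [hits, count]
    for label, was_accurate in history:
        totals[label][0] += int(was_accurate)
        totals[label][1] += 1
    return {label: hits / count for label, (hits, count) in totals.items()}


# Hypothetical record of past tickets.
history = [
    ("very sure", True),
    ("very sure", True),
    ("very sure", True),
    ("very sure", False),
    ("unsure", True),
    ("unsure", False),
]

report = calibration_report(history)
print(report)  # {'very sure': 0.75, 'unsure': 0.5}
```

If "very sure" estimates hold up noticeably more often than "unsure" ones, the team's confidence labels carry real information. If the two rates look the same, the labels are just decoration.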
The same-number problem is ultimately a signal that your team isn’t surfacing uncertainty — they’re hiding it behind consensus. The fix isn’t a better poker deck. It’s a better question.