Every Professional Test Is an Elimination Game. One Sci-Fi Robot Actually Got the Design Right.
The movie opens on a girl being raised underground by a robot.
The last human alive, she’s been told. The robot calls itself Mother. Every few weeks, it administers a test — academic, physical, surgical. The girl practices incisions before she’s a teenager. She works through philosophy problems. She trains her hands to stay steady under pressure.
She doesn’t know that failing means being incinerated.
What she does know — because Mother makes her believe it — is that the work matters. That reaching the next standard is worth the effort. That continuous learning has a point.
Watching I Am Mother, I kept thinking: this is dystopian in its stakes. But the assessment design is more valid than most of the professional tests I’ve encountered in real life.
Growing up, every major exam had one explicit purpose: elimination. Your score determined which tier school you attended. Tier 1, 2, or 3. The test wasn’t measuring what you could do — it was sorting where you’d go. Everyone understood this. The pressure had nothing to do with learning and everything to do with not being filtered out.
Coming to college in the US felt different. Rubric-based, professor-by-professor, more room to breathe. And then the MCAT arrived.
The MCAT is built to gatekeep. It does that reliably. But what it actually tests — reading comprehension speed, test-taking endurance, the ability to eliminate wrong answers under a tight clock — has a complicated relationship with what makes someone a good physician. The qualities that matter most clinically: staying composed when a patient deteriorates, catching the detail that doesn’t fit, knowing when to keep asking questions instead of accepting the obvious answer. None of those surface in a multiple-choice exam.
I’ve seen the same pattern with the LSAT, the bar, and most high-stakes standardized tests I’ve encountered since. Reliable at filtering. Often measuring the wrong thing.
Back to the robot.
Mother’s tests were extreme and the stakes were fictional. But the design intent was coherent: the girl was being trained to become a surgeon and make life-critical decisions under pressure. She was evaluated on performing surgery and making decisions under pressure. The assessment matched the actual job.
That’s not what most professional certification does.
I have a few tech certs. So do most people in this field. And I’ve sat on enough hiring panels to know what they signal: that someone completed a prep course and can navigate multiple-choice questions under a time limit. What they can’t tell you: whether someone can debug a production environment at 2 a.m., read a customer’s architecture and immediately locate the fragile point, or keep a technical conversation together when the demo breaks in front of an executive.
The cert proves you passed the test. It says almost nothing about whether you can do the work.
Organizational psychologists have been making this argument for decades. Work-sample assessments, simulations of the actual job, consistently outperform standardized tests in predicting real job performance. A landmark meta-analysis covering 85 years of selection research found that direct performance samples are among the strongest predictors of on-the-job success. We’ve known this for a long time. The elimination games keep running anyway.
They produce numbers. They sort cleanly. They scale. And they consistently mistake the filter for the thing being filtered — screening out people who are weak at test-taking, not people who are weak at the actual job. They end careers before they start. They teach everyone who goes through them that performance under artificial conditions is what actually counts.
The robot’s method was inhumane. But the design logic — build the assessment around the real work — is exactly what professional evaluation should be aiming for.
Most of it still isn’t.
Today is my birthday. And there’s something about adding a year that makes you look back at every gate — the ones you cleared, the ones you didn’t — and wonder whether any of them measured what actually mattered.
The tests I passed didn’t make me who I am. The moments that did were harder to score: the recital I prepared two years for. The customer call that went completely off-script. The mentor who changed how I thought, not just what I knew.
On a day like today, the question I keep coming back to isn’t what I’ve achieved. It’s whether I’ve been building the kind of competency that actually matters — to this work, to the people around me, to the problems worth solving.
That’s the only assessment worth passing. And I keep thinking about everyone who would have passed it — if only someone had designed it around the actual job.