The code compiled. The tests passed. And it was still wrong.
I spent the morning reading two hundred lines that AI had written in a minute. They looked good. Suspiciously good. The structure was clean, the names made sense, everything in its place. It took me an hour to find where it was quietly lying: a single assumption in the edge-case handling that would blow up on exactly the data that never makes it into the tests but always makes it into production.
Turns out I'm not alone in that feeling. In recent developer surveys, roughly two thirds of respondents say what aggravates them most is not AI that's obviously wrong, but AI that's "almost right, but not quite." And that number says something far more important than it appears to.
Dangerous code doesn't look wrong
There's a difference between a bug that screams and a bug that whispers.
Old bad code was easy to spot. It didn't compile, it broke loudly, it turned the logs red. The new "almost right" code does the opposite: it looks entirely plausible. It slips past a tired reviewer's eye. It passes the tests that cover the obvious. And it waits. Then it blows up at three in the morning, at the edge of some case nobody thought of.
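To make the whisper concrete, here is a minimal Python sketch. It is not the code from that morning; the batching scenario and both function names are invented purely for illustration. The first version reads cleanly and passes the obvious test. It also silently drops data whenever the input doesn't divide evenly.

```python
# Hypothetical example: split a list of records into fixed-size batches.

def batches_almost_right(items, size):
    """Looks plausible. Assumes len(items) is a multiple of size."""
    # The "- size + 1" quietly skips any trailing partial batch.
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]


def batches(items, size):
    """Keeps the trailing partial batch instead of dropping it."""
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    tidy = list(range(6))    # the kind of data that ends up in tests
    messy = list(range(7))   # the kind of data that ends up in production

    # Both versions agree on the tidy input, so the obvious test is green.
    assert batches_almost_right(tidy, 3) == batches(tidy, 3)

    # On the messy input, no exception, no red log line: a record just vanishes.
    print(batches_almost_right(messy, 3))
    print(batches(messy, 3))
```

Nothing here crashes, and that's the point. The only way to catch it is for someone to ask what happens when the input isn't a neat multiple of the batch size.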
The machine is brilliant at manufacturing plausibility. That's precisely why it's more dangerous, not less. The research confirms it in an uncomfortable way: even the strong models rarely see more than half of real debugging tasks through to the end. They write the code confidently. They don't necessarily understand what they wrote.
Value has shifted from writing to judgment
If we take this seriously, something important follows for how we hire.
Until recently, the valuable engineer was the one who wrote fast and clean. Today typing is a commodity; the machine does it cheaper than anyone. The valuable engineer now is the one who knows when the machine is lying. The one who holds in their head how the system actually works, not just how it looks; who senses when something "smells" wrong before they can explain why.
That's a skill of a different order. It isn't about syntax, it's about thinking at the level of the system. Why this architecture exists. What happens when this assumption isn't true. Where it will hurt when the load doubles. These questions don't get automated, because they demand understanding, not pattern recognition.
That's why we hire skeptics
We look for people with healthy skepticism. Skepticism toward AI, of course. But also toward us, and toward their own first instincts.
I don't want an engineer who accepts the machine's suggestion because it looks good and saves time. I want an engineer who reads the two hundred lines, finds the quiet lie, and says: "this compiles, but it's wrong, and here's why." In a world where anyone can generate a plausible solution in seconds, the rare and valuable thing is the person who recognizes which plausible solution is actually a trap.
The machine writes faster than I do. That doesn't scare me. The only thing that scares me is the thought of a team that has stopped checking it.
The new job isn't writing code. It's knowing when the code is lying to you. If you read AI's suggestions with a raised eyebrow rather than a sigh of relief, we speak the same language. [See our open roles].