I've been working with Claude Code long enough that the laughably bad time estimates have become background noise. Claude thinks a 90-second task takes 30 minutes. Ask it to plan a sprint I know we can finish in a few days, and it estimates weeks. It's annoying but easy to ignore.
What's harder to ignore is what happens when you introduce a deadline. Tell Claude it's the end of the sprint, that a demo is tomorrow morning, that something needs to ship today, and its behavior changes. And because its time estimates are always wrong, Claude has no way to know whether the deadline is actually tight or trivially achievable.
I wanted data, so I ran an experiment: three tasks, seven subagents per run, two full runs, then a code review of every output. What I found was worse than I expected, and the self-reports turned out to be unreliable.