LLM tokens are cheap

And that's the problem.

I keep seeing the same thing. Someone hits a bug. They don't read the error. They don't think about what went wrong. They just copy the stack trace, paste it into Claude or ChatGPT, and hit enter. The AI gives them a fix. They paste it in. Half the time it works. The other half, they get a new error, so they go back to the AI. "Still broken, here's the new error." Another fix. Paste. Run. New error.

Nobody planned anything. Nobody understood anything. But the tokens were cheap, so who cares.

The real cost isn't on the invoice

LLM tokens have gotten absurdly cheap. A few cents for thousands of tokens. Most developers don't even know what a session costs them because it rounds to zero. And when something costs zero, you treat it that way.

If every prompt cost you $5, you'd think before you typed. You'd read the docs first. You'd try to understand the error yourself before outsourcing it to a chatbot. You'd write a clear prompt with context and constraints because you literally can't afford to waste a round on some vague "fix this for me" garbage. You'd read the response line by line instead of pasting it blind.

At fractions of a cent? Why bother. Fire and forget. If the output is wrong, fire again.
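To put rough numbers on that, here is a back-of-the-envelope sketch. The $5 figure is the thought experiment above; the half-cent price and the five-round loop are assumptions picked for illustration, not real pricing from any provider.

```python
def loop_cost(price_per_prompt: float, rounds: int) -> float:
    """Total spend for a paste-run-paste loop of `rounds` prompts."""
    return price_per_prompt * rounds


CHEAP_PRICE = 0.005   # assumption: half a cent per prompt, illustrative only
PRICEY_PRICE = 5.00   # the $5-per-prompt thought experiment
ROUNDS = 5            # assumption: five attempts before the code runs

cheap_total = loop_cost(CHEAP_PRICE, ROUNDS)
pricey_total = loop_cost(PRICEY_PRICE, ROUNDS)

print(f"cheap loop:  ${cheap_total:.3f}")
print(f"pricey loop: ${pricey_total:.2f}")
```

Under those assumptions the sloppy loop costs a few cents in one world and $25 in the other. Only one of those totals makes you stop and read the error first.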

This isn't an AI problem

I should say this clearly: I'm not arguing that LLMs write bad code. They write very good code when you give them a well-defined problem. The tool works.

I'm also not calling developers lazy. We're all under pressure. Using AI to ship faster is the rational move.

The problem is somewhere in between. Cheap tokens removed the thing that used to force quality: friction. When you wrote code by hand, your brain had to process the problem to produce the solution. That processing was the planning. Not because you're some disciplined craftsman. Just because there was no other way to get the code out of your head and into the editor.

Now there's another way. And it skips the understanding step.

The "I'll just ask it to fix it" loop

This one really bugs me, partly because I catch myself doing it too.

You prompt for a feature. Get 200 lines back. The function names look right, so you don't fully read them. You integrate it. Something breaks. Paste the error back in. AI patches it. Something else breaks. Paste that back too.

Four or five rounds later, you have working code. But it's a Frankenstein. The AI solved each error in isolation without any coherent picture of the whole system. And you never built that picture either. You never sat down and thought about it. You just kept pulling the lever.

Six months from now this thing breaks in production at 2 AM and nobody can explain why it was written that way. Because it wasn't written. It was accumulated, one panic-fix at a time.

Make it expensive and watch what happens

I think if token prices went up 10x overnight, code quality would improve almost immediately. Not because the AI gets smarter. Because developers would start treating prompts like they matter.

When a bad prompt costs real money, you plan your approach before you open the chat window. You break the problem down. You give the AI constraints, context, expected behavior. You read the output carefully because you need to get it right on the first shot, not the fifth.

That's just... good engineering. The expensive tokens would only make it hurt when you skip the steps you already know you should be doing.

We saw a version of this with cloud computing, by the way. When AWS bills were scary, engineers optimized everything. Queries, instance sizes, caching. Then costs dropped and the discipline vanished. Now half the industry is running bloated, unoptimized workloads and nobody feels the waste because the per-unit cost is tiny. Same energy.

So what do you actually do about it

I'm not seriously arguing that OpenAI should raise prices. That's not realistic. But I think we need to build the habits that expensive tokens would have forced on us.

Before you prompt, write down what you want. Not in the prompt. For yourself. What's the problem? What should the output look like? What are the edge cases? If you can't write two sentences describing what you need, you're not ready to prompt. You're just hoping the AI figures it out.
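If it helps to make that concrete, here is a minimal sketch of the habit as a tiny checklist. Everything in it is hypothetical: `PromptBrief`, its fields, and the `parse_date` function named in the example text are made up for illustration. A text file or a sticky note works just as well.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBrief:
    """Notes you write for yourself before opening the chat window (hypothetical helper)."""
    problem: str
    expected_output: str
    edge_cases: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Return the names of the sections that are still empty."""
        gaps = []
        if not self.problem.strip():
            gaps.append("problem")
        if not self.expected_output.strip():
            gaps.append("expected_output")
        if not self.edge_cases:
            gaps.append("edge_cases")
        return gaps

    def ready(self) -> bool:
        """True only when every section has something written in it."""
        return not self.missing()


# A vague request: there is a "problem", but nothing else was thought through.
vague = PromptBrief(problem="fix this for me", expected_output="")

# A request where the thinking happened first (parse_date is a made-up example).
clear = PromptBrief(
    problem="parse_date raises on '2024-02-30' instead of returning None.",
    expected_output="parse_date returns None for any invalid calendar date.",
    edge_cases=["leap years", "empty string"],
)

print(vague.missing())
print(clear.ready())
```

The check only tells you whether something was written down, not whether it is any good. That part is still on you.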

Read the output like a pull request from a junior engineer who is very confident and occasionally wrong in ways that look correct at first glance. Because that's roughly what it is.

And stop using the AI to debug AI code. Seriously. If you can't read through the code and find the bug yourself, you don't understand it well enough to ship it. "But the AI can find it faster" is true, and it's also how you end up with a codebase where nobody knows what anything does.

The part nobody wants to say out loud

Here's the thing that bothers me the most. Cheap tokens didn't make us worse engineers. They showed that a lot of what we called engineering discipline was just... friction. We were careful because carelessness was slow. We planned because skipping the plan meant rewriting everything by hand. Take away the cost and the slowness, and it turns out a lot of us don't actually have the muscle to stay disciplined on our own.

I include myself in this. I'm not writing from some position above the problem. I'm writing from inside it.

The tokens are cheap. Your production environment is not.
