Rent the model, own the knowledge

Depend on the best AI you can — but keep the exit cheap. A dependency audit for operators.

One morning this spring, Claude wouldn't let me in. Not the model — the model was fine. The login was broken. An authentication outage: the API kept answering for anyone already connected, but the front door, the OAuth handshake that Claude Code needs to start a session, was down. So I sat there with a working tool I couldn't reach, watching a spinner.

Our Slack lit up inside two minutes. Everyone hit the same wall at the same time. It cleared after a while, the way these things do, and we got back to work.

What stayed with me was something a data scientist said afterwards. He had a deadline — a real one, a client project with a fixed date — and for the length of that outage he was genuinely afraid he wouldn't be able to deliver. Not annoyed. Afraid. The work he had to do that day had quietly become work he could only do through one American company, and that morning the company wasn't reachable.

That fear is a number. It's the cost of a dependency we'd never put a price on. The outage just put the price on it for us.

This post is about that number — what it is, where it hides, and how to keep it small. It is not a post about escaping AI. I use Claude Code every day and I'm not stopping. It's about staying able to leave.

One word, two problems

In French we have one word, dépendance, and it carries two meanings English keeps apart: dependence — you can't operate without it — and addiction — you can't stop even when you should. They're different problems with different fixes, and lumping them under one word is how you end up managing neither.

This post is about the first one: the firm's structural dependence, the kind your procurement team would recognise if anyone had thought to write it down. The second one — the personal pull of the tool, the reflex, the skill that quietly rots — is a post of its own, and it's next.

You're already dependent, and that's fine

Let's get the reflexive objection out of the way, because a CEO who has lived through one too many IT-purity projects will raise it immediately: every business depends on suppliers it doesn't control. You depend on the grid. On SAP. On Microsoft. On a fab in Taiwan and a port you've never seen. Specialisation is how modern industry works, and refusing to depend on anyone is a good way to do everything worse and more expensively.

That's correct. Dependence isn't the risk. Dependence is just the price of not building everything yourself, and it's usually a good deal.

The risk is unexamined dependence — a reliance you carry on the books without ever asking the only question that matters: what would it cost us to leave, and how long would it take? You know this number for your ERP. You know it for your main supplier; it's half of why you keep a second one qualified. For AI, almost nobody has worked it out. The technology arrived faster than the risk management did.

So the goal isn't to depend on less. It's to know your exit cost — and keep it cheap.

The supplier you can't stockpile

Here's the first thing that makes AI an unusual supplier.

You can hold three months of steel. You can warehouse components, second-source a fastener, keep safety stock against a bad quarter. Industrial firms are good at this — turning a supply risk into an inventory line.

You cannot stockpile inference. There is no warehouse for it. Every token your business consumes is produced on demand, the moment you need it, by someone else's machines. It's a pure flow dependency: cut the pipe and you don't run down a buffer, you stop the same day. The data scientist with the deadline didn't have yesterday's inference saved up. When the door closed, the work stopped.

And it's not one pipe, it's three stacked on top of each other, each one someone else's:

  1. The model — Anthropic, OpenAI, Google.
  2. The cloud it runs on — AWS, Azure, GCP. Often the same handful of companies, one layer down.
  3. The solution you actually touch — Claude Code, ChatGPT, Copilot, or the AI now baked into a tool you already pay for.

Any one of the three failing takes you down, and you control none of them. On top of all three sits a legal layer most French operators underrate: under the U.S. CLOUD Act, a U.S.-headquartered provider can be compelled to produce data it holds regardless of which country the servers sit in. "Hosted in Europe" does not mean "outside American jurisdiction" when the company is American. That's not a reason to panic; it's a reason to know what you're sending.

The price is fake

Now the part your CFO should hear, because it's the one that looks like a bargain and isn't.

Today AI is cheap and getting cheaper. Per-token prices have fallen roughly ten-fold a year. It's tempting to read that as a maturing, competitive market settling into sustainable pricing. It isn't. It's a land-grab, funded by investors, and you are being subsidised to build the habit.

Be precise about where the subsidy is, because this is where lazy takes get torn apart. The raw per-token API call is actually gross-margin positive — OpenAI's leaked 2025 gross margin was around 48%, and AI-first software runs 20–60% gross against 70–90% for traditional SaaS (Where's Your Ed At). That positive margin is exactly why token prices keep dropping, and for now the competition between providers genuinely protects you.

The losses are somewhere else, and they're closer to you than the API. They're in the flat-rate plans you actually rely on. In January 2025 Sam Altman said OpenAI was "currently losing money" on its $200/month ChatGPT Pro plan because "people use it much more than we expected" (Fortune). GitHub Copilot's heavy users burn far more compute than their subscription covers; Replit watched its gross margin swing from +36% to −14% as its coding agent ate more tokens than the price allowed (Where's Your Ed At). And the companies are deeply unprofitable overall — OpenAI spent roughly $1.35 for every dollar it earned in 2025.

So the honest framing isn't "prices will spike tomorrow." Prices, per token, will probably keep falling. The framing is: watch your bill, not the price. Two things push it up even as the unit price drops. Agentic workflows — the Claude Code loop, the autonomous agent — consume ten to a hundred times the tokens a chat did, so your consumption outruns the price cut. And the high-ROI flat-rate plans are loss-leaders that cannot stay loss-leaders forever.
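To make "watch your bill, not the price" concrete, here is a back-of-envelope sketch in Python. It uses only the two rough figures above: a ten-fold yearly drop in unit price, and agentic workflows consuming ten to a hundred times the tokens of chat. The function name `bill_ratio` is mine, and the numbers are illustrative, not measurements from any invoice.

```python
def bill_ratio(price_drop_factor: float, consumption_multiplier: float) -> float:
    """Return new_bill / old_bill when the unit price falls and usage grows.

    bill = unit_price * tokens, so dividing the price by `price_drop_factor`
    and multiplying tokens by `consumption_multiplier` scales the bill by
    consumption_multiplier / price_drop_factor.
    """
    if price_drop_factor <= 0:
        raise ValueError("price_drop_factor must be positive")
    return consumption_multiplier / price_drop_factor


if __name__ == "__main__":
    price_drop = 10  # illustrative: unit price falls ten-fold
    for multiplier in (10, 100):  # illustrative: agentic token usage vs chat
        ratio = bill_ratio(price_drop, multiplier)
        print(f"tokens x{multiplier}, price /{price_drop}: bill x{ratio:g}")
```

With a ten-fold price cut, ten times the tokens leaves the bill flat, and a hundred times the tokens multiplies it by ten. The unit price went down in both cases.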

Here's the uncomfortable mechanics of it. The reason Claude Code is worth paying for is that the return is enormous — it does in an afternoon what used to take a week. That same enormous return is the provider's pricing power. Your cost of running it yourself is the floor on what they could charge. Your ROI is the ceiling. And that ceiling is high. The day the land-grab ends and the competition thins, the people you depend on will discover what you already know — that you'd happily pay several times today's price rather than give the tool up. The subsidy isn't generosity. It's how you get to the point where you'll pay.

The dependence you don't control

Two flavours of this exposure don't even show up on the contract you signed.

The dependence you never signed. You can depend on OpenAI without ever having an OpenAI account. Your CRM ships an AI feature; your ERP adds a copilot; the helpdesk tool, the document suite, the analytics dashboard all quietly wire a model provider in underneath. Now your work reaches that provider by way of a vendor you already trusted: no line item, no contract with the model company, no off-switch you control. It's the least visible dependence on the org chart, and the first place to look when you do the audit, because nobody chose it on purpose.

The supplier you share with everyone. This isn't just your supplier — it's everyone's. The whole market leans on the same two or three model providers and the same three clouds. When that shared wall wobbles, you don't go down alone; half your industry goes down with you, and so does your supplier's supplier. On 18 November 2025 a botched database-permissions change at Cloudflare — not an attack, just a config file that doubled in size and propagated across the network — took down ChatGPT, Claude, X, Spotify and Uber together for about four hours (Cloudflare's own write-up). One company's Tuesday-morning mistake, a large slice of the internet dark at once. That's the monoculture. The efficiency that makes shared infrastructure cheap is the same property that makes its failures synchronised.

The audit

Here's the thing you can actually do, and it takes an hour.

List every place your work now routes through an AI provider — including the shadow ones embedded in tools you already pay for. For each, note three things: which layer it sits in, how long it would take you to switch away, and what you'd lose if it vanished tomorrow.

Here's mine, abbreviated, to show the shape:

Dependency        | Layer    | Time-to-exit | What I'd lose
------------------|----------|--------------|---------------------------------
Claude Code       | Solution | Days         | Workflow habits; little data
Anthropic API     | Model    | Weeks–months | Prompt tuning bound to one model
AWS hosting       | Cloud    | Months       | Real, sticky migration cost
AI inside our CRM | Shadow   | Unknown      | Even visibility into what's sent

Now look at the column I deliberately left off: where each company is headquartered. Every row has the same answer: the United States. That's not me cherry-picking; it's the monoculture from the previous section, sitting inside my own stack. And headquarters is the field that matters, not hosting: under the CLOUD Act, "hosted in Europe" doesn't move an American company outside American jurisdiction. If your audit comes back all-US too, that's worth a board conversation on its own.

The exercise isn't to make any of these zero. It's to see, in one table, which dependencies are cheap to leave (fine — depend away) and which ones would take months and cost real money (those are the ones to actively manage, second-source, or wrap in an abstraction layer now, while it's calm). The two rows to circle are the Months one — high exit cost — and the Unknown shadow one, which usually means nobody's even measured it yet.
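If a spreadsheet feels too loose, the same audit fits in a few lines of Python. This is a minimal sketch, not a tool: the `Dependency` class, the `rows_to_circle` helper and the choice to treat "Months" as the high-exit-cost bucket are my own assumptions, and the rows are simply my abbreviated table from above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dependency:
    name: str
    layer: str                   # Model, Cloud, Solution or Shadow
    time_to_exit: Optional[str]  # None means nobody has measured it yet
    loss: str                    # what you'd lose if it vanished tomorrow


# The abbreviated audit table, row for row.
AUDIT: List[Dependency] = [
    Dependency("Claude Code", "Solution", "Days", "Workflow habits; little data"),
    Dependency("Anthropic API", "Model", "Weeks–months", "Prompt tuning bound to one model"),
    Dependency("AWS hosting", "Cloud", "Months", "Real, sticky migration cost"),
    Dependency("AI inside our CRM", "Shadow", None, "Even visibility into what's sent"),
]

# Assumption: only the longest bucket counts as a high exit cost.
HIGH_EXIT_COST = frozenset({"Months"})


def rows_to_circle(audit: List[Dependency]) -> List[Dependency]:
    """Return the rows to manage actively: high exit cost, or never measured."""
    return [
        dep for dep in audit
        if dep.time_to_exit is None or dep.time_to_exit in HIGH_EXIT_COST
    ]


if __name__ == "__main__":
    for dep in rows_to_circle(AUDIT):
        print(f"{dep.name} ({dep.layer}): {dep.time_to_exit or 'Unknown'}")
```

Keeping "Unknown" as `None` rather than a guessed duration is deliberate: an unmeasured row should stay visibly unmeasured until someone does the work.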

The hedges, honestly

Three moves keep the exit cheap. None is free, and one is oversold everywhere else, so let me be straight about each.

Stay reversible. Put a thin abstraction layer between your product and any model provider, so swapping Anthropic for OpenAI for Gemini is a config change, not a rewrite. Keep your prompts, your context, your evaluation sets in your own repository — not locked in a vendor's console. This is the single highest-leverage move, and it has a real cost: you write and maintain the layer, and you give up a little of each provider's proprietary convenience. Worth it for anything load-bearing.
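For a sense of how thin that layer can be, here is a minimal Python sketch. Everything in it is hypothetical: `ModelProvider`, `EchoProvider`, `PROVIDERS` and `get_provider` are names I made up, and the adapters are stand-ins that echo the prompt rather than call any vendor's SDK. A real adapter would wrap the vendor call behind the same one-method interface.

```python
from typing import Callable, Dict, Protocol


class ModelProvider(Protocol):
    """The only surface the rest of the product is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in adapter. A real one would call the vendor's SDK in complete()."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


# Registry of adapters, keyed by the name used in configuration.
PROVIDERS: Dict[str, Callable[[], ModelProvider]] = {
    "anthropic": lambda: EchoProvider("anthropic"),
    "openai": lambda: EchoProvider("openai"),
    "gemini": lambda: EchoProvider("gemini"),
}


def get_provider(config: Dict[str, str]) -> ModelProvider:
    """Build the provider named in config; swapping vendors is a config edit."""
    name = config["provider"]
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider {name!r}")
    return PROVIDERS[name]()


def summarise(text: str, provider: ModelProvider) -> str:
    """Product code: knows the prompt, never the vendor."""
    return provider.complete(f"Summarise: {text}")


if __name__ == "__main__":
    config = {"provider": "anthropic"}  # change this line, nothing else
    print(summarise("quarterly supplier report", get_provider(config)))
```

The prompt text lives in `summarise`, in your repository, which is the point: the vendor-specific part shrinks to one adapter per provider.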

Own the knowledge base. This is the spine of the whole post. The model is rented; your knowledge is not, unless you let it be. Your data, your domain context, your documented workflows, the institutional understanding of how your business actually works — keep all of it in a form you control and can carry to any model. If your competitive knowledge lives only inside one provider's fine-tune or one tool's memory, you don't own it, you're renting it back. Whoever owns the knowledge base owns the option to leave.

Keep open weights as a fallback — and don't oversell it. Open-weight models you can run yourself — Mistral, the open Llama and Qwen families — are a genuine sovereignty hedge: your data never leaves, the model can't be switched off under you. But be honest about the trade. They are materially behind the frontier (Anthropic, OpenAI, Google) on hard reasoning and complex agentic work. Where they earn their place is the simpler, high-volume, well-bounded task with good context engineering — classification, extraction, routine drafting — not your hardest problem. Treating Mistral as a drop-in replacement for Claude on your toughest work isn't a hedge; it's a downgrade you'll quietly regret. Treating it as the reliable workhorse for the bottom 70% of tasks is sound risk management.
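In practice that split can start as a routing rule rather than a platform. The sketch below is one hedged way to write it in Python: the task labels and the `choose_tier` function are my own illustrative names, and the list of "simple" tasks is just the three examples above, not a benchmark result.

```python
# Assumption: the well-bounded, high-volume task types named in the text.
OPEN_WEIGHT_TASKS = frozenset({"classification", "extraction", "routine_drafting"})


def choose_tier(task_type: str) -> str:
    """Send bounded tasks to a self-hosted open-weight model, the rest to the frontier.

    Anything not explicitly listed defaults to the frontier tier, so a new or
    hard task is never silently downgraded.
    """
    return "open_weight" if task_type in OPEN_WEIGHT_TASKS else "frontier"


if __name__ == "__main__":
    for task in ("extraction", "routine_drafting", "agentic_refactor"):
        print(f"{task}: {choose_tier(task)}")
```

Defaulting unknown task types to the frontier tier is the design choice that matters here: the open-weight model only gets work someone has deliberately judged it fit for.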

The honest limits

I want to be straight about what this does and doesn't do.

Reversibility costs real money. Abstraction layers are code you maintain. Multi-provider testing doubles some of your evaluation work. Export discipline is a habit nobody enjoys. This is overhead you pay against a risk that may never fire — exactly like a second qualified supplier. It's insurance, and insurance has a premium.

On-prem is a quality trade, not a saving. Running your own open-weight model to "escape the cloud" usually buys you a worse model and a GPU bill, not a free lunch. Do it for the specific tasks and the specific data-sovereignty reasons where it pays — not as a reflex.

Some lock-in is rational, and chasing purity wastes money. Migrating off AWS to feel sovereign, when the real exposure is a shadow AI in your CRM, is sovereignty theatre. The audit exists precisely so you spend the hedging effort where the exit cost is genuinely high, and not everywhere. You cannot second-source everything, and you shouldn't try.

None of this protects you from the outage that's already happening. When the front door is locked, reversibility doesn't get you in faster. What it buys you is the next quarter, not the next hour. For the hour, the only hedge is people who can still do something without the tool — which is the subject of the next post.

What to do this week

You don't need a project for this. You need about an hour and the willingness to write the number down.

One hour — the audit. Build the table above for your own stack. List the obvious dependencies, then go hunting for the shadow ones inside tools you already pay for. Fill in time-to-exit and what you'd lose. The first time, the "Unknown" rows are the finding.

One decision — the load-bearing one. Pick the single dependency with the highest exit cost on the most important workflow. Decide one thing that makes leaving cheaper: an abstraction layer, an export routine, a second provider qualified. One. This quarter.

One habit — own the knowledge. Wherever your real knowledge is accumulating — prompts, context, domain data — make sure a copy lives somewhere you control and can carry elsewhere. If it only exists inside a vendor, you've started renting the one thing you should own.

Coda

The strange part of that morning wasn't the outage. Outages happen; everything breaks eventually. The strange part was the fear — that a capable, well-paid data scientist, faced with a locked door, felt for a moment that the work was simply impossible without the tool on the other side.

A firm can manage the supplier contract perfectly and never notice that. You can have the DPA signed, the cloud region in Europe, the invoice approved — and still not see that your people have quietly reorganised their working lives around a tool whose off-switch belongs to someone else. The structural dependence and the personal one are the same word in French for a reason. The firm's dependence is, in the end, mostly the sum of its people's.

So: rent the model — the best one you can, today, without apology. But own the knowledge, keep the exit cheap, and write the number down before the next outage writes it down for you.

That's the first meaning of dépendance. Next, the second one — the one that's harder to admit, because it's not about the company. It's about the pull I feel reaching for Claude Code on a task I could do in thirty seconds by hand.


If you run this audit, I'd genuinely like to know which row came back "Unknown" — that's usually the interesting one. Drop a comment or write back. And if it was useful, send it to the one person in your org who signed the AI invoices without doing the math.