Strategic Foresight · Briefing No. 01 · Reliability & Recursion Framework

Crossing the Threshold

A seven-year scenario map for artificial intelligence, 2026–2033.

Horizon: 2026–2033 · Prepared: 02 Jul 2026 · Method: Two-variable scenario map · Register: Planning & positioning

If you want to guess what the next seven years of AI will look like, you really only have to answer two questions. Do AI agents get reliable enough that you'd trust them to do things on their own? And does AI start to make AI better? Nearly everything people argue about — which companies win, which jobs go, how fast any of it happens — turns out to hang on those two.

I'm not going to try to predict one future, because I don't think anyone can. What you can do is map the few futures we might actually end up in, work out the order they'll separate in, and notice the signs that tell you which one is coming. The near future is fairly easy to see. The far one forks hard. A map is useful mostly because it shows you where the roads split before you get there.


§ 00

What the map shows

  1. The next few years are easy to predict.

    Agents go from impressive demos to things you can actually rely on, at least for narrow tasks. A lot of thin software gets absorbed into bigger platforms. Companies settle on owning their data and swapping the model underneath. None of this needs a breakthrough. It's just the spread of things that already work.

  2. The models improve faster than companies can use them.

    New abilities show up in months. Actually putting them to work takes years. That gap — not how good the models are — is what decides how much changes soon. “AI is moving incredibly fast” and “nothing really feels different yet” are both true, and for the same reason.

  3. Two questions split the whole thing open.

    Whether agents get reliable enough to trust, and whether AI starts improving AI. Put those two together and you get four fairly distinct worlds. Only one of them makes the rest impossible to predict.

  4. Compute and power are the real moat, and they end up in few hands.

    Whoever can train the biggest models and buy the electricity to run them has an edge that compounds, and not many players can do it. But open models, which can't be taken back once released, keep a floor under everyone else. The interesting fights happen in the space between those two facts.

  5. Jobs change task by task, long before they disappear.

    Roles get hollowed out from the inside as pieces of them get automated. The first place you see it is entry-level hiring, not layoffs. What's left moves toward judgment, taste, and being the person who's accountable. The averages can hide a lot of concentrated pain.

  6. The far future is unpredictable for one specific reason.

    If AI starts making AI better, everything speeds up and the questions in this report stop being the right ones. The first sign won't be a benchmark score. It'll be in reliability curves, and in how close open models stay to the best ones. Watch those.

“AI is getting better faster than companies can figure out what to do with it. That's not a glitch. That's the whole story of the next few years.”

The starting point

§ 01

The two variables

A good map doesn't need many lines on it. Lots of things about AI are uncertain, but most of them depend on just two. Nail those two down and the number of futures you have to worry about drops from infinite to four.

The first question — can you trust it?

Can you trust an agent to finish a job without someone checking its work? This isn't really about how smart the model is. A model that's brilliant but wrong five times in a hundred is something you have to supervise. One that's wrong once in a thousand is something you can hand things to and walk away from. Those aren't two points on the same line. They're different in kind — the difference between a tool and a coworker. So the thing that matters most in the near term is an error rate, not an IQ.
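A rough way to see why those two error rates differ in kind is to compound them over a job with many steps. The sketch below assumes each step fails independently, which is a simplification this briefing does not claim. The 50-step job and the function name `unattended_success` are made up for illustration.

```python
def unattended_success(error_rate: float, steps: int) -> float:
    """Chance that every step of a job succeeds, assuming independent steps."""
    return (1.0 - error_rate) ** steps

# The two error rates come from the text; the 50-step job is hypothetical.
for label, rate in [("wrong 5 times in 100", 0.05), ("wrong once in 1,000", 0.001)]:
    clean = unattended_success(rate, 50)
    print(f"{label}: {clean:.1%} of 50-step jobs finish with no error")
```

Under that assumption the first model finishes roughly 8% of such jobs cleanly and the second roughly 95%, which is the gap between something you supervise and something you hand off.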

The second question — does it improve itself?

Does AI make the job of building better AI go faster? Right now that work is done by human researchers using AI as a helper. But if that changes — if the models take over a real chunk of the research itself — then progress stops being a straight line and starts to compound. This is the one thing that makes predicting the long term almost pointless, because anything that compounds quickly outruns whatever map you drew in front of it.
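The difference between a straight line and compounding is easy to underestimate, so here is a minimal sketch of the two shapes over seven years. The starting level, the yearly step of 0.5, and the yearly growth rate of 50% are hypothetical numbers chosen only to show the shapes. They are not estimates of real progress.

```python
def linear_path(start: float, step: float, years: int) -> list[float]:
    """Progress that adds the same amount every year."""
    return [start + step * t for t in range(years + 1)]

def compounding_path(start: float, rate: float, years: int) -> list[float]:
    """Progress that grows by the same fraction of itself every year."""
    return [start * (1 + rate) ** t for t in range(years + 1)]

# Hypothetical inputs: both paths start at 1.0 and run for seven years.
linear = linear_path(1.0, 0.5, 7)
compounding = compounding_path(1.0, 0.5, 7)
for year, (a, b) in enumerate(zip(linear, compounding)):
    print(f"year {year}: linear {a:.2f}, compounding {b:.2f}")
```

With these inputs the two paths are identical after one year and far apart by year seven, which is why a map drawn early outlives a linear world but not a compounding one.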

Figure 1 · Framework · Four futures from two questions
[Figure: two-by-two matrix. Horizontal axis: Recursion (does AI accelerate AI?), from Linear to Compounding. Vertical axis: Reliability (can agents be trusted to act?), from Contained to Autonomous.]
  The Long Diffusion (Contained · Linear): augmentation on a 20–30 year clock. ~25%
  Capability Overhang (Contained · Compounding): genius that can't be trusted to act. ~20%
  The Agentic Buildout (Autonomous · Linear): agents work; the frontier doesn't run away. ~40%, base case
  The Compression (Autonomous · Compounding): timelines collapse; the map dissolves. ~15%
The base case sits in the lower-left. The unpredictable world sits in the lower-right.
The logic of the framework, not data. The percentages are my read as of mid-2026; they'll move as the signs come in.
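Because the four weights sit on two axes, they also imply an overall weight for each question taken alone. The sketch below only adds up the percentages from Figure 1. The dictionary layout and the `marginal` helper are illustrative names, not part of the framework.

```python
# Scenario weights from Figure 1, keyed by (reliability, recursion).
WEIGHTS = {
    ("contained", "linear"): 0.25,        # The Long Diffusion
    ("contained", "compounding"): 0.20,   # Capability Overhang
    ("autonomous", "linear"): 0.40,       # The Agentic Buildout (base case)
    ("autonomous", "compounding"): 0.15,  # The Compression
}

def marginal(axis: int, value: str) -> float:
    """Sum the weights of every scenario whose key matches `value` on `axis`."""
    return sum(w for key, w in WEIGHTS.items() if key[axis] == value)

# The four worlds are meant to be exhaustive, so the weights should total 100%.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

print(f"agents cross the trust threshold: {marginal(0, 'autonomous'):.0%}")
print(f"AI accelerates AI research:       {marginal(1, 'compounding'):.0%}")
```

Read this way, the weights put a bit better than even odds on agents becoming trustworthy and about one in three on AI improving AI.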

§ 02

The branching path

The two questions don't get answered at the same time. Reliability comes first — it's closer, and the signal is clearer. Whether AI improves AI comes later. So if you read the next seven years left to right, they look like a tree: one fork early, two forks late.

Figure 2 · Signature · The future graph, 2026–2033
[Figure: branching tree read left to right from Today. Time axis: 2026, ~2029, ~2031, 2033. Phases: I · Diffusion, II · Divergence, III · Resolution. First fork: do agents cross the trust threshold (no / yes)? Second fork: does AI accelerate AI research? Endpoints: The Long Diffusion (contained · linear, 25%); Capability Overhang (contained · compounding, 20%); The Agentic Buildout (autonomous · linear, base case, 40%); The Compression (autonomous · compounding, 15%).]
Read it left to right. The first fork — can you trust agents to act? — gets settled sooner, and more clearly, than the second. Thicker lines are the paths I think are more likely. It's the shape that matters, not the exact dates.
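Reading the tree as two forks in sequence, the endpoint weights imply odds for the second fork on each side of the first. The sketch below is plain arithmetic on the Figure 2 percentages. The function names are illustrative, and the conditional odds are what the weights imply when taken literally, not figures stated anywhere in this briefing.

```python
# (trust threshold crossed?, AI accelerates AI research?, weight) from Figure 2.
SCENARIOS = {
    "The Long Diffusion": (False, False, 0.25),
    "Capability Overhang": (False, True, 0.20),
    "The Agentic Buildout": (True, False, 0.40),
    "The Compression": (True, True, 0.15),
}

def first_fork(crossed: bool) -> float:
    """Total weight on one side of the first fork."""
    return sum(w for c, _, w in SCENARIOS.values() if c == crossed)

def second_fork(crossed: bool) -> float:
    """Implied chance that AI accelerates AI research, given the first fork."""
    accelerating = sum(w for c, a, w in SCENARIOS.values() if c == crossed and a)
    return accelerating / first_fork(crossed)

for crossed in (True, False):
    print(f"threshold crossed = {crossed}: "
          f"implied odds of acceleration {second_fork(crossed):.0%}")
```

Taken literally, the weights imply roughly 27% odds of acceleration if agents cross the threshold and roughly 44% if they do not, a useful check on whether the four percentages say what you intend them to say.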

§ 03

Three horizons

The seven years aren't all the same. They come in three parts: one you can mostly see coming, one where the paths pull apart, and one where you finally know which path you're on.

I. Things spread and settle  ·  2026–2028

This part is easy to predict, because it's mostly the spread of things that already exist. Agents get more reliable at narrow, valuable tasks. The thin kind of software — basically a nice screen on top of a database — gets swallowed as its job becomes a feature of something bigger. Companies settle on owning their data and swapping the model beneath it. And quietly, the thing that limits progress stops being how good the models are and starts being how much power and how many chips you can get.

II. The paths pull apart  ·  2029–2031

Now the fork gets visible. Whichever way reliability goes, and then whichever way the AI-improves-AI question goes, the four worlds start to separate in ways you can actually measure instead of just argue about. The job effects show up first in hiring numbers for entry-level work, not in unemployment headlines. This is the stretch where the map starts to pay off, because the signs you were watching finally point somewhere.

III. You know where you are  ·  2032–2033

By now you can tell which world you're in. The reliability question has an answer. The AI-improving-AI question at least has a direction. The bets you'd been hedging across all four scenarios can finally be placed. What used to be foresight is just the situation you're in.

Figure 3 · Capability · The open–frontier gap holds even as both rise
[Figure: line chart, 2026–2033. Vertical axis: capability index. Two rising lines, Frontier (closed) and Best open-weight, separated by an approximately constant gap.]
The logic, not real benchmarks. Open models pull the floor up for everyone, but a steady gap keeps the best models worth paying for. If that gap ever starts closing, it's one of the first signs the whole thing is turning into a commodity.
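If you track this gap yourself, the thing to compute is its trend, not its size. Here is a minimal sketch, assuming you have one capability score per year for the best closed model and the best open-weight model. The scores below are invented for illustration and the function name is hypothetical.

```python
def gap_trend(frontier: list[float], open_weight: list[float]) -> float:
    """Average year-over-year change in the frontier-minus-open gap.

    Near zero means the gap is holding; negative means it is closing.
    """
    gaps = [f - o for f, o in zip(frontier, open_weight)]
    changes = [later - earlier for earlier, later in zip(gaps, gaps[1:])]
    return sum(changes) / len(changes)

# Invented index values, one per year. Not real benchmark results.
frontier = [50.0, 60.0, 70.0, 80.0]
open_holding = [40.0, 50.0, 60.0, 70.0]   # stays 10 points behind
open_closing = [40.0, 52.0, 64.0, 76.0]   # gains 2 points a year

print(gap_trend(frontier, open_holding))
print(gap_trend(frontier, open_closing))
```

A trend that stays near zero matches the picture in Figure 3. A persistently negative one is the commoditization signal described above.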

“The question was never whether AI gets smart enough. It's whether it gets reliable enough that you'd let it do something without checking.”

On what actually matters near term
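Figure 4 · Reliability · The trust threshold and the fan beyond it
[Figure: line chart, 2026–2033. Vertical axis: unsupervised task success. A horizontal line marks the trust threshold (“safe to delegate”), with a possible crossing around 2029–2030. Beyond that point the curve fans into paths labeled broad delegation, stalls just below, and narrow only.]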
Whether this curve crosses the “safe to hand it off” line, and when, is the biggest open question of the next few years — and the first one you'll get a real answer to.
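Checking for the crossing is mechanical once you have a yearly reliability reading. The sketch below uses the roughly 99% level that this briefing names as the tell for the Agentic Buildout. The readings themselves are hypothetical, as is the function name.

```python
from typing import Optional, Sequence

TRUST_THRESHOLD = 0.99  # "about 99%", the level the briefing uses as its tell

def first_crossing(years: Sequence[int], success: Sequence[float],
                   threshold: float = TRUST_THRESHOLD) -> Optional[int]:
    """Return the first year whose success rate meets the threshold, else None."""
    for year, rate in zip(years, success):
        if rate >= threshold:
            return year
    return None

# Hypothetical unsupervised-success readings, invented for illustration only.
years = [2026, 2027, 2028, 2029, 2030]
crossing_path = [0.90, 0.94, 0.97, 0.985, 0.992]
stalling_path = [0.90, 0.94, 0.96, 0.97, 0.975]

print(first_crossing(years, crossing_path))  # 2030
print(first_crossing(years, stalling_path))  # None
```

The second path is the one to worry about: it keeps improving every year and still never becomes something you can hand work to.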
Figure 5 · Diffusion · Two clocks: capability runs ahead of adoption
[Figure: line chart, 2026–2033. Two lines, capability and real-economy adoption. The space between them is labeled deployment overhang.]
It happened with electricity too. The motors showed up decades before the productivity did, because you had to rebuild the factories around them first. It's the organizations, not the models, that set the pace.

§ 04

Four worlds

Each of the four is a world that makes sense on its own — its own way of working, its own winners, its own story about jobs, and its own tell. I've put them in order from the slowest to the most disruptive.

Contained · Linear

The Long Diffusion

~25%  ·  what the skeptics expect

Agents stay useful but you still have to watch them, and the models keep getting better in a straight line. AI turns out to be a big deal on the timescale of electricity or the internet — the kind of thing that changes everything over twenty or thirty years, not seven. The effect is real, but it soaks in slowly, one rebuilt workflow at a time.

How it works: AI helps people; it doesn't replace them. Someone's always checking.
Who wins: The big incumbents, by bundling it into what you already use.
Jobs: Change slowly enough that retraining roughly keeps up.
The tell: Reliability flattens out below the trust line and stays there.

Autonomous · Linear · base case

The Agentic Buildout

~40%  ·  my best guess

Agents get reliable enough that you can hand them real work, but the models don't suddenly run away from us. This is the world where jobs change the most, and where there's the most room for new AI-native software — and it's still a world you can understand and steer. A big, fast change, but one that makes sense as it happens.

How it works: Agents take over tasks, not whole jobs. They become coworkers.
Who wins: Whoever owns the data layer and ties the pieces together.
Jobs: Entry-level work erodes first; value shifts to judgment and taste.
The tell: Reliability crosses about 99% on work that actually matters.

Contained · Compounding

Capability Overhang

~20%  ·  the unstable one

The models get a lot smarter, fast — but you still can't trust them to act on their own. They ace every test while barely getting used for anything real. There's this huge pile of ability sitting just out of reach behind the trust problem, and the gap between what's possible and what's actually safe to use becomes the whole story. Pressure builds and nothing lets it out.

How it works: Smarts outrun trust. Checking the work becomes the scarce thing.
Who wins: Whoever sells trust: testing, guardrails, human oversight.
Jobs: Lopsided. AI as a helper booms; AI acting on its own stalls.
The tell: Test scores keep climbing while real usage stays flat.

Autonomous · Compounding

The Compression

~15%  ·  where the map stops working

Reliable agents show up at the same time AI starts improving AI, and the two feed each other. Everything speeds up. The questions this whole report is built around — who wins which market, how jobs change — stop being the right questions, because what software and work even are is shifting underneath them. It's the least likely branch, and the one that matters most if it happens.

How it works: AI makes AI better, and the loop keeps tightening on itself.
Who wins: Whoever holds the compute and the power at the frontier.
Jobs: Change in jumps. The usual forecasts stop meaning much.
The tell: Labs report big jumps in how much AI speeds up their own research.

Figure 6 · Divergence · One metric, four paths
[Figure: line chart, 2026–2033. Vertical axis: share of knowledge-work tasks executable unsupervised, 0% to 60%. Four lines: Compression, Agentic, Overhang, Long Diffusion.]
Don't read the exact numbers; the point is how far apart the lines get. In the Overhang world the ability is there but stuck below the trust line, so the share you can actually use flattens out even while the models keep getting smarter.

§ 05

What holds across all four

A few things stay true no matter which branch you end up on. Think of them as the load-bearing walls — they hold up whichever room you're standing in.

Compute and power end up in few hands

Training the biggest models, and getting the electricity to run them, is the one real moat — and it belongs to a handful of players almost by physics, not just strategy. This is the strongest reason to worry about everything getting concentrated. It's not that one company out-designs everyone else. It's that the raw materials of frontier AI naturally sit in few hands. In every one of the four worlds, the power grid and the price of a gigawatt matter more than any clever product decision.

Open models keep a floor under everyone

Once a model's weights are out, you can't take them back. Open models stay maybe a generation behind the best ones, but they're good enough for a huge share of what people actually need — and just by existing they drag the price of the commodity parts toward zero and keep buyers from getting locked to one vendor. This is the thing pushing against a winner-take-all ending, and the reason there's a real fight in the middle at all.

The value moves from the app to the data

As the models turn into a commodity, the thin app layer stops being defensible and the value drains toward the parts that don't: your own data, the retrieval that respects who's allowed to see what, the orchestration, the evaluation, and the people accountable for all of it. “Own your data, rent the model” isn't a slogan. It's just where the money ends up.

Figure 7 · Structure · Where enterprise value pools, over time
[Figure: chart of share of captured value at 2026, 2029, and 2033. Layers: Bundled apps; Model / inference; Data plane & governance; Judgment & accountability.]
Drawn for the base case. The direction — away from thin apps and toward data, governance, and people who are accountable — holds in every scenario. Only the speed changes.

“Own your data. Rent the model. The companies that insist on owning both usually end up wishing they'd done neither.”

On what companies should actually do

§ 06

The four worlds, side by side

| World | How it works | Who wins | Jobs | Odds | The tell |
|---|---|---|---|---|---|
| Long Diffusion | AI helps people over 20–30 years | Big incumbents, by bundling | Change slowly; retraining keeps up | ~25% | Reliability flattens below the line |
| Agentic Buildout | Agents take over tasks, act as coworkers | Whoever owns the data layer | Entry-level erodes first | ~40% | Reliability crosses ~99% on real work |
| Capability Overhang | Smarts outrun trust | Whoever sells trust and oversight | Lopsided; acting-alone stalls | ~20% | Test scores rise, real use stays flat |
| The Compression | AI makes AI better; the loop tightens | Holders of compute and power | Change in jumps; forecasts break | ~15% | Labs report big research jumps |

§ 07

Instruments, not headlines

You won't figure out which world is coming from product launches or benchmark records. You'll figure it out from a short list of signs, most of them kind of boring. These are the ones worth watching.

| What to watch | Why it matters | Points toward |
|---|---|---|
| How reliable agents are on real work | Crossing about 99% without supervision is the hinge of the whole map | The trust-it worlds |
| How close open models stay to the best | If the gap closes, it's all turning into a commodity; if it holds, the best stay worth paying for | Who ends up with power |
| How much AI speeds up AI research | A reported jump here is the first fingerprint of AI improving AI | The fast worlds |
| Entry-level hiring in exposed roles | Job effects show up here before they ever reach unemployment numbers | When jobs feel it |
| Data-center power waits and prices | Whether electricity, not algorithms, becomes the thing that holds progress back | The ceiling on the buildout |
| How companies set up their AI | Own-your-data-and-swap-the-model vs. just taking the incumbent's bundle | Merge-into-few vs. real contest |
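One way to let a sign move the scenario weights is a simple Bayes' rule update. That is a technique chosen here for illustration, not a method this briefing specifies. The starting weights are the mid-2026 percentages above. The likelihoods, meaning how probable a given reading would be in each world, are hypothetical placeholders you would have to judge for yourself.

```python
# Starting weights: the briefing's mid-2026 percentages.
PRIOR = {
    "Long Diffusion": 0.25,
    "Agentic Buildout": 0.40,
    "Capability Overhang": 0.20,
    "The Compression": 0.15,
}

def update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Bayes' rule: reweight each world by how well it explains the reading."""
    unnormalized = {world: prior[world] * likelihood[world] for world in prior}
    total = sum(unnormalized.values())
    return {world: value / total for world, value in unnormalized.items()}

# Hypothetical reading: agents pass about 99% on real, unsupervised work.
# These likelihoods are invented; they only encode "far more expected in
# the two autonomous worlds than in the two contained ones".
SEEN_99_PERCENT = {
    "Long Diffusion": 0.1,
    "Agentic Buildout": 0.8,
    "Capability Overhang": 0.1,
    "The Compression": 0.8,
}

posterior = update(PRIOR, SEEN_99_PERCENT)
for world, weight in posterior.items():
    print(f"{world}: {PRIOR[world]:.0%} -> {weight:.0%}")
```

The value of writing it down this way is discipline: you commit in advance to what each sign would mean, instead of deciding after the headline arrives.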

§ 08

The bottom line

Here's the whole thing in brief. The first three years, 2026–2028, are mostly things you can see coming: tools that already work spreading, and thin software getting absorbed into bigger platforms. From 2029 on, the paths split on two questions: whether agents get trustworthy enough to act, and whether AI starts improving AI. And it's that second one that makes predicting the long term almost hopeless.

My best guess is the Agentic Buildout: a big, fast change you can still make sense of, where the value slides away from thin software toward data, orchestration, and human accountability, and where jobs change task by task well before they change wholesale. The two extremes bound the range — a Long Diffusion that just takes decades, and a Compression that squeezes it all into a few years. Which one you're heading for gets written down early and quietly, in reliability curves and in how close open models stay to the best ones, long before it ever reaches the news.

So plan for the middle, keep some room for the extremes, and watch the signs, not the headlines.

Cited anchors.
Enterprise RAG market ≈ US$1.9B (2025) → US$9.9B (2030), ~38% CAGR — MarketsandMarkets, Nov 2025.
≈95% of enterprise generative-AI pilots reach no measurable P&L impact — MIT “GenAI Divide” study, 2025.
Category-leader enterprise-search platform: ≈US$200M ARR, ≈US$7.2B valuation, early 2026 — company & analyst reporting.
