Model
Generator-verifier loop with a per-task autonomy slider. The per-task value function V(u, v; θ) decomposes into four orthogonal channels (quality, attention, risk, skill); V is exactly bilinear in (u, v) so per-task optima land at three corners — do-yourself, self-automator, spec-driven. Centaur and cyborg arise as aggregate-level patterns from cross-sub-task corner mixing. Portfolio aggregation under a daily attention budget surfaces the Lagrangian shadow price μ that reroutes longer tasks first when budget binds. Five cruxes, six Stage-4 fitting targets, three engaged objections. Interactive two-tab dashboard included.
TLDR
The lit review documents a research landscape; the topology stripped it down to load-bearing structure. This stage formalises the cleanest target the topology surfaced: a generator-verifier loop with a per-task autonomy slider, designed to survive capability change rather than encode a snapshot of any specific model’s frontier. The optimisation target is output quality per unit of human attention for an individual knowledge worker. The formalisation makes three moves at once: decomposition (four orthogonal value channels: quality, attention, risk, skill), generating function (a per-task value function whose three corner solutions, mixed across sub-tasks, reproduce the five empirically observed workflow modes), and integration (a single formalism that composes Karpathy’s slider, Mollick’s typology, Vasconcelos verification-economics, Bastani’s atrophy, Bainbridge’s substitution myth, and Madras-Mozannar L2D into one object).
The per-task value function is V(u, v; θ) = Q(u,v) − α·A(u,v) + λ·S(u,v) − σ·R(u,v), where u is the autonomy level (fraction of the task delegated to AI) and v is the verification depth (fraction of AI output independently checked). The four channels are conceptually distinct mechanisms: quality Q rewards letting the better agent do the work, with verified output achieving the complementary-product ceiling c_⋆ = c_AI + (1 − c_AI)·c_H (either AI was right or human catches the error); attention A charges for human time and for the irreducible monitoring cost even at full delegation (the L1 substitution-myth invariant baked in via ε > 0); risk R penalises uncaught AI errors at a rate proportional to stakes σ; skill S rewards practice and penalises unverified delegation (Bastani-style atrophy). V turns out to be exactly bilinear in (u, v) — collecting terms gives V = K_0 + K_u·u + K_v·v + K_uv·u·v — which means the maximum on the unit square is always at a corner. Three corners are candidates: (0, 0) do-yourself, (1, 0) self-automator, (1, 1) spec-driven (the corner (0, 1) is dominated since verifying with no AI involvement is pure cost). Centaur and cyborg modes do not arise as per-task optima — they appear only as aggregate-level patterns when a worker mixes corner policies across sub-tasks with heterogeneous θ. This is a substantive prediction, not a limitation.
The portfolio extension aggregates per-task decisions under a daily attention budget. The headline result the model is designed to produce is S1 (workflow architecture > model capability): on the same task mix and same c_AI, optimal routing dominates “max-AI” (self-automate everything) and dominates the naive flat-cyborg heuristic (u=0.7, v=0.3 everywhere). The bilinearity finding sharpens this: the naive flat-cyborg policy is exactly the failure mode — it applies an interior (u, v) value that the bilinear structure says no individual sub-task should land at. Optimal routing differentiates across tasks (different corners for different θ); the aggregate (ū, v̄) across the day looks interior because the corners differ, not because any single decision is interior. The generating function is parameterised by capability (the L3 invariant): if c_AI rises uniformly across tasks, the model rebalances; if it rises only on certain task types, the boundary of optimal u* shifts but the shape of the routing rule stays put.
This stage produces seven things: (1) a math object — V(u, v; θ) and the four-channel decomposition; (2) a workflow-mode classifier — three-corner per-task router plus a five-region label-partition of the (u, v) plane for observed worker behaviour; (3) a portfolio aggregator with budget-aware shadow-price routing (μ-binary-search over per-task α_eff = α + μ·g); (4) the interactive two-tab dashboard below; (5) five cruxes of the model (load-bearing claims whose collapse rebuilds it); (6) six Stage-4 fitting targets (parameter calibrations and qualitative predictions the data pipeline should test); (7) three engaged objections (c_AI unobservable, model just recovers practitioner intuitions, model is single-shot not strategic) with steelmen and what survives. Scope is explicit: the formalisation does not capture the aggregate-zero puzzle (E4/O2 — organisational dynamics), cross-task productivity bundling (G8/Cowen), c_AI miscalibration on novel tasks (O7), sycophancy as a verification-degrader (E13), or frontier migration over time (O4). These are named scope-limits, not silent assumptions.
[Interactive dashboard, per-task router. Panels: Task parameters (θ), Optimal policy, Channel decomposition, Diagnostics.]
Preset shown: AI mildly stronger, modestly cheap verification, moderate stakes. The optimum lands at spec-driven: AI does the synthesis, you read the output.
Channel decomposition: each bar shows that channel's contribution to V at the optimum. Q is gain over the human-only baseline; A is attention saved (or spent) vs. the M-only floor; R is the stakes-weighted risk penalty; S is the skill change weighted by λ.
V(u, v; θ) = Q(u,v) − α·A(u,v) + λ·S(u,v) − σ·R(u,v). Constants: α = 1.00 (normalised), ε = 0.15 (residual attention at u=1; L1 invariant), β = 0.05 (per-task atrophy rate), M = 0.08 (routing tax). Mode thresholds: u_lo = 0.15, u_hi = 0.85, v_lo = 0.3, v_hi = 0.6. Optimum found by 41×41 grid search on the unit square. Stage-4 fitting will tighten α, ε, β, M against telemetry data; mode-distribution match against Randazzo BCG sample (~60% cyborg / ~30% centaur / ~10% self-automator) is Q3 of the named fitting targets.
How to read this stage
The dashboard above is the artifact. Everything below is the spec.
Two interactive surfaces:
- Per-task router. Inputs: (c_H, c_AI, φ, σ, λ) for one task. Outputs: optimal (u*, v*), the workflow-mode label, and a four-bar decomposition showing which channel dominates. Use this to answer “for a task with these characteristics, what’s the right way to use AI?”
- Day portfolio. Inputs: a basket of task types with counts. Outputs: total quality, total attention, total skill change under four strategies (always-self / max-AI / naive cyborg / optimal routing). Use this to see the S1 effect: same AI, different workflow architectures, very different outcomes.
If the dashboard says one thing and your gut says another, the diagnostic is to check (a) whether your (c_H, c_AI) estimates are calibrated and (b) whether the constants (α, ε, β, M) are calibrated for your task density. This is exactly what Stage 4 (the data pipeline) is for.
1. The formalisation moves
Three things this stage does — explicit so they’re inspectable separately.
Move 1 — Decomposition. The per-task value V splits into four orthogonal channels (Q, A, R, S). “Orthogonal” here means: each channel responds to (u, v) in a distinguishable way, so when a slider moves, the user can see which channel is driving the change. This isn’t a stylised choice — it’s how the topology’s mechanism nodes (G3, G7, G9, L1) actually compose. Without the decomposition, “the gain from AI” is a black box; with it, the user can ask “is this gain coming from quality, time-saved, or skill?” and answer.
Move 2 — Generating function. The five workflow modes (P1 / E10) are not given as a typology to be matched. Three of them (do-yourself, self-automator, spec-driven) are generated directly as solutions to argmax V(u, v; θ) under different parameter regimes; the other two (centaur, cyborg) are generated as mixtures of those solutions across sub-tasks (§4). This is the L3 invariant in operational form: change θ, the optimum moves, the mode label changes, but the function that produces the mode is fixed. A practitioner who hardcodes “use Cursor for boilerplate, ChatGPT for strategy” gets a lookup table; this gets a generating function.
Move 3 — Integration. Six prior objects compose into one: Karpathy’s autonomy slider (P2 — the u axis), Shneiderman’s 2D framework (L4 — the (autonomy, control) plane is the (u, v) plane), Vasconcelos verification-economics (G3 — the −α·v·φ term), Bastani’s guardrail finding (E9 — the −β·u·(1-v) skill term), Bainbridge’s substitution myth (L1 — the ε > 0 residual attention), and Madras-Mozannar L2D (L2 — the joint-surface optimisation at portfolio level). None of these alone is the model; the model is what they jointly imply.
What’s not yet ready for formalisation, kept in §9 as scope-limits: cross-task bundling (G8), organisational absorption (E4/O2), miscalibration of c_AI (O7), sycophancy as quality-degrader (E13), frontier migration over time (O4).
2. Variables and objects
Decision variables (per task), continuous on the unit square:
| Symbol | Range | Meaning |
|---|---|---|
| u | [0, 1] | Autonomy level: fraction of generation delegated to AI |
| v | [0, 1] | Verification depth: fraction of AI output independently checked |
Task parameters (the vector θ):
| Symbol | Range | Meaning |
|---|---|---|
| c_H | [0, 1] | Human capability on this task type |
| c_AI | [0, 1] | AI capability on this task type |
| φ | ≥ 0 | Verification-cost ratio (verify-time / generate-time) |
| σ | [0, 1] | Stakes: weight on uncaught-error penalty |
| λ | [0, 1] | Skill-formation value: how much the worker cares about preserving this skill |
Constants (calibrated, not per-task):
| Symbol | Default | Meaning |
|---|---|---|
| α | 1.00 | Attention price (normalisation) |
| ε | 0.15 | Residual attention at full delegation: L1 invariant (substitution myth) |
| β | 0.05 | Skill-atrophy rate per unit of unverified delegation |
| M | 0.08 | Per-task metacognitive routing tax: A1/G1 invariant |
| c_⋆ | c_AI + (1−c_AI)·c_H | Verified-output ceiling: “either AI got it right OR human catches the error” (complementary product of independent error events) |
The constants are calibrated to the lit-review anchors (Mozannar CUPS for ε; Bastani 17% drop for β; Tankelevitch metacognitive load for M). They can be re-fit by Stage 4 against telemetry data.
3. The per-task value function
V(u, v; θ) = Q(u, v) − α·A(u, v) + λ·S(u, v) − σ·R(u, v)
3.1 Quality channel Q
Q(u, v) = (1 − u)·c_H + u·[(1 − v)·c_AI + v·c_⋆]
With probability (1 − u) the human did the generation and quality is c_H. With probability u the AI did the generation, of which fraction (1 − v) ships unverified at quality c_AI and fraction v is verified. Verified output achieves quality c_⋆ = c_AI + (1 − c_AI)·c_H = 1 − (1 − c_H)·(1 − c_AI) — the probability that either the AI got it right or (it didn’t and) the human catches the error, treating the two error events as independent. Linear in u for fixed v; linear in v for fixed u.
The complementary-product form natively handles the deskilled-verifier limit (c_H → 0 ⇒ c_⋆ → c_AI — verification adds nothing when the human can’t recognise errors) and the verifier-stronger limit (c_H → 1 ⇒ c_⋆ → 1 — careful human verification approaches a quality ceiling). Pass 1 used c_⋆ = max(c_H, c_AI), which over-credits verification when c_H > c_AI (as if the human catches every AI error) and under-credits it when c_H < c_AI (as if a partly-skilled human catches no AI errors). The form here treats both with one expression and one structural assumption (independence of human and AI error events).
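The two limits and the Pass 1 comparison can be checked numerically. This is a minimal sketch; the function names `verified_ceiling` and `pass1_ceiling` are illustrative and are not taken from the dashboard code.

```python
def verified_ceiling(c_h: float, c_ai: float) -> float:
    """Complementary-product ceiling: c_star = c_AI + (1 - c_AI) * c_H."""
    return c_ai + (1.0 - c_ai) * c_h


def pass1_ceiling(c_h: float, c_ai: float) -> float:
    """Pass 1 form, c_star = max(c_H, c_AI), kept only for comparison."""
    return max(c_h, c_ai)


if __name__ == "__main__":
    # (c_H, c_AI) pairs: deskilled verifier, perfect verifier, and two mixed cases.
    for c_h, c_ai in [(0.0, 0.7), (1.0, 0.7), (0.85, 0.70), (0.40, 0.70)]:
        new = verified_ceiling(c_h, c_ai)
        old = pass1_ceiling(c_h, c_ai)
        print(f"c_H={c_h:.2f} c_AI={c_ai:.2f}  product={new:.3f}  max={old:.3f}")
```

In the mixed cases the product form always sits at or above the max form and below 1, which is the "partly-skilled human catches some errors" behaviour the paragraph describes.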
Caveat: c_H enters both the generation quality (when u = 0) and the verification benefit (the extra catching power at v > 0). Karpathy’s G9 generator-verifier asymmetry says verification is typically easier than generation: recognising an error costs less than producing a correct answer from scratch. A more accurate model would carry a separate verifier capability c_V ≥ c_H, which matters most for low-c_H workers. Held for Stage 4; named in §9 scope-limits.
3.2 Attention channel A
A(u, v) = (1 − u·(1 − ε)) + v·φ + M
Three pieces:
- (1 − u·(1 − ε)): human-side generation cost. At u = 0, the human does it all (cost = 1, the base generation time). At u = 1, residual attention ε remains: every offload creates monitoring/coordination work (Bainbridge L1). ε > 0 is the L1 invariant in operational form: a model with ε = 0 would predict that full delegation is free of attention cost, which is exactly the substitution myth.
- v·φ: verification cost. Linear in verification depth, scaled by the per-task verification-cost ratio φ. This is the G3 (Vasconcelos verification-economics) term.
- M: per-task metacognitive routing tax. Constant per task: classifying, choosing the workflow, monitoring AI for handoffs. Tankelevitch’s metacognitive-demand finding (G1) compressed to a constant; Stage 4 can test whether M is task-type-dependent.
Note that ε is what keeps the generation-side term from vanishing at u = 1. Together with the constant M, it is what makes “max-AI” not free.
3.3 Risk channel R
R(u, v) = u·(1 − v)·(1 − c_AI)
Probability-of-uncaught-error: AI generated the output (u), it was not verified (1 − v), and the AI was wrong (1 − c_AI). Multiplied by stakes σ in the value function. This is the G2 (ironies-of-automation) term: rare critical errors get missed precisely when delegation is high and verification is low.
For high-σ tasks, this term is large enough to drive v → 1 (the spec-driven / independent-then-synthesize regime, E8 — Everett 2025). For low-σ tasks, it’s negligible and the optimum can sit at v = 0 without harm.
3.4 Skill channel S
S(u, v) = (1 − u) − β·u·(1 − v)
Two terms:
- (1 − u): practice. The human builds skill on the fraction of the work they did themselves.
- −β·u·(1 − v): atrophy. Unverified delegation erodes capacity. Verification preserves engagement (this is Bastani’s “hint mode” finding E9: guardrails → no atrophy). The product u·(1 − v) is exactly the self-automator regime where atrophy is fastest.
λ is the worker’s per-task valuation of preserving the skill. For tasks the worker explicitly wants to maintain capacity on (their core craft), λ is high and the skill term has bite. For boilerplate or one-off tasks, λ is low and skill is correctly ignored.
3.5 Putting it together
V(u, v; θ) = (1 − u)·c_H + u·[(1 − v)·c_AI + v·c_⋆] ← Q
− α·[(1 − u·(1 − ε)) + v·φ + M] ← −α·A
+ λ·[(1 − u) − β·u·(1 − v)] ← +λ·S
− σ·[u·(1 − v)·(1 − c_AI)] ← −σ·R
Five parameters in θ, two decisions, four channels. Exactly bilinear in (u, v): collecting terms,
V(u, v; θ) = K_0 + K_u·u + K_v·v + K_uv·u·v
with
- K_0 = c_H − α − α·M + λ
- K_u = (c_AI − c_H) + α·(1 − ε) − λ·(1 + β) − σ·(1 − c_AI)
- K_v = −α·φ (K_v captures the cost of verification when there is no AI to verify, i.e., at u = 0; pure cost, hence ≤ 0; this is exactly why corner (0, 1) is dominated by (0, 0) below)
- K_uv = (c_⋆ − c_AI) + λ·β + σ·(1 − c_AI) = (1 − c_AI)·c_H + λ·β + σ·(1 − c_AI) (always ≥ 0: verification gain is monotone in u)
Bilinear functions on a unit square attain their maximum at a corner. The interior critical point, when it exists, is a saddle (Hessian eigenvalues ±K_uv). So the per-task optimum is at one of the four corners — and since K_v < 0, (0, 1) is dominated by (0, 0) (verifying when no AI is involved is pure cost). The three meaningful corners:
- (0, 0): do-yourself.
- (1, 0): full delegation, no verification (self-automator).
- (1, 1): full delegation with full verification (spec-driven / independent-then-synthesize).
Which corner wins depends on the signs of K_u, K_u + K_uv + K_v, and K_v + K_uv — three linear comparisons over θ.
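The collection of terms can be verified mechanically. The sketch below evaluates V channel by channel, evaluates the collected bilinear form, and compares the two on the same 41×41 grid the dashboard diagnostics mention. Function names (`value`, `coefficients`, `max_gap`) are illustrative assumptions; the constants are the document defaults.

```python
ALPHA, EPS, BETA, M = 1.00, 0.15, 0.05, 0.08  # defaults from the constants table


def value(u, v, c_h, c_ai, phi, sigma, lam):
    """V = Q - alpha*A + lam*S - sigma*R, built channel by channel."""
    c_star = c_ai + (1 - c_ai) * c_h
    q = (1 - u) * c_h + u * ((1 - v) * c_ai + v * c_star)
    a = (1 - u * (1 - EPS)) + v * phi + M
    s = (1 - u) - BETA * u * (1 - v)
    r = u * (1 - v) * (1 - c_ai)
    return q - ALPHA * a + lam * s - sigma * r


def coefficients(c_h, c_ai, phi, sigma, lam):
    """Collected bilinear coefficients (K_0, K_u, K_v, K_uv)."""
    k0 = c_h - ALPHA - ALPHA * M + lam
    ku = (c_ai - c_h) + ALPHA * (1 - EPS) - lam * (1 + BETA) - sigma * (1 - c_ai)
    kv = -ALPHA * phi
    kuv = (1 - c_ai) * c_h + lam * BETA + sigma * (1 - c_ai)
    return k0, ku, kv, kuv


def max_gap(theta, n=41):
    """Return (largest |V - bilinear form| on the grid, grid max, corner max)."""
    k0, ku, kv, kuv = coefficients(*theta)
    grid = [i / (n - 1) for i in range(n)]
    gap = max(
        abs(value(u, v, *theta) - (k0 + ku * u + kv * v + kuv * u * v))
        for u in grid for v in grid
    )
    grid_max = max(value(u, v, *theta) for u in grid for v in grid)
    corner_max = max(value(u, v, *theta) for u in (0, 1) for v in (0, 1))
    return gap, grid_max, corner_max


if __name__ == "__main__":
    theta = (0.65, 0.70, 0.40, 0.90, 0.30)  # (c_H, c_AI, phi, sigma, lam), probe D in §7
    print(max_gap(theta))
```

If the gap is at floating-point noise and the grid maximum equals the corner maximum, the grid search is doing no more work than a four-point comparison.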
4. Optimal policy: the three corners that win
V is bilinear → max at a corner. Of the four corners, (0, 1) is dominated by (0, 0) because K_v < 0 (verifying with no AI involvement is pure cost). Three candidates remain — and a clean decision tree determines the winner:
- Spec-driven (1, 1) wins iff K_u + K_v + K_uv > 0 and K_v + K_uv > 0.
- Else self-automator (1, 0) wins iff K_u > 0.
- Else do-yourself (0, 0) wins.
Equivalently: spec-driven dominates self-automator when α·φ < K_uv (verification cost is below benefit at full delegation: α·φ < (1 − c_AI)·c_H + λ·β + σ·(1 − c_AI)); self-automator dominates do-yourself when K_u > 0 (the attention savings + AI quality gain outweigh skill loss + stakes risk).
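The decision tree translates directly into a small router. This is a sketch under the default constants, not the dashboard's implementation; `coefficients` and `route` are hypothetical names.

```python
ALPHA, EPS, BETA, M = 1.00, 0.15, 0.05, 0.08  # defaults from the constants table


def coefficients(c_h, c_ai, phi, sigma, lam):
    """Return (K_0, K_u, K_v, K_uv) for one task's theta."""
    k0 = c_h - ALPHA - ALPHA * M + lam
    ku = (c_ai - c_h) + ALPHA * (1 - EPS) - lam * (1 + BETA) - sigma * (1 - c_ai)
    kv = -ALPHA * phi
    kuv = (1 - c_ai) * c_h + lam * BETA + sigma * (1 - c_ai)
    return k0, ku, kv, kuv


def route(c_h, c_ai, phi, sigma, lam):
    """Three-corner decision tree: returns ((u*, v*), mode label)."""
    _, ku, kv, kuv = coefficients(c_h, c_ai, phi, sigma, lam)
    # (1, 1) must beat both (0, 0) and (1, 0).
    if ku + kv + kuv > 0 and kv + kuv > 0:
        return (1, 1), "spec-driven"
    # Otherwise compare unverified delegation against doing it yourself.
    if ku > 0:
        return (1, 0), "self-automator"
    return (0, 0), "do-yourself"


if __name__ == "__main__":
    print(route(0.40, 0.70, 0.20, 0.20, 0.10))  # novice theta from probe A in §7
    print(route(0.65, 0.70, 0.40, 0.90, 0.30))  # theta from probe D in §7
```

The second branch needs no separate comparison against (1, 1): if the first test fails while K_v + K_uv > 0, then K_u is necessarily negative, so the tree falls through to do-yourself.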
Comparative statics — what moves the corner choice:
| Parameter increase | Effect on u* | Effect on v* (given u* = 1) |
|---|---|---|
| c_AI − c_H ↑ via c_AI rising (c_H fixed) | ↑ (K_u rises) | ↓ (K_uv falls: less verification benefit) |
| φ ↑ (verification expensive) | weakly ↓ via v*-flip | ↓ (K_v more negative) |
| σ ↑ (stakes) | weakly ↓ (K_u falls by σ·(1−c_AI)) | ↑ (K_uv rises by σ·(1−c_AI)) |
| λ ↑ (skill matters) | weakly ↓ (K_u falls by λ·(1+β)) | ↑ slightly (K_uv rises by λ·β) |
| c_AI ↑ alone | ↑ | ↓ (both (1−c_AI)·c_H and σ·(1−c_AI) fall) |
This is the L3 invariant in tabular form: if c_AI rises uniformly, optimal u* rises and v* falls. The structural shape of the rule does not change. Two signs worth flagging because they’re non-obvious: more reliable AI (c_AI ↑) reduces verification benefit (fewer errors to catch); higher stakes (σ ↑) pushes u* down (refuse AI when stakes are high) and v* up (if you do use AI, verify carefully) — a real tension the bilinear structure makes explicit.
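The signs in the table can be read off a one-parameter sweep of K_u and K_uv. A small sketch follows; the helper names and the sweep values are illustrative, and the base θ reuses the numbers from probe D in §7 (φ is omitted because it enters only K_v).

```python
ALPHA, EPS, BETA = 1.00, 0.15, 0.05  # defaults from the constants table


def k_u(c_h, c_ai, sigma, lam):
    """Slope in u at v = 0: net gain of unverified delegation over do-yourself."""
    return (c_ai - c_h) + ALPHA * (1 - EPS) - lam * (1 + BETA) - sigma * (1 - c_ai)


def k_uv(c_h, c_ai, sigma, lam):
    """Interaction term: benefit of verifying at full delegation, before its cost."""
    return (1 - c_ai) * c_h + lam * BETA + sigma * (1 - c_ai)


def sweep(param, values, base):
    """Vary one parameter of `base` and return (value, K_u, K_uv) rows."""
    rows = []
    for x in values:
        theta = dict(base, **{param: x})
        rows.append((x, k_u(**theta), k_uv(**theta)))
    return rows


BASE = {"c_h": 0.65, "c_ai": 0.70, "sigma": 0.90, "lam": 0.30}

if __name__ == "__main__":
    for name in ("c_ai", "sigma", "lam"):
        for x, ku, kuv in sweep(name, [0.2, 0.5, 0.8], BASE):
            print(f"{name}={x:.1f}  K_u={ku:+.3f}  K_uv={kuv:+.3f}")
```

For σ the two columns move by the same amount in opposite directions, which is the tension flagged above: the same σ·(1 − c_AI) term that penalises unverified delegation is the one that rewards verification.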
The five practitioner modes — labels for the (u, v) plane
The practitioner literature names five workflow modes. They span the (u, v) plane and are useful as vocabulary for labelling observed worker behaviour at any (u, v):
| Region | Mode | Practitioner anchor |
|---|---|---|
| u ≈ 0 | Do-yourself | (no-AI / refuse-AI) |
| u ∈ (0, 1), v high | Centaur | Mollick: clean handoff with verification gate |
| u ∈ (0, 1), v low/mid | Cyborg | Mollick: interleaved, partial verification |
| u ≈ 1, v high | Spec-driven / independent-then-synthesize | Everett 2025; Compound Engineering |
| u ≈ 1, v low | Self-automator | Randazzo HBS 26-036 (the trap) |
The per-task router only returns the three corners — (0, 0), (1, 0), (1, 1) — corresponding to do-yourself, self-automator, and spec-driven. Centaur and cyborg do not arise as per-task optima. They appear empirically as aggregate-level labels when a worker mixes corner policies across sub-tasks with heterogeneous θ — some sub-tasks done alone, others fully delegated; some AI outputs verified, others shipped. The day-level (ū, v̄) averages out to interior values that get labelled “cyborg” or “centaur” depending on the ratio. The Day Portfolio tab demonstrates this directly.
This is a substantive prediction of the formalisation, not a limitation. Bilinearity is faithful to the structure of the problem: V collects to one bilinear form V = K_0 + K_u·u + K_v·v + K_uv·u·v, even though three of the four channels (Q, R, S) carry their own u·v terms — they sum to one consolidated K_uv. The model says: at any single sub-task with a single θ, pick a corner — fully delegate or don’t, fully verify or don’t. The naive flat-cyborg strategy (u = 0.7, v = 0.3 applied to every task) is exactly the failure mode — applying an interior policy uniformly is what no individual sub-task should do under bilinearity.
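A toy day makes the aggregation point concrete. In the sketch below, two task types reuse θ values from the §7 probes and the third is a hypothetical core-craft task; the counts are invented for illustration. Each task is routed to its best corner, and the day-level averages are then compared with the flat (0.7, 0.3) policy.

```python
ALPHA, EPS, BETA, M = 1.00, 0.15, 0.05, 0.08  # defaults from the constants table
CORNERS = ((0, 0), (1, 0), (1, 1))


def value(u, v, theta):
    """Per-task V(u, v; theta) with theta = (c_H, c_AI, phi, sigma, lam)."""
    c_h, c_ai, phi, sigma, lam = theta
    c_star = c_ai + (1 - c_ai) * c_h
    q = (1 - u) * c_h + u * ((1 - v) * c_ai + v * c_star)
    a = (1 - u * (1 - EPS)) + v * phi + M
    s = (1 - u) - BETA * u * (1 - v)
    r = u * (1 - v) * (1 - c_ai)
    return q - ALPHA * a + lam * s - sigma * r


def best_corner(theta):
    """Corner with the highest V among the three candidates."""
    return max(CORNERS, key=lambda uv: value(uv[0], uv[1], theta))


# (theta, count); counts are hypothetical.
DAY = [
    ((0.70, 0.85, 0.20, 0.20, 0.10), 4),  # routine task, theta from probe E
    ((0.65, 0.70, 0.40, 0.90, 0.30), 3),  # high-stakes task, theta from probe D
    ((0.90, 0.30, 0.60, 0.50, 0.80), 3),  # hypothetical core-craft task
]


def day_summary(day, flat=(0.7, 0.3)):
    """Return (u_bar, v_bar, total V when routed, total V under the flat policy)."""
    n = sum(count for _, count in day)
    picks = [(best_corner(theta), theta, count) for theta, count in day]
    u_bar = sum(uv[0] * c for uv, _, c in picks) / n
    v_bar = sum(uv[1] * c for uv, _, c in picks) / n
    v_routed = sum(value(uv[0], uv[1], th) * c for uv, th, c in picks)
    v_flat = sum(value(flat[0], flat[1], th) * c for th, c in day)
    return u_bar, v_bar, v_routed, v_flat


if __name__ == "__main__":
    print(day_summary(DAY))
```

With these counts the routed day averages to the same (ū, v̄) as the flat policy while every individual decision sits at a corner, so the two are indistinguishable in aggregate telemetry and differ only in total value.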
5. Portfolio aggregation — the day
A worker faces N tasks per day, each with its own θ_i. Total attention budget T. The portfolio problem:
maximise Σ_i Q_i(u_i, v_i) · count_i
subject to Σ_i A_i(u_i, v_i) · g_i · count_i ≤ T
where g_i is the task type’s base generation time. The Lagrangian gives a shadow price μ ≥ 0 on the budget constraint, and the per-task choice rule becomes
argmax V_i − μ·A_i·g_i = argmax V_i with α replaced by α_eff_i = α + μ·g_i
— that is, budget pressure raises the effective attention price, more so for longer-base-time tasks. When μ = 0 the budget is slack and per-task choice is unconstrained argmax V; when μ > 0 the budget binds and α_eff rises until total absolute attention Σ A_i·g_i·count_i fits T. The biasing is structural: raising α_eff raises K_u (self-automator becomes more attractive vs. do-yourself) and makes K_v = −α_eff·φ more negative (verification becomes less attractive). So as the budget tightens, longer tasks reroute first from spec-driven (1, 1) to self-automator (1, 0) — the lowest-A corner.
This is the L2 invariant operationalised at portfolio level. The naive practitioner rule “use AI when AI is better” compares c_H to c_AI task-by-task in isolation; the joint-surface rule compares marginal Q per unit attention saved against the day’s shadow price. They give the same answer when attention is abundant (μ ≈ 0); they diverge sharply when the day is tight.
Strategies the dashboard compares:
- Always-self. u_i = 0 for all i. No AI, no atrophy, no verification cost, but no productivity gain. The pre-AI baseline.
- Max-AI. u_i = 1, v_i = 0 for all i. The corner uniformly applied. Fast, but high risk on high-σ tasks and accelerated atrophy on high-λ tasks.
- Naive cyborg. u_i = 0.7, v_i = 0.3 for all i. A flat interior policy applied uniformly, which is exactly what the bilinear structure says no individual task should land at. Interior values arise legitimately only as averages over heterogeneous-θ sub-tasks. Applying them uniformly violates the structure and underperforms.
- Optimal routing (budget-aware). Per-task corner choice under the shadow-price-adjusted α_eff_i = α + μ·g_i, with μ solved by binary search until the budget is met (or until even uniform self-automator overflows). Each task lands at the corner appropriate to its θ and the day’s binding budget. The aggregate (ū, v̄) across the day looks interior because different tasks land at different corners; the dashboard surfaces μ below the strategy table so the user can see when the budget is biting.
The headline prediction (S1): optimal routing dominates max-AI by quality-per-attention and dominates always-self by attention efficiency, on the same c_AI. The gap between optimal routing on mid-tier AI and naive-flat routing on frontier AI is the empirical bound the model places on “workflow architecture > model capability.” Mechanism: optimal routing differentiates across tasks (different corners for different θ) and reroutes under budget pressure (shadow price μ); naive uniformly applies an interior policy that is structurally never the per-task optimum at any α.
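The budget-aware routing rule can be sketched as a bisection on μ. This is an illustration of the α_eff = α + μ·g rule as stated in this section, not the dashboard's code: the function names, the base times g, the counts, and the budget values are all hypothetical, while the θ values are borrowed from the §7 probes.

```python
ALPHA, EPS, BETA, M = 1.00, 0.15, 0.05, 0.08  # defaults from the constants table
CORNERS = ((0, 0), (1, 0), (1, 1))


def attention(u, v, phi):
    """A(u, v) in units of the task's base generation time."""
    return (1 - u * (1 - EPS)) + v * phi + M


def value(u, v, theta, alpha):
    """Per-task V with an explicit attention price alpha."""
    c_h, c_ai, phi, sigma, lam = theta
    c_star = c_ai + (1 - c_ai) * c_h
    q = (1 - u) * c_h + u * ((1 - v) * c_ai + v * c_star)
    s = (1 - u) - BETA * u * (1 - v)
    r = u * (1 - v) * (1 - c_ai)
    return q - alpha * attention(u, v, phi) + lam * s - sigma * r


def choose(theta, g, mu):
    """Best corner under alpha_eff = alpha + mu * g."""
    alpha_eff = ALPHA + mu * g
    return max(CORNERS, key=lambda uv: value(uv[0], uv[1], theta, alpha_eff))


def total_attention(tasks, mu):
    """Sum of A_i * g_i * count_i at the corners chosen under mu."""
    return sum(attention(*choose(th, g, mu), th[2]) * g * n for th, g, n in tasks)


def solve_mu(tasks, budget, iters=50):
    """Smallest mu (to bisection tolerance) that fits the budget, or None."""
    if total_attention(tasks, 0.0) <= budget:
        return 0.0  # budget is slack
    floor = sum((EPS + M) * g * n for _, g, n in tasks)  # everything at (1, 0)
    if floor > budget:
        return None  # even uniform self-automator overflows
    lo, hi = 0.0, 1.0
    while total_attention(tasks, hi) > budget and hi < 1e9:
        hi *= 2.0  # grow the bracket until the budget is met
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_attention(tasks, mid) <= budget:
            hi = mid
        else:
            lo = mid
    return hi


# (theta, g, count) with theta = (c_H, c_AI, phi, sigma, lam); g and count are hypothetical.
TASKS = [
    ((0.65, 0.70, 0.40, 0.90, 0.30), 2.00, 2),  # probe D theta, long task
    ((0.85, 0.70, 0.20, 0.20, 0.10), 0.50, 6),  # probe A expert theta
    ((0.70, 0.85, 0.20, 0.20, 0.10), 0.25, 8),  # probe E theta, short task
]

if __name__ == "__main__":
    mu = solve_mu(TASKS, budget=3.0)
    print(mu, [choose(th, g, mu) for th, g, _ in TASKS])
```

In this basket the long task is the first to drop verification as the budget tightens, because its α_eff rises fastest in μ; the shorter expert task keeps its (1, 1) corner at the same shadow price.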
6. Calibration anchors
Where the constants come from. None of these is precise; all are Stage-4 fitting targets.
- α = 1: normalisation. Attention is the numéraire; all other costs are denominated in attention units.
- ε = 0.15: Mozannar CUPS (E12) shows verification + monitoring is a “substantial fraction” of total interaction time even when AI is doing the generation. 15% is a midpoint of the reported range; Stage 4 should tighten this from telemetry.
- β = 0.05: Bastani’s 17% unassisted-performance drop (E9) over a session of ~30 unguardrailed tasks → ~0.5% atrophy per task at u = 1, v = 0. Setting β = 0.05 means the model implies ~5% atrophy per task in the worst-case regime, which compounds to Bastani’s order-of-magnitude over a week. The lit review explicitly notes most studies are under 12 months; β at the per-task scale is what compounds to the longitudinal scale.
- M = 0.08: Tankelevitch (G1) finds metacognitive load is the binding constraint for AI users; CUPS (E12) finds verification + planning consume a meaningful fraction of total time. 8% per task is a calibration anchor consistent with the magnitude of the metacognitive-bottleneck claim. Stage 4 should test whether M varies by task type (it likely does: high-stakes strategic decisions have larger M than routine email).
At the per-task level, the defaults predict (1, 0) self-automator at routine-low-stakes corners (Randazzo’s ~10% empirically), (1, 1) spec-driven where verification benefit dominates verification cost, and (0, 0) do-yourself in outside-frontier regimes where K_u < 0 and verification is too expensive to rescue delegation (K_u + K_v + K_uv ≤ 0). The full Randazzo 60/30/10 (cyborg/centaur/self-automator) distribution emerges only at the aggregate level, when a worker’s day mixes corner policies across heterogeneous-θ sub-tasks (see §4). The defaults predict outside-frontier harm (Dell’Acqua E3) when c_H > c_AI, K_u < 0, and the worker nonetheless mis-routes to unverified delegation (1, 0); the harm is from disobeying the optimum, not from a model output.
7. Worked anchors against the empirical record
Six probes. Each picks a parameter regime, computes the corner optimum exactly, and compares to the lit-review anchor.
A. Brynjolfsson (E1, E2) — customer-service novice +34%, top performers ~0%.
Novice: θ = (c_H = 0.40, c_AI = 0.70, φ = 0.20, σ = 0.20, λ = 0.10). c_⋆ = 0.70 + 0.30·0.40 = 0.82. Then K_u = 0.30 + 0.85 − 0.105 − 0.06 = 0.985 > 0 (full delegation beats do-yourself), and K_v + K_uv = −0.20 + (0.12 + 0.005 + 0.06) = −0.015 < 0 (verification cost barely exceeds benefit). Optimum: (1, 0) self-automator. Quality lift over always-self: c_AI − c_H = +0.30. Direction matches the +34% Brynjolfsson finding (which is in resolution rate, mixing speed and accuracy).
Expert: θ = (c_H = 0.85, c_AI = 0.70, φ = 0.20, σ = 0.20, λ = 0.10). c_⋆ = 0.70 + 0.30·0.85 = 0.955. Then K_u = −0.15 + 0.85 − 0.105 − 0.06 = 0.535 > 0, and K_v + K_uv = −0.20 + (0.255 + 0.005 + 0.06) = +0.12 > 0 (verification benefit dominates because c_⋆ − c_AI = 0.255 is large — a skilled human catches AI errors). Optimum: (1, 1) spec-driven. Quality lift: c_⋆ − c_H = +0.105, only +12% relative.
So the model produces: novice at full-delegate-no-verify (Q = 0.70 from c_AI alone), expert at full-delegate-full-verify (Q = 0.955 = AI augmented by skilled human catching). Both delegate; the difference is in verification depth. The empirical “expert ~0%” finding reflects throughput ceiling (experts already at maximum call rate, can’t redeploy saved attention to more calls) rather than zero quality lift — a context the model doesn’t carry.
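The arithmetic for probe A can be reproduced in a few lines. A minimal sketch: `probe` is a hypothetical helper, and it reports only the quantities quoted above (c_⋆, K_u, K_v + K_uv, and the quality lift at the chosen corner).

```python
ALPHA, EPS, BETA = 1.00, 0.15, 0.05  # defaults from the constants table


def probe(c_h, c_ai, phi, sigma, lam):
    """Recompute the quantities quoted for one worked anchor."""
    c_star = c_ai + (1 - c_ai) * c_h
    k_u = (c_ai - c_h) + ALPHA * (1 - EPS) - lam * (1 + BETA) - sigma * (1 - c_ai)
    k_uv = (c_star - c_ai) + lam * BETA + sigma * (1 - c_ai)
    k_v = -ALPHA * phi
    verify = k_v + k_uv > 0 and k_u + k_v + k_uv > 0
    # Quality at the winning corner: c_star at (1, 1), c_AI at (1, 0), c_H at (0, 0).
    q_opt = c_star if verify else (c_ai if k_u > 0 else c_h)
    return {
        "c_star": c_star,
        "K_u": k_u,
        "K_v+K_uv": k_v + k_uv,
        "Q_opt": q_opt,
        "lift": q_opt - c_h,
    }


if __name__ == "__main__":
    print("novice", probe(0.40, 0.70, 0.20, 0.20, 0.10))
    print("expert", probe(0.85, 0.70, 0.20, 0.20, 0.10))
```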
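B. Dell’Acqua BCG (E3) — outside-frontier 19-pp quality drop. θ = (c_H = 0.70, c_AI = 0.40, φ = 0.30, σ = 0.50, λ = 0.50). c_⋆ = 0.40 + 0.60·0.70 = 0.82. Then K_u = −0.30 + 0.85 − 0.525 − 0.30 = −0.275 < 0, so do-yourself (0, 0) beats unverified delegation (1, 0). The §4 tree also requires the spec-driven check: K_v + K_uv = −0.30 + (0.42 + 0.025 + 0.30) = +0.445 and K_u + K_v + K_uv = +0.17 > 0, so at these constants fully verified delegation (1, 1) narrowly beats do-yourself; (0, 0) is the overall optimum only once verification is costlier than φ ≈ 0.47. The outside-frontier harm is the unverified mis-route: if the worker lands at (1, 0), quality drops from c_H = 0.70 to c_AI = 0.40, a 30 pp loss. The empirical 19 pp reflects partial mis-routing (some subjects partially used AI, some didn’t); the model’s prediction is a clean upper bound on the harm.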
C. Bastani PNAS (E9) — guardrails preserve skill. Generic learning task with high skill-formation: θ = (c_H = 0.40, c_AI = 0.70, φ = 0.30, σ = 0.20, λ = 0.50). c_⋆ = 0.82. K_u = 0.30 + 0.85 − 0.525 − 0.06 = 0.565 > 0. K_v + K_uv = −0.30 + (0.12 + 0.025 + 0.06) = −0.095 < 0. Optimum: (1, 0) self-automator.
This is a real and honest finding: at lit-review-anchored constants (β = 0.05), the skill-preservation push toward verification (λ·β·u = 0.025 at u = 1) is too small to flip the corner against verification cost. The guardrail effect is structurally present: at the (1, 1) corner S = 0 (no atrophy) versus S = −0.05 at (1, 0), so λ·ΔS = +0.025. But the verification cost (α·φ = 0.30) dominates. Numerically, self-automator beats spec-driven by α·φ − K_uv = 0.30 − 0.205 = 0.095 at default constants. To make guardrails decisive (flip the corner to spec-driven), the model needs λ·β > α·φ − [(1 − c_AI)·c_H + σ·(1 − c_AI)] = 0.30 − 0.18 = 0.12. There are three routes to that threshold. At default λ = 0.50, β must exceed 0.24. At β = 0.05, λ alone cannot flip the corner (it would require λ > 2.4, impossible since λ ∈ [0, 1]). At default β = 0.05 and λ = 0.50, lowering φ flips the corner once φ < 0.205, roughly a third cheaper than the 0.30 used here.
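The flip thresholds quoted for probe C follow from one inequality, K_v + K_uv > 0, solved for each parameter in turn. A short sketch, with `flip_thresholds` as a hypothetical helper name:

```python
def flip_thresholds(c_h, c_ai, phi, sigma, lam, alpha=1.0, beta=0.05):
    """Thresholds at which spec-driven (1, 1) overtakes self-automator (1, 0)."""
    # K_uv without its skill term: the catching gain plus the stakes term.
    base = (1 - c_ai) * c_h + sigma * (1 - c_ai)
    # lam * beta must exceed this gap for K_v + K_uv to turn positive.
    gap = alpha * phi - base
    return {
        "required_lam_beta": gap,
        "beta_min_at_lam": gap / lam,    # holding lam fixed
        "lam_min_at_beta": gap / beta,   # holding beta fixed
        "phi_max": (base + lam * beta) / alpha,  # holding lam and beta fixed
    }


if __name__ == "__main__":
    # theta for probe C: (c_H, c_AI, phi, sigma, lam)
    print(flip_thresholds(0.40, 0.70, 0.30, 0.20, 0.50))
```

Because λ is capped at 1, the `lam_min_at_beta` entry doubles as a feasibility check: any value above 1 means no admissible λ can flip the corner at that β.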
Honest reading: the model says the guardrail effect is real but weak at default constants. Stage-4 fitting target Q2 is whether β should be larger to match Bastani’s empirical magnitude. This is not a model failure — it’s the model surfacing a calibration question the lit review left implicit.
D. Everett (E8) — independent-then-synthesize restores complementarity. θ = (c_H = 0.65, c_AI = 0.70, φ = 0.40, σ = 0.90, λ = 0.30). c_⋆ = 0.70 + 0.30·0.65 = 0.895. K_u = 0.05 + 0.85 − 0.315 − 0.27 = 0.315 > 0. K_v + K_uv = −0.40 + (0.195 + 0.015 + 0.27) = +0.08 > 0. Optimum: (1, 1) spec-driven. Mechanism: the σ·(1 − c_AI) = 0.27 term in K_uv makes verification valuable precisely because stakes are high and AI is fallible. Matches Everett’s lit-review story exactly.
E. Randazzo self-automator (E10) — ~10% of consultants in the trap. Routine consulting task: θ = (c_H = 0.70, c_AI = 0.85, φ = 0.20, σ = 0.20, λ = 0.10). c_⋆ = 0.955. K_u = 0.15 + 0.85 − 0.105 − 0.03 = 0.865 > 0. K_v + K_uv = −0.20 + (0.105 + 0.005 + 0.03) = −0.06 < 0. Optimum: (1, 0) self-automator. Self-automator is the correct policy at this θ — the empirical finding “~10% of consultants are self-automators” is about θ-distribution (~10% of work-instances have these characteristics), not about systematic mis-routing. A misread of Randazzo’s data as “self-automator is always wrong” is a category error the model corrects.
F. Schoenegger (E18) — even overconfident GPT improves forecasting +23–43%. This is outside the model’s current formalism. The model attributes any gain at u > 0 to c_AI (advice quality), but Schoenegger’s finding suggests structured reasoning is doing significant work independent of the AI’s confidence calibration. A constant +δ to Q whenever u > 0 would represent this — held for future passes if it changes downstream predictions. Honest gap.
Cross-context note: outcome heterogeneity across these anchors
The six papers above measure different outcome variables. The model’s Q (“probability of correct/high-quality output”) maps cleanly onto Dell’Acqua, Everett, and Schoenegger (all output-quality measures). Brynjolfsson’s “issues resolved per hour” maps approximately onto Q × call-rate, where call-rate depends on attention saved (A); the model’s novice prediction (a +0.30 lift in Q) is a Q-only claim, while Brynjolfsson’s empirical 34% bundles throughput. Bastani’s “unassisted retest performance” is a downstream effect of the S channel accumulated over many task instances, not a per-task Q measurement. Randazzo’s “behavioural-mode distribution” is a categorical prediction over which corner the worker chooses, not a quality measure at all. Mozannar’s CUPS data is process telemetry (time fractions across interaction states), informative for α and ε calibration but not for Q.
Stage 4 must disaggregate which constants get fit against which outcome types — pooling them as a single calibration target would silently average over methodological apples and oranges. This is named explicitly as Q1–Q6 in §11.
8. The five cruxes
Load-bearing claims of the model. Collapse rebuilds it.
C1 — Two-axis decision space (u, v). The decision is reduced to autonomy and verification depth. If the actual workflow choice space has more dimensions that matter — e.g., context-engineering depth, prompt-iteration count, tool selection — the model is incomplete. What would flip it: empirical evidence that two workflows with identical (u, v) but different context-engineering produce systematically different outcomes (which the practitioner literature suggests is real — S4).
C2 — Verification effectiveness equals generation skill. c_⋆ = c_AI + (1 − c_AI)·c_H treats c_H as both generation skill and verifier capability — a worker who’d produce 40%-quality output alone catches 40% of AI errors. Karpathy’s G9 generator-verifier-asymmetry says verification is typically easier than generation; a separate c_V ≥ c_H parameter would make low-c_H workers more effective verifiers. What would flip it: empirical evidence that verifier-recognition rates are uncorrelated with generation skill (would require introducing c_V). Deskilled-verifier worry is partially handled by the formula already (c_H → 0 ⇒ c_⋆ → c_AI), but the mechanism of skilled-but-not-creative verifier (the editor archetype) is not in scope.
C3 — ε > 0 is the right operationalisation of L1. The substitution-myth invariant is captured as residual attention at full delegation. If the actual structure is more like “delegation creates new tasks of comparable effort” rather than “delegation creates monitoring overhead”, a constant ε is wrong shape. What would flip it: evidence that delegation-induced work scales with task complexity, not as a flat constant.
C4 — Skill atrophy β is task-type-uniform. All tasks atrophy at the same rate per unit of u·(1-v). Some skills (e.g., motor-procedural) atrophy slower than others (verbal-fluency, calibration). What would flip it: longitudinal data on skill-specific atrophy rates under controlled AI-use exposure.
C5 — Tasks are independent in the portfolio. Σ_i V_i aggregates linearly. Cross-task productivity bundling (G8/Cowen) violates this — productivity gains on related tasks are correlated, not additive. What would flip it: empirical evidence that observed aggregate productivity (E4 / Humlum-Vestergaard zero) is driven by task-coupling effects, not just by individual mis-routing. This is the most likely-to-flip crux: the aggregate-zero puzzle is a smoking gun for it.
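The C2 crux can be made concrete with a small numerical sketch. The first function is the document's own ceiling formula; the second is a hypothetical variant with a separate catch rate c_V, which the model does not currently include. The specific values (c_AI = 0.6, c_H = 0.4, c_V = 0.7) are illustrative assumptions, not fitted constants.

```python
def verified_ceiling(c_ai: float, c_h: float) -> float:
    """c_star = c_AI + (1 - c_AI) * c_H: the AI is right, or the human catches the error."""
    return c_ai + (1.0 - c_ai) * c_h


def verified_ceiling_split(c_ai: float, c_v: float) -> float:
    """Hypothetical C2 extension: catch rate c_V is decoupled from generation skill c_H."""
    return c_ai + (1.0 - c_ai) * c_v


c_ai, c_h = 0.6, 0.4                                     # illustrative values only
baseline = verified_ceiling(c_ai, c_h)                   # current model: c_V equals c_H
easier_verify = verified_ceiling_split(c_ai, c_v=0.7)    # assumed c_V >= c_H
deskilled = verified_ceiling(c_ai, 0.0)                  # c_H -> 0 collapses to c_AI

print(round(baseline, 3), round(easier_verify, 3), round(deskilled, 3))
```

If verifier recognition really is easier than generation, the gap between the first two numbers is the quality the current formula leaves on the table for low-c_H workers.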
9. Scope limits — what the model does NOT capture
Honest disclosure of where the formalisation stops.
- Aggregate-zero puzzle (E4 / O2). Humlum-Vestergaard’s zero across 25,000 Danish workers is organisational, not individual — task reorganisation, managerial absorption, coordination costs. The model is individual-level (the A6 crux of the topology); it cannot rebut or explain E4 directly. This is a sibling-artifact problem (organisational-level model), not a parameter to set. The model is locally optimal at the individual level; it is silent about whether aggregate effects emerge.
- Cross-task bundling (G8). V is summed across tasks; in reality V_i and V_j covary when the tasks are productivity-linked. C5 names this; the model does not encode it.
- Calibration error on c_AI (O7). c_AI is treated as known. In practice users are systematically miscalibrated (E16: higher AI confidence → less critical thinking; E13: sycophancy escalation). The model’s c_AI should be the user’s belief about AI capability; if belief and reality diverge, the model is locally optimal under a wrong belief. The topology’s O7 (verification cost vs. verification calibration) is the natural extension.
- Sycophancy as quality-degrader (E13). The model assumes verification only helps. Randazzo HBS 26-021 documents AI flipping correct human judgments under pushback: verification can worsen outcomes when sycophancy escalates. Not encoded; would require a c_⋆ that depends on the human’s resistance to AI-pushback.
- Frontier migration (O4). c_AI is static within a session. Over months the frontier moves; the user must recalibrate. The model is a snapshot; extending it to a dynamic version would couple c_AI(t) to a learning model of the user’s frontier-mapping rate.
- Multi-tool attention interference (G10, Wickens MRT). The model is per-task; it does not capture the cost of running Cursor + ChatGPT + Slack simultaneously. The topology added G10 specifically to flag this; Stage 4 / Stage 5 should test whether a M_concurrent(N_tools) extension is needed.
- Verifier skill ≠ generation skill (Karpathy G9). The model uses c_H as both generation capability and verification effectiveness. Empirically, verification is often easier than generation: recognising an error costs less than producing a correct answer. A c_V ≥ c_H parameter would let low-c_H workers benefit from verification more than the current formula predicts. Held for Stage 4 / future passes; the C2 crux names this.
- Partial verification (“skim”) rounds to full or none. Bilinearity makes v* ∈ {0, 1}. The empirical reality of skimming (partial-depth verification at reduced cost and reduced effectiveness) does not map onto the model: the model rounds skim up to full-verify when verification is cheap and down to no-verify when it’s expensive. A future variant with convex verification cost (v²·φ instead of v·φ) or with a separate verification-depth-vs-effectiveness curve would produce interior v* solutions. Not added in this pass to preserve bilinearity’s analytical clarity.
These eight are not bugs of the formalisation. They are the boundary of what an individual-task bilinear generator-verifier loop can carry. Beyond it lie organisational design, dynamic learning, and team-level cognition, all of them sibling artifacts.
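The partial-verification limit above can be illustrated with a one-dimensional sketch at fixed u. It assumes the objective in v reduces to benefit·v minus a verification cost, where `benefit` is a hypothetical stand-in for the combined marginal gain from the quality, risk, and skill channels. With linear cost α·φ·v the optimum is an endpoint; with the convex cost α·φ·v² mentioned as a future variant, the first-order condition gives an interior v*. The values α = 1 and φ = 0.30 follow the document's α·φ = 0.30 anchor; the benefit values are made up for illustration.

```python
def best_v_linear(benefit: float, alpha: float, phi: float) -> float:
    """Linear cost alpha*phi*v: the objective is linear in v, so the optimum is an endpoint."""
    return 1.0 if benefit > alpha * phi else 0.0


def best_v_quadratic(benefit: float, alpha: float, phi: float) -> float:
    """Quadratic cost alpha*phi*v**2: solve benefit = 2*alpha*phi*v, then clip to [0, 1]."""
    v = benefit / (2.0 * alpha * phi)
    return min(1.0, max(0.0, v))


alpha, phi = 1.0, 0.30
for benefit in (0.10, 0.25, 0.40, 0.70):   # hypothetical marginal benefits of verification
    v_lin = best_v_linear(benefit, alpha, phi)
    v_quad = best_v_quadratic(benefit, alpha, phi)
    print(f"benefit={benefit:.2f}  linear v*={v_lin:.0f}  quadratic v*={v_quad:.3f}")
```

Under the linear cost the output only ever shows 0 or 1, which is the rounding behaviour described above; the quadratic variant produces skim-like intermediate depths until the benefit is large enough to saturate at full verification.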
10. Adversarial + steelman
Three objections the formalisation has not yet engaged head-on, and the strongest version that survives each.
Objection 1: c_AI is unobservable, so the model is unactionable on novel tasks.
Steelman. The optimal policy depends on c_AI, but in any new task or unfamiliar domain the worker doesn’t know c_AI ahead of time. Calibrating c_AI requires running the task and verifying the output — but the model says “compute argmax V using c_AI you don’t have.” For experienced task types c_AI is calibratable from history; for genuinely novel tasks (much of knowledge work) it isn’t. The Vasconcelos / Fok & Weld verification-economics frame already captures this — engagement is rational when verification is cheap; verifying a single AI output IS your way of measuring c_AI for that task type. The model assumes the calibration question is solved when it’s actually the binding constraint.
Why partially right. For genuinely novel tasks where c_AI is unknown, the model cannot make a precise recommendation. Stage-4 fitting target Q3 (mode-distribution match against Randazzo BCG) implicitly assumes calibrated c_AI distributions across BCG-like tasks — only valid for well-studied domains.
Why the strongest version survives. The model doesn’t need precise c_AI; it needs robustness across c_AI ranges, and the corner structure is robust. For almost any c_AI < 0.4 in a high-stakes regime (σ ≥ 0.7), the corner is do-yourself; for almost any c_AI > 0.8 in low-stakes routine (σ ≤ 0.2, low λ), the corner is self-automator. The “interesting” boundary regions — where c_AI uncertainty matters — are precisely the spec-driven regions. So the model’s prescription under uncertainty is: set v = 1 to verify and learn. The spec-driven corner doubles as a Bayesian-update mechanism; the verification cost α·v·φ is the explicit price of resolving the uncertainty. Stage 4 should formalise the explore-vs-exploit dynamics this implies (Q6 in §11).
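One way to read "verify and learn" is as a Beta-Bernoulli update on c_AI for a task type. The sketch below is an assumption about how that could be formalised, not something the model specifies: the class name `CAiBelief`, the uniform Beta(1, 1) prior, and the sequence of verification outcomes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CAiBelief:
    """Beta(a, b) belief over c_AI for one task type (an assumed representation)."""
    a: float = 1.0  # prior pseudo-count of correct AI outputs (uniform prior)
    b: float = 1.0  # prior pseudo-count of incorrect AI outputs

    def update(self, ai_was_correct: bool) -> None:
        """Each fully verified output (v = 1) yields one Bernoulli observation of c_AI."""
        if ai_was_correct:
            self.a += 1.0
        else:
            self.b += 1.0

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def variance(self) -> float:
        n = self.a + self.b
        return (self.a * self.b) / (n * n * (n + 1.0))


belief = CAiBelief()
prior_variance = belief.variance
for outcome in [True, True, False, True, True, True, False, True]:  # hypothetical outcomes
    belief.update(outcome)

print(round(belief.mean, 3), round(belief.variance, 4), round(prior_variance, 4))
```

The shrinking variance is what the verification cost α·v·φ buys in this reading. Once the posterior is tight enough to place the task clearly inside a do-yourself or self-automator region, continued verification is no longer justified by learning alone.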
Objection 2: The model recovers practitioner intuitions and adds no new predictions.
Steelman. Mollick, Karpathy, Anthropic, and Cognition collectively say “match the workflow to the task.” The model’s predictions of corner solutions match Randazzo’s empirical mode distribution. So what is the formalism contributing beyond a formal scaffold for intuitions practitioners already had?
Why partially right. Many model predictions match practitioner consensus on headline conclusions. Self-automator for routine, do-yourself for novel-and-high-stakes, spec-driven for high-stakes-with-cheap-verify — practitioners already say these.
Why the strongest version survives. The model contributes five things the practitioner literature does not:
- Quantitative trade-offs. The model says how much worse self-automator is than spec-driven at default constants: at the Bastani anchor specifically, 0.095 net (α·φ − K_uv = 0.30 − 0.205). Practitioner literature is qualitative; this is parametrically calibratable, and Stage 4 will pin the constants.
- Non-obvious simultaneous constraints. Higher stakes (σ ↑) push BOTH u DOWN (refuse AI for high-stakes work) AND v UP (if you do use AI, verify carefully), a simultaneous prescription the practitioner literature does not make explicit. §4’s comparative statics surface this; intuition often conflates the two.
- Naive cyborg as structural failure mode. Bilinearity says applying interior (u, v) uniformly is what no individual sub-task should do. This is a sharper criticism than “be thoughtful about which mode you use”: it identifies a specific failure mode (the BCG cyborg majority running flat-(0.7, 0.3) policies) and says they’re structurally wrong, not just suboptimal.
- Budget-aware shadow-price reformulation. The μ mechanism says: under attention scarcity, reroute longer-base-time tasks first (since α_eff_i = α + μ·g_i rises linearly with g_i). Practitioner literature has nothing like this prescriptive structure for portfolio-level decisions.
- Stage-4 calibration targets. The model identifies six specific empirical questions (Q1–Q6 in §11) that fit a generator-verifier dispatch framework. Practitioner heuristics are unfalsifiable by design: they update without preserving the reasoning, so users can’t tell when a heuristic stops applying. The L3 invariant is the antidote.
The model recovers practitioner intuitions on top-line conclusions and adds quantitative, edge-case, and falsifiable structure beyond. The contribution is in calibration, simultaneous constraints, structural failure modes, portfolio-level shadow pricing, and Stage-4 testability — not in inventing new top-level recommendations.
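The shadow-price point above can be sketched numerically. The code uses the document's α_eff_i = α + μ·g_i, but everything else is an assumption made for illustration: the per-corner attention fractions (1 for do-yourself, ε for self-automator, ε + φ for spec-driven), the value ε = 0.10, and the `GROSS` values standing in for the non-attention channels. None of these are the model's fitted constants.

```python
ALPHA, EPS, PHI = 1.0, 0.10, 0.30   # EPS is a hypothetical value; ALPHA*PHI = 0.30 as in the text

# Assumed attention charged at each corner (do-yourself, self-automator, spec-driven).
ATTENTION = {"do-yourself": 1.0, "self-automator": EPS, "spec-driven": EPS + PHI}

# Hypothetical gross value of each corner before attention is charged (Q + lambda*S - sigma*R).
GROSS = {"do-yourself": 1.2, "self-automator": 0.5, "spec-driven": 0.9}


def best_corner(g_i: float, mu: float) -> str:
    """Pick the corner maximising gross value minus alpha_eff_i * attention."""
    alpha_eff = ALPHA + mu * g_i    # effective attention price for a task of base time g_i
    return max(ATTENTION, key=lambda c: GROSS[c] - alpha_eff * ATTENTION[c])


base_times = (0.5, 1.0, 4.0)        # hypothetical task base times
for mu in (0.0, 0.1, 0.4, 0.8):     # rising shadow price as the attention budget tightens
    print(mu, {g: best_corner(g, mu) for g in base_times})
```

With identical gross values across tasks, the only thing separating them is g_i, so the longest task leaves the spec-driven corner at the smallest μ and the shortest task leaves last.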
Objection 3: The model is single-shot; the topic question is dynamic.
Steelman. “Optimal configuration for an individual knowledge worker” is implicitly a static answer to a fundamentally dynamic question — AI capability shifts month-over-month, the worker’s skill atrophies under sustained delegation, calibration on c_AI drifts as new model versions ship. A static dispatcher with parametric flexibility doesn’t tell the worker how to anticipate and prepare for capability change.
Why the strongest version survives. (a) Parasuraman, Sheridan & Wickens’ (2000) function-allocation framework has been useful for 25+ years despite being static: parametric statics is what is wanted from a generating function (the L3 invariant). (b) The trajectory questions properly belong to sibling topics in the LLM Iterate roster: navigating-ai-world for the AI-induced trajectory of skill/meaning/relational channels, the planned prediction-calibration topic for c_AI calibration drift, and the planned bedrock-generating-functions topic for the temporal-aggregation patterns this and other models share. Static-but-parameterised is the right scope here; dynamic extensions are cross-topic by design, not gaps in the present formalisation.
11. Stage-4 fitting targets
Six named questions Stage 4 should test against data.
Q1 — (α, ε) from CUPS telemetry. Mozannar CUPS gives time fractions across coding interaction states. Calibrate ε from observed monitoring-time-at-high-u; calibrate α from observed cyborg-vs-centaur time efficiency.
Q2 — β from Bastani longitudinal regime. Bastani’s 17% drop is one window. Longer-window data (Lee-Sarkar CHI 2025; Anti-Social Century) should pin per-task atrophy rate. The model predicts: β should be ~10× larger for skills the worker uses heavily than for tasks they delegate occasionally.
Q3 — Mode-distribution match against Randazzo BCG. Given a θ-distribution prior over BCG-consultant tasks, does the optimal-routing distribution over (u*, v*) match the observed (~60% cyborg / ~30% centaur / ~10% self-automator)? If not, which constant is mis-fit?
Q4 — Outside-frontier harm prediction (Dell’Acqua replication). For subjects forced to use AI on c_H > c_AI tasks, the model predicts a quality drop proportional to u·(c_H − c_AI). Stage 4 should test the linearity and the slope.
Q5 — Workflow-architecture-vs-capability bound. The headline S1 prediction. Construct two simulated workforces: (a) frontier AI + naive flat-cyborg routing, (b) mid-tier AI + optimal routing on the same task mix. The model predicts (b) outperforms (a) on net quality once c_AI_naive falls below some threshold. Stage 4 should locate the threshold and test against Everett 2025 / Dell’Acqua at that precision.
Q6 — Calibration uncertainty and explore-vs-exploit on c_AI. The model treats c_AI as a known input. In practice, workers learn c_AI by running the task and verifying the output (Vasconcelos verification-economics). For novel tasks, the spec-driven corner doubles as a calibration mechanism — verification reveals c_AI for next time. Stage 4 should formalise the explore-vs-exploit structure: what is the optimal exploration premium when c_AI variance is high? When does verifying-to-learn dominate verifying-to-quality-control? The §9 scope-limit on c_AI miscalibration becomes a structural extension via this question, not just an acknowledged gap. Engages the §10 Objection 1 directly.
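As a rough sketch of the Q4 test, the predicted drop u·(c_H − c_AI) can be regressed against observed drops with a line through the origin. The rows below are hypothetical placeholders, not data from Dell’Acqua or any other study, and the function names are illustrative. Q4 itself only claims proportionality; a slope near 1 would additionally support the assumption that unverified quality interpolates linearly between c_H and c_AI.

```python
def predicted_drop(u: float, c_h: float, c_ai: float) -> float:
    """Q4 regressor: quality lost by delegating fraction u on a task where c_H > c_AI."""
    return u * (c_h - c_ai)


def slope_through_origin(xs: list[float], ys: list[float]) -> float:
    """Least-squares slope for y = k * x with no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical placeholder rows: (u, c_H, c_AI, observed quality drop).
rows = [
    (0.50, 0.8, 0.6, 0.11),
    (1.00, 0.8, 0.6, 0.19),
    (1.00, 0.9, 0.5, 0.41),
    (0.25, 0.9, 0.5, 0.09),
]

xs = [predicted_drop(u, c_h, c_ai) for u, c_h, c_ai, _ in rows]
ys = [observed for *_, observed in rows]
slope = slope_through_origin(xs, ys)
residuals = [y - slope * x for x, y in zip(xs, ys)]

print(round(slope, 3), [round(r, 3) for r in residuals])
```

Systematic curvature in the residuals (for example, drops that grow faster than linearly in u) would count against the linearity half of the Q4 prediction even if the fitted slope looked reasonable.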
Data starting points for each Q
The natural starting point for each fitting target is the lit-review paper(s) that motivated the corresponding parameter. Honest notes on likely data availability:
- Q1 (α, ε): Mozannar CUPS 2024 (E12) telemetry across coding interaction states. Process-level data is likely Microsoft Research-internal; replication via Cursor / Claude Code anonymized usage logs is the alternative path.
- Q2 (β): Bastani PNAS 2025 (E9) is the primary anchor; PNAS supplementary materials likely include the longitudinal panel needed to estimate per-task atrophy. Lee-Sarkar CHI 2025 (E16) provides multi-task panel context for a complementary fit.
- Q3 (mode distribution): Randazzo HBS WP 26-036 (E10) for the 60/30/10 BCG distribution. HBS supplementary materials may include individual-level mode tags; failing that, an in-house replication on a smaller knowledge-worker sample is tractable.
- Q4 (outside-frontier slope): Dell’Acqua BCG study (E3); HBS data release may include individual-level outcome grades plus inside/outside frontier classification per task. The linearity test of u·(c_H − c_AI) is a clean within-subjects design.
- Q5 (workflow > capability): Everett 2025 (E8) demonstrates the workflow-restoration mechanism but does not directly test the bound. The cleanest test is a new RCT comparing routing strategies on the same c_AI (e.g., GPT-4 + naive flat-cyborg vs. GPT-3.5 + optimal routing on a matched task mix); existing data is suggestive, not decisive.
- Q6 (explore-exploit on c_AI): Likely requires new experimental work or simulation. Closest analogs are the contextual-bandit and multi-armed-bandit-with-costly-verification literatures, but the knowledge-workflow application is novel; this is a Stage-4 originated study, not a re-analysis of existing data.
Stage 4’s first move is to scope data availability for Q1-Q5; Q6 may require originating new data or a simulation harness. Following the human-psych-variation pattern, the Stage-4 build should live at stage_outputs/technology-utilization-architecture/data/ with a curated CSV per fitting target, a runnable Python pipeline that reproduces every chart on the published data.mdx, and a data/out/ folder for derived outputs.
12. Connections to other topics
Where this model attaches to sibling AI’s-Research topics.
- Human-psych-variation. λ (skill-formation value) and β (atrophy rate) are individual differences. Need-for-Cognition (Buçinca 2021, E6) moderates v: high-NfC users verify more. Cognitive-style covariation belongs in that topic, not here.
- Navigating-AI-world. This model’s β is the per-task version of nav-AI’s ΔM_comp (competence erosion). The portfolio-level S aggregation is the within-work-domain version of nav-AI’s ΔV/ΔM trade-off. The two models share the substitution-myth (L1) and verification-economics (G3) invariants; they differ in what they’re optimising: nav-AI optimises a life-scale meaning budget, this model optimises a workday quality-per-attention budget.
- Trust architecture (planned). Sycophancy (E13) and human-side calibration on AI capability (O7) are trust-regime questions: what feedback signals make c_AI knowable.
- Prediction & calibration (planned). The topology’s O7 connection. Calibration on c_AI is the calibration sub-problem this model treats as exogenous.
- Information fidelity (planned). φ (verification cost) depends on output-format quality and grounding: verifying a structured output with citations is cheap; verifying free prose is expensive. The information-fidelity topic should formalise what makes φ low or high.
- Bedrock generating functions (planned). The four-channel decomposition V = Q − α·A + λ·S − σ·R is a candidate generating-function pattern: every decision under attention scarcity has the same four channels. The bedrock topic should test whether this generalises beyond AI workflow.
Glossary
- autonomy level (u): fraction of a task’s generation delegated to AI. Karpathy slider P2 made parametric.
- verification depth (v): fraction of AI output independently checked by the human.
- c_H, c_AI: human and AI capabilities on a task type; probability of correct/high-quality output.
- c_⋆: verified-output ceiling; c_AI + (1 − c_AI)·c_H (either the AI was right or the human catches the error), under the assumption the human can verify.
- φ: verification-cost ratio: time-to-verify divided by time-to-generate-from-scratch.
- σ: stakes; weight on uncaught-error penalty.
- λ: skill-formation value of the task to the worker.
- α: attention price (utility weight on time). Normalised to 1.
- ε: residual attention at full delegation. The L1 substitution-myth invariant.
- β: per-task skill-atrophy rate under unverified delegation.
- M: per-task metacognitive routing tax. The G1 metacognitive-bottleneck invariant.
- centaur (Mollick): clean human/AI handoff with verification gate. (u, v) mid + high.
- cyborg (Mollick): interleaved sub-task delegation with partial verification. (u, v) mid + mid/low.
- self-automator (Randazzo): full delegation, no verification. (u, v) high + low. The atrophy-trap regime.
- spec-driven / independent-then-synthesize (Everett 2025; Compound Engineering): full delegation with full verification. (u, v) high + high.
- do-yourself: no AI involvement. u ≈ 0.
- L1 (substitution myth): every offload creates new monitoring/verification work. Encoded as ε > 0.
- L2 (joint surface): optimal allocation requires modelling the joint performance, not comparing solo capabilities. Encoded as the portfolio-level argmax.
- L3 (parameterise by capability): the formalism is a generating function over θ, not a lookup table. The whole structure of the model.
- G3 (verification trade-off): engagement is rational only when verification is cheap relative to expected payoff. The −α·v·φ term.
- G7 (skill atrophy): capacities not exercised decay. The −β·u·(1-v) term.
- G9 (generator-verifier asymmetry): production cost falls toward zero with AI; verification cost stays roughly constant. The asymmetry between u·ε (generation residual) and v·φ (verification full).