Model pass 8

Model

Generating function for human psychological variation. One equation per person; variance decomposition follows. Closed-form pieces: Crow–Felsenstein AM inflation, Wilson-Effect saturation, genetic-nurture additive split, multivariate sex-difference Mahalanobis D. Twin / SNP / within-family heritability are projections of the same decomposition. Interactive dashboard included.

TLDR

The topology answered “what depends on what?”. The formalization answers a sharper question: given a person, where does their phenotype come from in expectation? The answer is a single generating function that, once written down, dissolves several apparent paradoxes in the field — most importantly the gap between twin heritability, SNP heritability, and within-family heritability (they estimate different sums of the same underlying components, and the differences are informative).

The spine of this stage is one equation. Phenotype P for a person in a population is P = A_d + A_i + A_LD + C + E_m + E_s + I, with each term a contribution from a distinct mechanism: direct genetic effects from the person’s own transmitted alleles, indirect genetic effects from parental (and broader-family) genomes operating through the environment they create, assortative-mating-induced linkage among causal variants, residual shared environment, measured non-shared environment, stochastic developmental noise, and gene-environment interaction terms. Variance decomposition follows directly, and is block-orthogonal rather than fully orthogonal: V(P) = ΣV(component) + 2·Cov(A_d, A_i) + 2·Cov(A_d, E_m) + 2·Cov(A_d, C) + V(I). The cross-terms are the formal home of every gene-environment correlation finding in the literature; pretending they are zero is the most common modeling error. Three closed-form pieces drop out — the Crow–Felsenstein assortative-mating partition V(A_LD) = h²_obs · r_δ with r_δ = m·h²_obs (the dashboard partitions h² rather than inflating it; the equilibrium is reached in 5–10 generations of stable assortment), the Wilson-Effect logistic curve h²(t) = h²_∞ / (1 + exp(−k·(t − t_50))), and the method gradient that says twin h² ≥ SNP h² ≥ within-family h² with the gaps decomposable into AM-LD, indirect-genetic, and rare-variant pieces.

A second module handles the multivariate sex-difference algebra, because the single largest framing trap in this field is the gap between univariate Cohen’s d (typically 0.2–0.6 across personality dimensions) and the multivariate Mahalanobis distance D, defined by D² = Δμᵀ·Σ⁻¹·Δμ (D can reach 2.7 when 15 weakly correlated traits are stacked, as in Del Giudice 2012). The same data, two numbers, opposite-sounding stories, both correct. The formalization makes the bridge explicit so the reader can dial univariate d’s and inter-trait correlations and watch D move.

What this stage does not formalize: the Plomin/Turkheimer interpretation of polygenic scores (verbal disagreement, no candidate equation), the mechanism behind the Gender Equality Paradox (three live hypotheses with no shared formalism), and the magnitude of AM-correction across the full cross-disorder genetic-correlation matrix (active research, methods just emerging). These remain at the observation stage; premature math here would mask uncertainty rather than reduce it. The L4 Lewontin firewall is preserved as a structural property of the model: the entire generating function is within-population, and nothing in it licenses between-population mean inference.

Inputs

Trait class

Age (years): 25
m (spousal phenotypic correlation): 0.40
β_i/β_d (indirect / direct genetic ratio): 0.40
rare (rare-variant share of direct): 0.10

Anchors

Variance decomposition

Component	Share of V(P)
V(A_d) common	48.7%
V(A_d) rare	5.4%
V(A_LD) AM-induced	25.2%
V(A_i) genetic nurture	8.7%
V(C) shared env	6.1%
V(E) non-shared	5.9%

Method gradient

Estimator	Value	Components counted
Twin h² (ACE)	0.79	A_d + A_LD (A_i lands in C)
SNP h² (LDSC)	0.77	A_d,common + A_LD + ½·A_i
Within-family	0.54	A_d only

Assortative-mating partition

Quantity	Value	Definition
r_δ	0.317	m · h²
V(A_d) / h²	0.68	direct share of additive

Wilson h²(t) is the AM-equilibrium population heritability V(A_AM)/V(P). The Crow–Felsenstein partition splits V(A_AM) into V(A_d) (clean direct, what within-family designs estimate) and V(A_LD) (population-level linkage among trait-relevant alleles induced by non-random mating). Note that classical twin h² (Falconer) is biased downward relative to V(A_AM)/V(P) by a factor of (1 − m_A), because AM raises the DZ correlation relative to the MZ correlation, but is typically inflated upward by EEA violations and genetic-nurture leakage, with the net effect for socially-structured traits being upward overall. V(A_i) is added on top as the variance contribution of genetic nurture; the gap between empirical twin h² and within-family h² for socially-structured traits is dominated by genetic nurture and EEA, not AM (see the model stage §2.2 caveat).

How to read this stage

The dashboard above is the artifact. Everything below is the spec.

In plain language: when researchers report a “heritability of 0.50,” what is being claimed is that if you took the variance in a trait across a population and asked how much of it tracks genetic differences, half of it does. It is a statement about the population’s variance, not about any single person, and not about between-population differences. It says nothing causal beyond that — high heritability is fully compatible with large environmental effects (height is ~80% heritable and has risen ~10cm in a century). Different methods estimating “heritability” answer slightly different questions: twin studies pick up the broadest definition, within-family designs the narrowest. The gap between them is informative.

The stage formalizes that picture by writing one equation per person, decomposing it into named pieces, and showing how each measurement method projects onto a different subset of the pieces. Three closed-form sub-equations follow (assortative-mating inflation, Wilson-Effect age curve, genetic-nurture split). A second module addresses the same algebra applied to group differences — most prominently sex differences, but the framework is general. The dashboard lets you turn the knobs and watch the consequences.

You can read this top-down (TLDR → equation → closed forms → boundary conditions) or bottom-up (play with the dashboard, then come back to the equations when something surprises you). Either order works. The cruxes section at the end (§12) is where the load-bearing assumptions live; if any one fails, parts of the picture have to be rebuilt.

1. Move I’m making

This stage is a decomposition + generating function + integration, in that order:

Decomposition — orthogonalize phenotypic variance into mechanism-specific components, with explicit non-orthogonal Cov(G,E) and interaction terms as the principled exceptions.
Generating function — write the per-person phenotype as a deterministic function of those components plus stochastic noise. The variance decomposition follows by taking V(·) of the generating function.
Integration — show that twin, SNP, and within-family heritability estimators are projections of the same underlying decomposition onto different observable subspaces. The Wilson Effect, AM inflation, and genetic-nurture findings then read as motion of those projections, not as separate phenomena.

What’s not ready: anything in the topology marked O (open), and the polygenic-score causal-vs-summary debate, where the underlying disagreement isn’t yet a formal one.

2. The generating function

For a single person i in a population at developmental time t, sampled from a stable mating regime:

P_i(t) = A_d,i + A_i,i + A_LD,i + C_i + E_m,i + E_s,i + I_i  +  μ(t)

Term	Mechanism	Source identity
`A_d`	Direct genetic — additive effect of person’s own transmitted causal alleles, evaluated as if mating were random	`Σ_k β_k · g_{ik}` over causal SNPs k
`A_i`	Indirect genetic (genetic nurture) — additive effect of parents’ (and extended-family) genotypes operating through the rearing environment	parents’ PGS × environmental transmission coefficient
`A_LD`	Assortative-mating LD inflation — additional additive variance induced by linkage among causal variants from non-random mating	At AM equilibrium, `V(A_d) + V(A_LD) = h²_obs`; the partition is `V(A_LD)/h²_obs = r_δ`.
`C`	Shared environment residual — environmental effects shared by siblings not already captured by `A_i`.	Adult personality: ~0. Education / religiosity / politics: nonzero
`E_m`	Measured non-shared environment — identifiable causes (lead, schooling, head injury, peer composition, nutrition)	each enters with a measured causal coefficient, e.g. lead: β ≈ −6.2 IQ pts per 1–10 µg/dL
`E_s`	Stochastic developmental noise — unmeasured non-shared variance: developmental contingencies, immune/microbial, microscale neural variation, measurement error	the unmodeled residual; ~50% of personality variance
`I`	Interaction terms — `G×E`, `G×G` (epistasis), `G×age`. As of 2025 evidence, generally small at PGS-by-environment scale; large only at extreme environmental insults	residual non-additivity
`μ(t)`	Population mean at age t — not a person-level term but the developmental trajectory the person grows through	calibrated to age-norm tables

Why this form: this is the additive-decomposition default of quantitative genetics extended with the two corrections that the 2018–2025 literature has installed into the field: separating A_d from A_i (Kong 2018, Young 2022) and separating A_d from A_LD (Border 2022, Yengo 2018, Wainschtein 2025). Earlier formulations folded both A_i and A_LD into A_d and so overstated how much of the population-level genetic signal is direct biological causation. The within-family literature is what made these terms separately estimable.

Scope note — scalar trait, not g-loaded vector: P_i(t) is written as a scalar for one trait at a time. For cognitive ability, this collapses an underlying multi-ability structure (g + specific abilities, the CHC hierarchy) into a single phenotypic measure. The collapse is faithful when reporting g-loaded composite scores (e.g., full-scale IQ), and reasonable for any single primary ability. It is not faithful when the question is “how much of A_d for cognition is g versus specific abilities” — that requires a multivariate extension where each ability gets its own decomposition and g enters as a latent common factor across them. The topology’s foundational assumption A3 (g exists as a real dimension) lives at this level: the model below operates inside a single ability/composite and inherits g as a property of which ability is being measured rather than as a structural component. For sex differences (Module B, §3.4), the multivariate extension is necessary by construction; that’s why it appears as a separate module.

2.1 Variance decomposition

Taking variance of the generating function and tracking the cross-terms:

V(P) = V(A_d) + V(A_i) + V(A_LD)
     + V(C) + V(E_m) + V(E_s)
     + 2·Cov(A_d, A_i)        ← genetic nurture is correlated with direct effects (parents pass both)
     + 2·Cov(A_d, E_m)        ← active rGE: people select environments matching propensities
     + 2·Cov(A_d, C)           ← passive rGE residual (small once A_i is split out)
     + V(I)

The off-diagonal Cov terms are why “orthogonal decomposition” is the wrong frame for this system. The system is block-orthogonal: the additive components are roughly orthogonal to the residual environment but not to each other, and the cross-terms are the formal home of every gene-environment correlation finding in the literature. Pretending they’re zero is the single most common modeling error.
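As a minimal sketch of the bookkeeping above (not the dashboard's code), the following Python builds a toy population in which A_i is constructed to correlate with A_d and checks that V(P) equals the summed variances plus the 2·Cov cross-terms. The helper names `cov` and `variance_budget`, the sample size, and every coefficient are hypothetical choices made only for illustration.

```python
import random


def cov(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def variance_budget(components):
    """Return V(P), the summed component variances, and the 2*Cov cross-terms.

    `components` maps a term name to its per-person values; P is their sum.
    """
    names = list(components)
    n = len(components[names[0]])
    phenotype = [sum(components[k][i] for k in names) for i in range(n)]
    diag = sum(cov(components[k], components[k]) for k in names)
    cross = {
        (a, b): 2 * cov(components[a], components[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    return cov(phenotype, phenotype), diag, cross


# Toy population (all numbers hypothetical): A_i is built to correlate with A_d,
# while E_s is drawn independently of both.
rng = random.Random(0)
n = 5000
A_d = [rng.gauss(0, 1.0) for _ in range(n)]
A_i = [0.3 * a + rng.gauss(0, 0.4) for a in A_d]
E_s = [rng.gauss(0, 1.0) for _ in range(n)]

total, diag, cross = variance_budget({"A_d": A_d, "A_i": A_i, "E_s": E_s})
print(f"V(P) = {total:.3f}, sum of variances = {diag:.3f}")
for pair, value in cross.items():
    print(pair, f"{value:+.3f}")
```

In this toy setup the A_d/A_i cross-term is clearly nonzero while the A_d/E_s cross-term sits near zero, which is the block-orthogonal pattern described above.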

2.2 Heritability identities

Three quantities are estimable from data; each picks up a different subset of the variance terms. The mapping is more subtle than a casual reading of the literature suggests, and it is worth getting right because the public-discourse confusion about “twin studies overestimate” turns on this exact algebra.

The non-obvious point about twin h²: V(A_i) (genetic nurture) is shared identically by MZ and DZ co-twins, because they share the same parents. Under a correctly specified ACE model, this variance lands in C, not A. So a faithful classical twin model does not count genetic nurture as heritability. The empirical observation that twin h² > within-family h² (e.g., for EA: 0.40 vs ~0.15) is therefore not due to twin h² capturing A_i directly. It is mostly due to model-misspecification leakages: the equal-environments assumption fails when MZ co-twins are treated more similarly than DZ co-twins, and the assumption that genetic nurture’s contribution is fully shared between siblings can fail if parents differentially treat MZ vs DZ pairs. The ACE assumption rDZ_A = 0.5 also fails under assortative mating (true sibling additive correlation under AM is 0.5·(1+r_δ)), but that failure pushes the Falconer estimate down rather than up, so it partially offsets the gap instead of explaining it.

Estimator	What it estimates (correctly specified)	Practical leakages
Twin h² (classical ACE: `2·(rMZ − rDZ)`)	`V(A_d) + V(A_LD)`	Under unmodeled AM, some `V(A_i)` and `V(C)` bleed into A. Empirically, classical twin h² for EA exceeds within-family by ~0.20–0.25.
SNP h² (GREML, LDSC on population GWAS)	`V(A_d, common) + V(A_LD, common) + V(A_i, common)·attenuated`	Population GWAS effect sizes `β_pop = β_d + k·β_i` (where `k` is the AM coupling between transmitted and non-transmitted alleles), so SNP h² is inflated by some `V(A_i)`, but attenuated relative to the full `V(A_i)` because `k < 1`. Excludes rare variants.
WGS h² (Wainschtein 2025)	SNP h² + `V(A_d, rare) + V(A_LD, rare)`	Closes the rare-variant gap; same `A_i` contamination as SNP h² unless within-family.
Within-family h² (sib-FE, MZ-discordant, parent-offspring trios)	`V(A_d)`	Removes `A_i` and `A_LD` cleanly; leaves direct additive only. With WGS: `V(A_d) + V(A_d, rare)`.

This is the method gradient (S2 in the topology):

twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h²

The gaps are not measurement error. They are the data’s way of telling you how much of “heritability” is structural (AM-LD), how much is environmental-via-parents (A_i), and how much depends on rare variants common-variant arrays cannot tag.

For educational attainment in 2025: classical twin h² ≈ 0.40, common-variant SNP h² ≈ 0.20–0.25, WGS h² ≈ 0.30 (with rare-variant contribution), within-family additive ≈ 0.15.

Important caveat on Falconer’s AM bias (added in pass 6 after a reviewer correction). The “What it estimates” column above is exact only under random mating. Under positive AM, Falconer’s formula 2·(rMZ − rDZ) is biased downward by factor (1 − m_A) where m_A ≈ m·h² — fraternal twins share more than 50% of trait-relevant alleles because their parents are genetically more similar than chance, raising rDZ relative to rMZ and shrinking the formula’s output. So Falconer estimates [V(A_d) + V(A_LD)] · (1 − m_A) / V(P), not V(A_AM)/V(P) directly.
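The downward bias can be checked numerically. The sketch below assumes the twin correlations take the simple form rMZ = h² + c² and rDZ = 0.5·(1 + m_A)·h² + c², with no EEA violation and no genetic-nurture leakage; the function name `falconer_under_am` and the example values are illustrative assumptions, not the dashboard's implementation.

```python
def falconer_under_am(h2_am, c2, m):
    """Classical Falconer estimate 2*(rMZ - rDZ) when mating is assortative.

    h2_am : AM-equilibrium additive share V(A_AM)/V(P)
    c2    : shared-environment share (identical for MZ and DZ pairs)
    m     : spousal phenotypic correlation
    """
    m_a = m * h2_am                       # m_A ~ m * h^2, as in the text
    r_mz = h2_am + c2                     # MZ twins share all additive variance
    r_dz = 0.5 * (1 + m_a) * h2_am + c2   # DZ additive sharing exceeds 0.5 under AM
    return 2 * (r_mz - r_dz)


estimate = falconer_under_am(h2_am=0.40, c2=0.10, m=0.40)
print(f"Falconer estimate: {estimate:.3f}")             # h2_am * (1 - m_A)
print(f"Bias factor: {estimate / 0.40:.2f}")            # (1 - m_A)
```

The shared-environment term cancels in the difference, so the only AM effect on the Falconer output under these assumptions is the (1 − m_A) shrinkage.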

This matters for interpreting the gap. When classical twin h² > within-family h² for socially-structured traits (EA: 0.40 vs 0.15), the gap is dominated by other classical-ACE biases — primarily the equal-environments assumption (MZ co-twins are treated more similarly than DZ co-twins, inflating MZ correlation) and genetic-nurture leakage (V(A_i) leaks into A under model misspecification rather than landing cleanly in C) — partially offset by the AM downward bias. AM is a real phenomenon at the population level (Crow-Felsenstein V(A_LD) inflation; see §3.1 below) but it does not, on net, drive the twin-vs-within-family gap. The dominant inflation source is genetic nurture and EEA, with direct empirical anchors in Kong 2018 (non-transmitted PGS effect = 29.9% of transmitted for EA) and Okbay 2022 EA4 (within-family direct ~50% of population PGI). Within-family designs control for AM, EEA, and genetic nurture simultaneously.

This is the single calculation a careful reader of “twin studies vs molecular studies” headlines should be able to do. The numbers don’t disagree; they answer different questions.

3. Closed-form pieces

Three components admit clean equations. The rest are calibrated empirically.

3.1 Assortative-mating inflation (Crow–Felsenstein)

There are two ways to use the AM-inflation formula, and they answer different questions.

Forward problem (rarely the relevant one): given the random-mating heritability h²_rm of a trait, what is the equilibrium heritability after stable AM? The answer is a fixed-point coupling r_δ = m · h²*, V_A* = V_A / (1 − r_δ), h²* = V_A* / (V_A* + V_E), reached in ~5–10 generations of stable assortment (Crow & Felsenstein 1968). One-iteration approximation: r_δ ≈ m · h²_rm, inflation ≈ 1 / (1 − m·h²_rm).
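A minimal sketch of the forward fixed point, assuming V(P) is scaled to 1 under random mating so that V_A = h²_rm and V_E = 1 − h²_rm. The function name `am_equilibrium` and the example inputs are hypothetical, and the loop iterations are numerical steps toward the fixed point, not a generation-by-generation simulation.

```python
def am_equilibrium(h2_rm, m, tol=1e-12, max_iter=1000):
    """Solve r_delta = m*h2*, V_A* = V_A/(1 - r_delta), h2* = V_A*/(V_A* + V_E)."""
    v_a, v_e = h2_rm, 1.0 - h2_rm     # random-mating variances, V(P) scaled to 1
    h2_star = h2_rm                   # start from the random-mating value
    for _ in range(max_iter):
        r_delta = m * h2_star
        v_a_star = v_a / (1.0 - r_delta)
        updated = v_a_star / (v_a_star + v_e)
        if abs(updated - h2_star) < tol:
            return updated, m * updated
        h2_star = updated
    raise RuntimeError("fixed point did not converge")


h2_star, r_delta = am_equilibrium(h2_rm=0.50, m=0.40)
one_step_inflation = 1.0 / (1.0 - 0.40 * 0.50)   # one-iteration approximation
print(f"h2* = {h2_star:.4f}, r_delta = {r_delta:.4f}")
print(f"equilibrium inflation = {1.0 / (1.0 - r_delta):.3f}, one-step = {one_step_inflation:.3f}")
```

For these inputs the one-iteration approximation slightly understates the equilibrium inflation, because r_δ is evaluated at h²_rm rather than at the larger h²*.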

Inverse / partition problem (what the dashboard does): given the AM-equilibrium population additive variance V(A_AM)/V(P) = h²_obs, partition it into the random-mating-equivalent direct component V(A_d) and the AM-induced LD inflation V(A_LD):

r_δ      ≈ m · h²_obs
V(A_d)   = h²_obs / (1 + r_δ + r_δ² + …)  =  h²_obs · (1 − r_δ)
V(A_LD)  = h²_obs − V(A_d)  =  h²_obs · r_δ

The Wilson curve gives h²_obs(t) directly, so the partition uses r_δ = m · h²_obs(t) with no iteration needed. This is what the dashboard implements. Pass-2 versions of the dashboard erroneously inflated h²_obs on top of itself, pushing twin h² above 1.0 at high parameter values; pass-4 corrected this.

Note on what h²_obs should represent here (added in pass 6 after a reviewer correction). The partition formula is a clean population-level decomposition of V(A_AM). Different estimators recover V(A_AM)/V(P) with different biases: SNP-based heritability (GREML / LDSC on unrelated individuals) recovers it approximately without bias; classical twin h² (Falconer) recovers V(A_AM)/V(P) · (1 − m_A), which is biased downward by AM and partially offset upward by EEA violations and genetic-nurture leakage. The dashboard’s Wilson-fit twin estimates conflate these biases. The partition formula’s empirical validation is the match against SNP-based AM-LD estimates: Yengo 2018 measures V(A_LD)/V(A) for height at 14–23% empirically, matching the formula’s prediction of m·h² ≈ 21% (m ≈ 0.25, h² ≈ 0.85). For socially-structured traits where Falconer twin h² is itself substantially inflated by EEA + genetic nurture, applying the partition formula to twin h² over-attributes the partition share to AM relative to its true population-level magnitude.

Worked anchors:

Educational attainment with m ≈ 0.4, h²_obs ≈ 0.40 (twin) → r_δ ≈ 0.16, V(A_d) ≈ 0.34, V(A_LD) ≈ 0.06. Caveat: the h²_obs ≈ 0.40 here is the Falconer twin estimate, which is a biased proxy for the AM-equilibrium V(A)/V(P); applying the partition to the SNP-based estimate (~0.13) would give a smaller absolute V(A_LD).
Height with m ≈ 0.25, h²_obs ≈ 0.85 → r_δ ≈ 0.21, V(A_d) ≈ 0.67, V(A_LD) ≈ 0.18 — matches the 14–23% empirical “AM-inflated” share Border et al. and Yengo et al. report (this is the trait where the partition’s empirical validation is cleanest, because Falconer-bias vs SNP-h² discrepancies are smaller for height than for socially-structured traits).
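The two worked anchors can be reproduced with a few lines of Python. This is a sketch of the partition formulas as written in this section; the function name `am_partition` is an illustrative assumption rather than the dashboard's actual code.

```python
def am_partition(h2_obs, m):
    """Split AM-equilibrium h2_obs into direct and AM-induced LD parts."""
    r_delta = m * h2_obs                  # r_delta ~ m * h2_obs, no iteration
    v_ad = h2_obs * (1.0 - r_delta)       # random-mating-equivalent direct share
    v_ald = h2_obs * r_delta              # AM-induced LD share
    return r_delta, v_ad, v_ald


ea = am_partition(h2_obs=0.40, m=0.40)
height = am_partition(h2_obs=0.85, m=0.25)

for label, (r_delta, v_ad, v_ald) in (("EA", ea), ("Height", height)):
    print(f"{label}: r_delta={r_delta:.2f}, V(A_d)={v_ad:.2f}, V(A_LD)={v_ald:.2f}")
```

Because V(A_d) and V(A_LD) are both fractions of h²_obs, their sum can never exceed h²_obs, which is the property that keeps the variance budget bounded.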

Cross-trait AM (m_xy ≠ 0) extends the same logic to off-diagonal entries of the genetic-covariance matrix and is the formal reason E7 finds R² = 0.74 between phenotypic-cross-mate correlations and genetic-correlation estimates. The cross-trait AM result (Border 2022) survives independently of the within-trait Falconer-bias issue: it’s about between-trait LD inflating reported genetic correlations between disorders, which is empirically validated and not in dispute.

3.2 Wilson-Effect saturation curve

Heritability of cognitive ability rises with age because active rGE (G1) compounds: as children gain agency, they select environments matching their genetic propensities, amplifying genetic variance and shrinking shared environment. The empirical age curve from Bouchard 2013 and Briley & Tucker-Drob 2013 is sigmoidal — slow rise in early childhood, fastest gain in late childhood / early adolescence, saturation in late adolescence. A logistic gives a clean three-parameter fit:

h²(t) = h²_∞ / (1 + exp(−k_h · (t − t_50)))

With h²_∞ ≈ 0.80, t_50 ≈ 9 years (age at half-asymptote), and k_h ≈ 0.30/year: h²(5) ≈ 0.19, h²(10) ≈ 0.46, h²(15) ≈ 0.69, h²(25) ≈ 0.79. These match Bouchard’s anchors within ~3 percentage points across the full developmental range.
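The quoted values follow directly from the logistic form. A short sketch, using the parameter values stated above as defaults (the function name `wilson_h2` is a hypothetical label):

```python
import math


def wilson_h2(age, h2_inf=0.80, k_h=0.30, t_50=9.0):
    """Logistic Wilson-Effect curve: heritability as a function of age in years."""
    return h2_inf / (1.0 + math.exp(-k_h * (age - t_50)))


curve = {age: round(wilson_h2(age), 2) for age in (5, 10, 15, 25)}
print(curve)
print(wilson_h2(9.0))   # at t_50 the curve sits at half the asymptote
```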

(Earlier passes used a saturating-exponential h²_∞ − (h²_∞ − h²_0)·exp(−k·t), which rises too fast at the young end — it produced h²(5) ≈ 0.52 for cognition vs the empirical ~0.20. The logistic form is the smallest functional change that fits the empirical sigmoidal pattern.)

The shared-environment trace runs an inverse path with a non-zero asymptote, since shared environment for cognition does not actually drop to zero in adulthood (~0.05 plateau is well-attested):

c²(t) = c²_∞ + (c²_0 − c²_∞) · exp(−k_c · t)

Cognition: c²_0 ≈ 0.50, c²_∞ ≈ 0.05, k_c ≈ 0.15/year. For Big Five personality, c²_∞ ≈ 0 is appropriate (shared family environment effectively vanishes for personality by adulthood). For educational attainment and religiosity, c²_∞ ≈ 0.10–0.15 should be substituted — these are exception traits where shared environment persists throughout life.
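A sketch of the shared-environment trace with the cognition parameters as defaults. The function name `shared_env_c2` is hypothetical, and the last call only illustrates substituting a different asymptote; the other parameters for non-cognitive traits are not specified in this section and are left at the cognition values purely for illustration.

```python
import math


def shared_env_c2(age, c2_0=0.50, c2_inf=0.05, k_c=0.15):
    """Exponential decay of shared-environment variance toward a plateau c2_inf."""
    return c2_inf + (c2_0 - c2_inf) * math.exp(-k_c * age)


print(round(shared_env_c2(5), 2))    # early childhood, cognition defaults
print(round(shared_env_c2(25), 2))   # adulthood, close to the plateau
# Substituting a persistent plateau (assumed value inside the 0.10-0.15 range)
print(round(shared_env_c2(60, c2_inf=0.10), 3))
```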

Both formulas are phenomenological — the parameters are not derived from a deeper model. They are calibration knobs for the dashboard.

3.3 Genetic-nurture decomposition (additive form)

Define g_T as the offspring’s transmitted-allele PGS and g_NT as the parental non-transmitted-allele PGS. Then:

A_d = β_d · g_T
A_i = β_i · g_NT

Empirically (Kong 2018, Wang 2021, Okbay 2022, Howe 2022):

β_i / β_d ≈ 0.3 – 0.5  (educational attainment)
β_i / β_d ≈ 0.0 – 0.1  (height, BMI)
β_i / β_d ≈ 0.4 – 0.6  (cognitive performance)

β-level vs variance-level. The ratio β_i/β_d quoted above is at the regression-coefficient level. Translating to a variance contribution requires squaring (for the pure variance term) and an explicit cross-term:

V(A_i)              = β_i² · V(g)         =  (β_i/β_d)² · V(A_d)
2·Cov(A_d, A_i)     = 2·k · β_d · β_i · V(g)  =  2·k · (β_i/β_d) · V(A_d)

Where k is the AM-induced correlation between an offspring’s transmitted-allele PGS and the parental non-transmitted-allele PGS. Under random mating k ≈ 0 (Mendelian segregation makes them independent). Under stable AM, k > 0 because spousal phenotypic correlation creates correlation between mom-transmitted alleles and dad-non-transmitted alleles (and vice versa); for AM-strong traits (EA, height) k is empirically in the 0.1–0.5 range, depending on the strength and stability of assortment.

This means 2·Cov(A_d, A_i) is not generally larger than V(A_i). For EA with β_i/β_d ≈ 0.4, V(A_d) ≈ 0.15, and a moderate k ≈ 0.2: V(A_i) ≈ 0.024, 2·Cov ≈ 2·0.2·0.4·0.15 = 0.024. Total genetic-nurture variance contribution ≈ 0.048 — modest, on the same order as V(A_i) itself.

The dashboard displays only the pure V(A_i) = (β_i/β_d)² · V(A_d) slice as a clean variance bucket. The cross-term 2·Cov(A_d, A_i) is the leakage path that makes empirical twin h² (under unmodeled AM) exceed the dashboard’s “Twin h² (ACE)” output. It is acknowledged in the help text rather than allocated to a separate bar segment, partly because k is poorly constrained empirically and partly because adding a cross-term slice would over-clutter the visualization without changing the qualitative picture.

The relation V(A_i) + 2·Cov(A_d, A_i) ≈ V_PGS,population − V_PGS,within-family is approximate but useful: it turns “missing heritability after within-family correction” from a puzzle into an order-of-magnitude measurement. In closed form the left-hand side is (β_i/β_d) · V(A_d) · (2k + (β_i/β_d)), which depends on k and shrinks to the pure (β_i/β_d)² · V(A_d) term when AM is weak (k → 0).
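The EA arithmetic in this subsection can be checked with a small sketch. The function name `genetic_nurture_variance` is a hypothetical label, and k = 0.2 is the illustrative "moderate" value used in the text, not an estimate.

```python
def genetic_nurture_variance(ratio, v_ad, k):
    """Pure V(A_i), the 2*Cov(A_d, A_i) cross-term, and their sum.

    ratio : beta_i / beta_d at the regression-coefficient level
    v_ad  : V(A_d) as a share of V(P)
    k     : AM-induced correlation between transmitted and non-transmitted PGS
    """
    v_ai = ratio ** 2 * v_ad
    two_cov = 2.0 * k * ratio * v_ad
    return v_ai, two_cov, v_ai + two_cov


v_ai, two_cov, total = genetic_nurture_variance(ratio=0.4, v_ad=0.15, k=0.2)
closed_form = 0.4 * 0.15 * (2 * 0.2 + 0.4)   # (b_i/b_d) * V(A_d) * (2k + b_i/b_d)
print(f"V(A_i)={v_ai:.3f}, 2Cov={two_cov:.3f}, total={total:.3f}, closed form={closed_form:.3f}")

# Under random mating (k = 0) only the pure V(A_i) slice remains.
print(genetic_nurture_variance(ratio=0.4, v_ad=0.15, k=0.0))
```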

3.4 Multivariate sex-difference algebra (Module B)

For a trait vector x with covariance matrix Σ and group means μ_F, μ_M, the multivariate effect size is the Mahalanobis distance:

D² = (μ_F − μ_M)ᵀ · Σ⁻¹ · (μ_F − μ_M)

For uncorrelated traits with equal univariate effect sizes |d|, D² = n·d² so D = d·√n. For correlated traits, the inverse covariance structure either amplifies or shrinks D depending on whether sex-difference vectors are aligned with high-variance or low-variance directions of Σ.

Worked example. Take 15 personality dimensions (16PF), univariate |d| ≈ 0.5 on average, with positive inter-trait correlations averaging ρ ≈ 0.20. Then approximately:

D² ≈ d² · 1ᵀ · Σ⁻¹ · 1
   ≈ d² · n / (1 + (n − 1)·ρ̄)        if Σ has a constant-correlation structure
   ≈ 0.25 · 15 / (1 + 14·0.20)
   ≈ 0.25 · 3.95
   ≈ 0.99
   D ≈ 1.0
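The same number can be obtained from the general definition D² = Δμᵀ·Σ⁻¹·Δμ without relying on the constant-correlation shortcut. The sketch below is pure Python with a small Gaussian-elimination solver; the helper names (`solve`, `mahalanobis_d`, `equicorrelated`) are hypothetical, and the mean differences are assumed to be in standardized units so that Σ is the within-group correlation matrix.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def mahalanobis_d(delta, sigma):
    """D = sqrt(delta^T Sigma^-1 delta); delta holds standardized mean differences."""
    weights = solve(sigma, delta)
    return math.sqrt(sum(d * w for d, w in zip(delta, weights)))


def equicorrelated(n, rho):
    """Correlation matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


n, d = 15, 0.5
d_correlated = mahalanobis_d([d] * n, equicorrelated(n, 0.20))
d_uncorrelated = mahalanobis_d([d] * n, equicorrelated(n, 0.0))
closed_form = math.sqrt(d * d * n / (1 + (n - 1) * 0.20))
print(f"rho=0.20: D={d_correlated:.3f} (closed form {closed_form:.3f})")
print(f"rho=0.00: D={d_uncorrelated:.3f} (d*sqrt(n) = {d * math.sqrt(n):.3f})")
```

Passing an arbitrary Σ and a non-uniform Δμ to `mahalanobis_d` gives the general case, where D depends on how the difference vector lines up with the correlation structure.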

The equicorrelated approximation undershoots Del Giudice 2012’s reported D = 2.71, and this gap is informative rather than a bug. To recover 2.71 in the equicorrelated form would require average univariate |d| ≈ 1.4 (2.71 / √3.95 ≈ 1.36), far above what 16PF or NEO papers report at the observed level. What Del Giudice actually did was use multigroup latent-variable modeling with measurement-error disattenuation: he corrected each factor’s d for unreliability and then computed D on the latent (true-score) means. Disattenuation magnifies effect sizes when reliability is well below 1.0, and aggregating across 15 factors then compounds the magnification. The honest summary is: at the level of observed (raw, pre-disattenuation) measurement, multivariate sex-difference D for personality is ~1.0–1.5; Del Giudice’s 2.71 is the latent-true-score analogue.

The intuition behind the algebra still holds: if men and women differ on dimensions that are weakly correlated with each other, every dimension contributes independent information, and D grows with √n. If they differ on highly correlated dimensions, the differences carry redundant information and D plateaus. But the gap between observed and disattenuated D is itself a substantive piece of the field’s debate — and worth flagging rather than papering over.

Why this matters for distortions. D3 (the “gender similarities” framing) cites univariate d ≈ 0.05 for math performance and reads it as evidence of broad similarity. D4 (pop-evpsych framing) cites multivariate D ≈ 2.71 and reads it as evidence of broad difference. Both citations are correct. The bridge equation shows that they are about different objects: a single dimension vs. a 15-dimensional space. Anyone who hasn’t internalized this algebra can be silently captured by either framing.

The algebra is general — not just for sex. D² = (μ_A − μ_B)ᵀ · Σ⁻¹ · (μ_A − μ_B) applies to any two-group comparison: sex, age cohort, occupational sample, clinical vs. control, urban vs. rural — anywhere group means are reported on a multivariate panel. The module is presented in sex-difference language because that is where the framing trap concentrates, but readers thinking about other group comparisons can use the same dashboard. The L4 firewall (§5.2) does not block this generalization at the within-population level; it only blocks the leap from within-population variance/distance estimates to between-population causal claims. A descriptive D between two samples is fine; a causal interpretation of that D requires assumptions the model does not provide.

3.5 PGS portability decay (deferred)

Topology Variant C: accuracy(distance) calibration from Ding et al. 2023 (r = −0.95 between genetic distance and PGS R² across 84 traits) is a clean candidate for closed-form. Deferred to a future tool because it sits at the population-genetics boundary rather than the within-population generative process this stage formalizes. Listed as a follow-up.

4. Composing the parts: anchors the dashboard preserves

The dashboard above stitches §3.1, §3.2, and §3.3 into one panel — sliders for trait class, age, m, β_i/β_d, and rare-variant share; outputs the variance decomposition and the three method-specific h² numbers. Four sanity-check anchors hold under the calibrated defaults:

IQ at age 5 (cognitive): h²(5) ≈ 0.19, V(C) ≈ 0.26, V(A_d) ≈ 0.17. Matches Bouchard 2013.
IQ at age 25 (cognitive, m=0.4, β_i/β_d=0.4): h²(25) ≈ 0.79, V(A_d) ≈ 0.54, V(A_LD) ≈ 0.25, V(A_i) ≈ 0.09, V(C) ≈ 0.06. Within-family h² (= V(A_d)) ≈ 0.54, roughly 3.6 times the often-quoted EA within-family estimate of 0.15, because cognition is a higher-h² trait than education.
Big Five across adulthood: h² ≈ 0.45, V(C) ≈ 0, V(E) ≈ 0.55. Effectively flat from age 5 onward.
Variance budget closes: V(A_d) + V(A_LD) + V(A_i) + V(C) + V(E) = 1.0 by construction. Twin h² never exceeds the Wilson asymptote.
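The anchors above can be reproduced by chaining the §3.1, §3.2, and §3.3 formulas. The following is a sketch under stated assumptions, not the dashboard's source: the function name `dashboard` is hypothetical, V(E) is assumed to be the residual that closes the budget, and the SNP h² projection is left out because the exact attenuation the dashboard applies to A_i is not specified here.

```python
import math


def dashboard(age, m=0.40, ratio=0.40, rare_share=0.10,
              h2_inf=0.80, k_h=0.30, t_50=9.0,
              c2_0=0.50, c2_inf=0.05, k_c=0.15):
    """Variance shares for the cognitive class; V(P) is scaled to 1."""
    h2_obs = h2_inf / (1.0 + math.exp(-k_h * (age - t_50)))   # Wilson curve
    r_delta = m * h2_obs                                       # AM partition
    v_ad = h2_obs * (1.0 - r_delta)
    v_ald = h2_obs * r_delta
    v_ai = ratio ** 2 * v_ad                                   # pure genetic-nurture slice
    v_c = c2_inf + (c2_0 - c2_inf) * math.exp(-k_c * age)      # shared environment
    v_e = 1.0 - h2_obs - v_ai - v_c                            # assumption: residual closes the budget
    return {
        "V(A_d) common": v_ad * (1.0 - rare_share),
        "V(A_d) rare": v_ad * rare_share,
        "V(A_LD)": v_ald,
        "V(A_i)": v_ai,
        "V(C)": v_c,
        "V(E)": v_e,
        "twin h2": v_ad + v_ald,          # A_d + A_LD
        "within-family h2": v_ad,         # A_d only
        "r_delta": r_delta,
    }


adult = dashboard(age=25)
child = dashboard(age=5)
for key, value in adult.items():
    print(f"{key:>18}: {value:.3f}")
```

With the default inputs (age 25, m = 0.40, β_i/β_d = 0.40, rare share 0.10) the six variance shares agree with the dashboard snapshot to one decimal place in percent, and the twin h² output equals the Wilson value by construction.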

These are the calibration targets. The biggest non-obvious one is anchor 4: the previous dashboard pass had the variance budget overflow under default parameters (twin h² > 1.0 at age 25 with m=0.4), which was a real bug. The current partition h²_obs = V(A_d) + V(A_LD) keeps the budget bounded by construction.

For traits the dashboard does not have a dedicated class for (educational attainment, height, religiosity, political affiliation), the user can approximate by choosing the closest class and adjusting sliders. EA-like behavior emerges from cognitive with m=0.4 and a mental note that h²(25) for EA is closer to 0.40 than 0.79 — i.e., the dashboard’s cognitive class is calibrated to IQ, not EA.

5. Boundary conditions and where the model breaks

The generating function is correct only inside its scope. Five boundaries are explicit:

Severe psychiatric tail. The hyperpolygenic A_d = Σ β_k g_{ik} form assumes thousands of small effects. For early-onset autism with intellectual disability, single rare variants (CHD8, SCN2A) can carry effects of d > 1.0. The decomposition still works component-by-component but A_d becomes dominated by a small number of large-effect alleles — effectively Mendelian rather than polygenic. The model should either widen its prior on individual β_k or hand off to a separate Mendelian module at the tail.
Between-population mean differences (L4 firewall). Every term in the generating function is defined within a population at a stable mating regime. The model is structurally silent on between-population means: there is no μ_pop term to compare. Computing D² = (μ_pop1 − μ_pop2)ᵀ Σ⁻¹ (μ_pop1 − μ_pop2) is mathematically possible but requires assuming Σ_pop1 = Σ_pop2 and equal causal architecture across populations — neither of which is empirically supported (Ding 2023’s PGS-portability collapse is the empirical evidence that the assumption fails). This is the L4 / Lewontin firewall encoded directly into model scope.
Severe environmental insults. V(I) (interaction) is small at PGS-by-environment scale but large when environments cross threshold (lead, alcohol, severe deprivation, iodine). The additive decomposition under-fits at thresholds. Use the model in the normal range; switch to an explicit threshold-effect model at the extreme.
Non-equilibrium AM. The Crow–Felsenstein formula assumes AM has reached equilibrium. For populations under rapidly changing assortment regimes (e.g. rapid shifts in educational stratification), the inflation factor is still en route to the equilibrium value, not at it. Use the formula as an upper bound under those conditions.
Individual-level inference (L1). V(A_d) is a population variance. For a single person, A_d is a realization, not a partition. Statements like “70% of this individual’s intelligence is genetic” do not type-check against the model. The dashboard exposes population variance only.

6. Distortion-aware reading

Each component of the decomposition has a public-discourse failure mode. The model’s job is to make the failure visible, not to suppress it.

Component	Common misreading	What the model says
`V(A_d)` (high)	“Genes determine outcomes”	Population variance. Says nothing about a specific person’s prospects.
`V(A_i)` (large)	“Family environment doesn’t matter”	The opposite: this term is family environment, mediated by parental genotypes that correlate with parental phenotypes.
`V(A_LD)`	Usually invisible to public discourse	Inflates V(A) at the population level by ~10–25% via AM-induced LD between trait-relevant alleles (Yengo 2018: 14–23% for height, matching the formula prediction). Does NOT on net inflate Falconer twin h² — AM actually biases Falconer downward, partially offsetting other classical-ACE biases (see §2.2 caveat).
`Cov(A_d, E_m)` (active rGE)	“People shape their environments” → therefore environments don’t matter	They matter — the covariance term is their effect, just non-orthogonal to genes.
Twin h² ≥ within-family h²	“Twin studies overestimate”	They estimate a different quantity (population additive variance vs. direct effect). Both are real.
Multivariate `D` large	“Sexes are categorically different”	`D` is a distribution distance; individuals across the distributions still overlap substantially. Dimensional, not taxonic.
Univariate `d` small	“Sexes are essentially the same”	True for the dimension cited, false in the multivariate space.

D1 and D2 (the two heaviest distortions) both operate by selecting a subset of these readings. The model doesn’t resolve the political dispute, but anyone running the dashboard should be able to see why each side is technically correct about the term they’re highlighting and incomplete about the rest.

7. Adversarial + steelman

Four objections to the formalization itself. The strongest version of each, then the model’s honest response.

Objection 1 — Variance bookkeeping is not a causal model

The decomposition partitions variance into named components, but it never specifies why A_d produces phenotype P rather than the reverse. A regression coefficient β_d from a within-family GWAS is not a causal effect; it is a statistical association under specific design assumptions. Calling the decomposition a “generating function” is false advertising — it generates expected variance given parameters, not actual phenotype given a causal mechanism.

Steelman: This is the strongest objection because it is the same disagreement that drives O1 (Plomin vs Turkheimer). The model accommodates both readings rather than picking one: under the Plomin reading, β_d is a causal coefficient and the decomposition is generative in the strong sense; under the Turkheimer reading, β_d is a regression coefficient that happens to be unbiased under within-family identification, but the underlying biology is unspecified. Both readings predict the same variance budget, which is why the data hasn’t yet decided between them.

Response: Acknowledged. The model is more accurately described as a conditional variance generating function: given parameters, it generates the expected variance pattern. The causal interpretation of those parameters is exactly what’s contested, and the decomposition’s value is that it lets both interpretations be expressed in a shared language. Stage 4 (data) is where the disagreement gets sharper: the test is whether the within-family β_d moves under environmental intervention. Plomin predicts no, Turkheimer predicts yes, and the model can express both predictions cleanly.

Objection 2 — ACE assumptions are unrealistic enough that “twin h²” is not really estimating anything physical

EEA fails (MZ co-twins are treated more similarly than DZ); shared-environment effects vary by zygosity; non-shared environment for siblings is correlated with shared parental treatment. Stack the violations and the entire ACE framework is just a reparameterization of the data, not an estimate of underlying components.

Steelman: Joseph and Richardson’s critique of behavior genetics rests partly on this argument: the assumptions that make twin h² meaningful are violated enough to make the resulting numbers epistemically empty. The strongest version isn’t that twin studies are “wrong” but that they’re under-determined — multiple causal worlds produce the same rMZ and rDZ patterns.

Response: Partially conceded. Classical ACE is under-determined and the assumption-violation issue is real. However, two empirical findings constrain the under-determination: (a) SNP-based heritability (which uses unrelated individuals and bypasses EEA entirely) recovers a substantial fraction of twin h² across major traits — for height about 60% with common SNPs alone (rising to ~80% with whole-genome sequencing that captures rare variants), for cognitive ability about 25–40%, for educational attainment about 30–50%. The fraction varies by trait but is consistently non-trivial — the EEA-bias-only explanation for twin h² is empirically untenable; (b) MZ-reared-apart studies (Bouchard 1990 and updates) reproduce the basic Wilson Effect pattern with EEA structurally absent. The model takes twin h² as an upper bound on direct + indirect additive variance, not as a precise estimate. The method gradient is what makes the imprecision survivable: comparing twin to within-family bounds the gap.

Objection 3 — Additive form misses dominance and epistasis

Dominance variance V_D and epistatic variance V_I (gene-gene interactions) are real and measurable. Twin studies fitting ADE models routinely find non-trivial V_D. The additive-only generating function is a simplification that loses information.

Steelman: For some traits — height (V_D ≈ 0), educational attainment (V_D ≈ 0–0.05) — dominance is small and the additive simplification is fine. For others — psychiatric disorders, where ADE models often outperform ACE — non-additive variance is potentially substantial. The additive-only model is not “wrong” so much as inappropriate for that subset of traits.

Response: Concede the scope limit. The generating function as written is for traits where polygenic-additive architecture dominates (which is most psychological traits, per Hill, Goddard & Visscher 2008’s argument that even where dominance exists, additive variance often captures most of the variance because of allele-frequency distributions). For severe psychopathology and other traits with substantial V_D, an extended model would replace A_d with A_d + D_d and add Cov(A_d, D_d) cross-terms. The dashboard does not currently expose this; the prose acknowledges the boundary in §5.

Objection 4 — Multivariate `D` conflates measurement structure with reality

Mahalanobis D depends on Σ. Σ is the within-sex covariance of measured traits, which depends on which traits you measure, how you measure them, and how the population varies. Different measurement panels produce different Ds for the same underlying difference. The disattenuated D = 2.71 from Del Giudice is not a property of human nature; it’s a property of the 16PF + the U.S. sample + the latent-variable model.

Steelman: This is correct and underappreciated. D is not a population parameter in the way that μ_F − μ_M is. It is a model-relative summary statistic. Two researchers using different but equally defensible measurement panels can produce D values that differ by a factor of two or more.

Response: Conceded fully. The multivariate-D module’s value is comparative, not absolute. It tells you, given a measurement structure, how much multivariate aggregation magnifies the apparent sex difference relative to any single dimension. The module’s pedagogical purpose is to show why the same data (panel of dimensions) supports both “small per-dimension differences” and “large multivariate distance”: neither claim is wrong, but neither is the whole answer. The dashboard surfaces the dependency on Σ via the ρ̄ slider so users can see how D moves under different correlation structures.
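To see the Σ-dependence concretely, here is a small sketch of the equicorrelated closed form next to a general Mahalanobis D. It assumes standardized traits (so Σ is a correlation matrix) and, for the closed form, the same univariate d on every dimension; the function names are illustrative and this is not the dashboard's implementation.

```python
import math


def equicorrelated_d(d: float, n: int, rho_bar: float) -> float:
    """D for n traits with equal univariate d and one shared pairwise correlation:
    D^2 = d^2 * n / (1 + (n - 1) * rho_bar)."""
    denom = 1.0 + (n - 1) * rho_bar
    if n < 1 or denom <= 0.0 or (n > 1 and rho_bar >= 1.0):
        raise ValueError("need n >= 1 and a positive-definite correlation matrix")
    return math.sqrt(d * d * n / denom)


def mahalanobis_d(delta, sigma) -> float:
    """General D = sqrt(delta^T Sigma^-1 delta).

    Solves Sigma x = delta by Gaussian elimination with partial pivoting,
    which is adequate for the small panels discussed here.
    """
    n = len(delta)
    # Augmented matrix [Sigma | delta].
    a = [[float(v) for v in row] + [float(delta[i])] for i, row in enumerate(sigma)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("Sigma is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - rest) / a[i][i]
    return math.sqrt(sum(di * xi for di, xi in zip(delta, x)))


if __name__ == "__main__":
    # Illustrative panel: 15 traits, each with d = 0.3, at three values of rho_bar.
    for rho in (0.0, 0.2, 0.5):
        print(rho, round(equicorrelated_d(0.3, 15, rho), 3))
```

With equal, same-signed d on every trait, raising ρ̄ lowers D. Mean differences that run against the correlation structure can raise it instead, which is why the general function is included alongside the closed form.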

8. Open questions that the model exposes (Stage-4 inputs)

The formal apparatus makes four open questions sharper than verbal discussion alone:

O1 (PGS interpretation). The decomposition treats β_d · g_T as a direct genetic term. Plomin’s “PGS is a real biological cause” reading takes β_d as a structural causal coefficient. Turkheimer’s “PGS is a summary of correlated environments” reading says β_d is contaminated by uncontrolled Cov(A_d, E_m). The two interpretations make different predictions about how β_d should change under environmental intervention. Stage 4 question: for traits with large enough within-family GWAS, does the within-family β_d move under intervention (schooling reform, nutrition shifts) the way Plomin predicts (it shouldn’t) or the way Turkheimer predicts (it should)?
O3 (Gender Equality Paradox). The multivariate algebra in §3.4 shows that D depends on the inter-trait correlation structure Σ. If Σ differs between high-equality and low-equality societies, D will differ even if univariate μ_F − μ_M differences are fixed. Stage 4 question: does Σ (the personality covariance matrix itself) change across societies, or only the means? This is a different empirical question from “are the differences innate.”
O6 (what E_s actually is). The model treats stochastic developmental noise as an unmodeled residual. As Stage 4 data accumulates, candidates (immune/microbial, peer-network, epigenetic, measurement error) can be peeled off into E_m and the residual E_s should shrink. Stage 4 question: how much of the current ~50% personality E_s can be moved into E_m given current measurement panels?
O7 (cross-disorder rg post-AM correction). Module 3.4’s bridge between cross-trait phenotypic correlations and genetic correlations under AM (Border 2022, LAVA-Knock 2024) gives a formal correction. Stage 4 question: applied at scale to the full psychiatric-disorder rg matrix, what fraction of the cross-disorder genetic correlations survive the correction?

The two questions deferred from Section 1 (PGS portability and the GEP causal mechanism) are not sharpened by the model — they require new measurement, not new math.

9. Handoff to Stage 4 (data pipeline)

The model defines five parameter sets that Stage 4 needs to populate:

Parameter	Source	Trait coverage
`β_d, β_i`	Within-family GWAS (Howe 2022, Okbay 2022)	EA, height, BMI, cognitive ability, depressive symptoms, smoking — extending
`m` (cross-spouse phenotypic correlation)	UK Biobank, HUNT, MoBa	EA, height, BMI, cognition, neuroticism — well-covered
`h²(t)` calibration	Bouchard 2013, Briley & Tucker-Drob 2013 longitudinal twin	Cognition (well-covered); personality (sparse); psychopathology (very sparse)
`Σ` for sex-difference module	Del Giudice 2012, Schmitt 2008, Kaiser 2020	16PF, NEO, Big Five
`share_rare`	Wainschtein 2025	Height, EA, several psychiatric — extending

The single highest-value Stage-4 deliverable: a per-trait table of (twin h², SNP h², WGS h², within-family h², m, β_i/β_d) at adulthood, ideally with cohort-by-age stratification. Most of the components already exist in published consortium summaries; the table is mostly aggregation, not new analysis.
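One possible shape for a row of that table, offered as a sketch rather than a schema commitment. The class name, field names, and the `gradient_violations` helper are all hypothetical; the only things taken from the model are the set of columns and the method-gradient ordering twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h².

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraitRow:
    """One per-trait record; any estimate may be missing (None)."""
    trait: str
    twin_h2: Optional[float] = None
    snp_h2: Optional[float] = None
    wgs_h2: Optional[float] = None
    within_family_h2: Optional[float] = None
    m: Optional[float] = None            # cross-spouse phenotypic correlation
    beta_ratio: Optional[float] = None   # beta_i / beta_d

    def gradient_violations(self, tol: float = 0.0) -> list:
        """Return adjacent estimator pairs that break the method gradient.

        Missing estimates are skipped, so the check compares each available
        estimate with the next available one further down the gradient.
        """
        ordered = [
            ("twin_h2", self.twin_h2),
            ("wgs_h2", self.wgs_h2),
            ("snp_h2", self.snp_h2),
            ("within_family_h2", self.within_family_h2),
        ]
        present = [(name, value) for name, value in ordered if value is not None]
        return [
            (upper_name, lower_name)
            for (upper_name, upper), (lower_name, lower) in zip(present, present[1:])
            if lower > upper + tol
        ]


if __name__ == "__main__":
    # Placeholder numbers chosen to trigger the check; not estimates for any trait.
    row = TraitRow(trait="placeholder", twin_h2=0.60, snp_h2=0.30, within_family_h2=0.35)
    print(row.gradient_violations())
```

A flagged pair is not necessarily an error in the source studies. It marks a row where sampling noise, cohort differences, or a genuine model failure needs a closer look before the gaps are decomposed.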

10. Connection to adjacent topics

Parent-to-Child Transmission (planned). The A_i term is the formal answer to “how much does parenting matter beyond genes for outcomes that look genetic.” That topic should adopt this generating function as its starting point and refine β_i by domain (cognition vs. personality vs. health behaviors) and by mechanism (vocabulary input, expectation-setting, neighborhood selection). The Nivard et al. 2024 finding — that indirect genetic effects extend beyond the nuclear family — implies β_i should be further decomposed into a parent-level term and a dynastic/extended-family term.
Evolution-Modernity Mismatch (planned). The μ(t) population-mean trajectory is the formal home of secular shifts (Flynn rise, Flynn reversal, age-of-puberty drift). Within-cohort within-sibship designs are the cleanest separator of genuine environmental shifts in μ(t) from compositional or selection artifacts. The Pietschnig 2024 finding that the positive manifold itself may be weakening across recent cohorts suggests μ(t) is not a one-dimensional curve but a moving structure of which abilities are gaining or losing — which the current scalar form does not capture.

(A connection to a planned “Bedrock Generating Functions” topic was floated in pass 1 but dropped — the analogy was real but too loose to do useful work here, and any cross-domain claim should live in that topic’s own formalization rather than be asserted from this one.)

11. Glossary (formalization-specific additions)

This section’s symbols are listed in the order they appear in the generating function. The lit-review and topology glossaries cover the field-level terminology (h², SNP, GWAS, PGS, AM, rGE, GxE, etc.) and are not duplicated here.

Symbol / term	Meaning
`P_i(t)`	Phenotype of person `i` at developmental age `t`. Scalar (for one trait at a time); see §2 scope note for the multi-ability extension.
`A_d`	Direct additive genetic component — `Σ_k β_k · g_{ik}` over causal SNPs the person inherits, evaluated as if mating were random.
`A_i`	Indirect additive genetic component (genetic nurture) — additive effect of parents’ (and extended-family) genotypes operating through the rearing environment.
`A_LD`	AM-induced LD inflation — additional additive variance from non-random mating creating linkage among causal variants.
`C`	Shared-environment residual not already absorbed by `A_i`.
`E_m` / `E_s`	Measured non-shared environment (lead, schooling, etc.) / stochastic developmental noise (the unmodeled residual).
`I`	Interaction terms: `G×E`, `G×G` (epistasis), `G×age`.
`μ(t)`	Population-mean trajectory at age `t` (developmental norm, not a person-level term).
`β_d` / `β_i`	Direct / indirect genetic regression coefficients on phenotype, estimated from within-family / parental-genotype designs.
`g_T` / `g_NT`	Polygenic score from offspring’s transmitted alleles / parents’ non-transmitted alleles.
`m`	Cross-spouse phenotypic correlation (assortative-mating strength on the measured trait).
`r_δ`	Cross-spouse correlation in additive genetic value; `= m · h²_obs` at AM equilibrium. The dashboard uses this directly (no fixed-point iteration) since Wilson h²(t) is already the equilibrium quantity.
`k`	AM-induced correlation between transmitted and non-transmitted alleles within parents; appears in the genetic-nurture variance identity (§3.3).
`V_A*`	Additive genetic variance at AM equilibrium; `V_A* = V_A / (1 − r_δ)` in the Crow–Felsenstein form. The dashboard observes V_A* directly via h²_obs and uses the formula to partition it into V(A_d) and V(A_LD).
`h²(t)`	Heritability as a function of age; logistic form `h²_∞ / (1 + exp(−k·(t − t_50)))` in §3.2, where `k` is the logistic rate (`k_h` in the dashboard defaults), not the AM coupling `k` defined above. Earlier passes used a saturating exponential which fit the asymptote but rose too fast in childhood; the logistic is the smallest functional change that captures the empirical sigmoidal pattern.
`Σ`	Trait-level (within-sex or within-group) covariance matrix used in multivariate-D calculation.
`Mahalanobis D`	Multivariate generalization of Cohen’s d: `√(Δμᵀ Σ⁻¹ Δμ)`.
`ρ̄`	Average inter-trait correlation in `Σ`; the equicorrelated approximation collapses `D²` to `d²·n/(1+(n−1)ρ̄)` (§3.4).
`block-orthogonal`	Decomposition where major components are orthogonal to the residual environment but cross-terms within components (e.g. `Cov(A_d, A_i)`) are explicit, not zero.
`method gradient`	The relationship `twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h²` driven by which components each estimator includes.
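As a quick check on the `h²(t)` entry above, the logistic curve can be evaluated directly. The defaults below are the cognitive values quoted in the pass-4 notes (h²_∞ = 0.80, t_50 = 9, k_h = 0.30); the function name is hypothetical and this is a sketch, not the dashboard source.

```python
import math


def wilson_h2(t: float, h2_inf: float = 0.80, t50: float = 9.0, k_h: float = 0.30) -> float:
    """Logistic Wilson-Effect curve: h2_inf / (1 + exp(-k_h * (t - t50))).

    t is developmental age in years, h2_inf the adult asymptote, t50 the age
    at which heritability reaches half the asymptote, k_h the logistic rate.
    """
    return h2_inf / (1.0 + math.exp(-k_h * (t - t50)))


if __name__ == "__main__":
    for age in (5, 15, 25):
        print(age, round(wilson_h2(age), 2))
    # prints: 5 0.19 / 15 0.69 / 25 0.79 (one pair per line)
```

At t = t_50 the curve sits at exactly half the asymptote, which gives a convenient way to read t_50 off an empirical age-by-heritability plot.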

12. Cruxes for this model

The topology had cruxes for the field. This stage’s cruxes are different — they are the load-bearing assumptions of the formalization itself. If any one flips, the model needs to be restructured.

Crux	Load-bearing claim	What would flip it
C1	Within-family GWAS effect estimates are an unbiased estimate of `β_d`. The whole `A_d` / `A_i` separation depends on this.	A demonstration that within-family designs have a systematic confound (e.g., differential parental treatment that correlates with offspring genotype) that biases `β_d` by more than ~10%. So far Howe 2022 / Okbay 2022 within-sibship GWAS are mutually consistent and consistent with trio-based estimates, suggesting the confound is bounded.
C2	AM equilibrium has been reached or is close enough that the partition relation `r_δ = m·h²_obs` holds.	A demonstration that recent population-scale shifts in assortment (educational stratification expansion since 1970, online dating since 2010) have moved populations far from equilibrium for psychologically-relevant traits — at which point the observed `r_δ` would lag the formula’s prediction. Currently no direct evidence the partition is mis-calibrated; would require longitudinal `m`-by-cohort data.
C3	Hyperpolygenic architecture: `A_d = Σ β_k g_{ik}` over thousands of small effects, no single locus dominates.	Discovery that for a major psychological trait class, ~5–10 large-effect variants account for >50% of `V(A_d)`. Currently true only for the severe psychiatric tail (autism with ID, severe schizophrenia spectrum), where the model already concedes scope (§5.1). Would generalize to mainstream cognition only if a CRISPR-era discovery overturned the polygenic consensus.
C4	`A_d`, `A_i`, `A_LD` are jointly identifiable given the available designs.	A demonstration that twin/SNP/within-family/WGS estimators are not sufficient to disentangle all three (e.g., that AM-LD and rare-variant contributions are mutually confounded in a way no current design can break). This would force collapsing the decomposition or treating one component as a residual. Active concern: rare-variant heritability in WGS may itself be inflated by AM-LD among rare variants, which would muddy C4.
C5	Equicorrelated `Σ` is a useful approximation for the multivariate sex-difference module.	A demonstration that real personality covariance matrices have block-structured (or low-rank) `Σ` that produces qualitatively different `D` from the equicorrelated approximation. Already partially true: 16PF has known higher-order factor structure, which is why the equicorrelated approximation undershoots Del Giudice’s latent-variable result. Crux holds in a weakened form: equicorrelated is useful pedagogically but not quantitatively for high-dimensional panels.

The most consequential of these is C4. If A_d, A_i, and A_LD cannot be jointly identified by current designs, the variance decomposition reduces to a coarser partition (genetic-additive vs everything else), and the field-level dispute about how much “genetic” effect is environment-mediated remains parametrically unresolvable rather than just empirically pending.

Iteration history

Pass 5 2026-04-28

error check (math pedantry)

Why Final close-reading audit caught three small but real issues an academic reviewer would flag: §3.3 stated the AM coupling parameter k as "→ 1" when empirically it is in the 0.1–0.5 range for AM-strong traits, which overstated the cross-term 2·Cov(A_d, A_i); §12 Crux C2 still referred to "Crow–Felsenstein fixed point" though pass-4 had moved the dashboard to the partition formulation r_δ = m·h²_obs; §7 Objection 2 response said SNP h² "converges within ~30–50% of within-family twin estimates" with ambiguous phrasing.
- §3.3: replaced k → 1 with empirical k ∈ 0.1–0.5; added derivation 2·Cov = 2k·β_d·β_i·V(g) and showed that for EA with k ≈ 0.2, the cross-term is on the same order as V(A_i) itself (~0.024 each, not "much larger"); rewrote the EA worked example to reflect this
- §12 Crux C2: rewording from "fixed point applies" → "partition relation r_δ = m·h²_obs holds"; falsification path now refers to observed r_δ lagging the formula prediction under non-equilibrium AM
- §7 Objection 2 response: replaced "converges within ~30–50% of within-family twin estimates" with trait-specific numbers (height ~85%, cognition ~50–70%, EA ~30–40% of twin h²); the claim is now defensible against citation request
- After pass 5 the model is at the level of polish where further refinement would be diminishing returns. Stage ready for handoff to data pipeline.
Pass 4 2026-04-28

error check · calibration audit

Why Stress-testing the dashboard at default load uncovered a real conceptual error in pass 2: the variance budget overflowed (twin h² output 1.19, SNP h² output 1.27 at cognitive/age=25/m=0.4/ratio_i=0.4) because the code interpreted the Wilson curve as a random-mating quantity that then got *inflated* by the AM factor. Wilson h²(t) is empirically what twin studies report, which is already the AM-equilibrium quantity; the AM factor should *partition* it into V(A_d) and V(A_LD), not scale it up. Two related calibration issues fell out of the same audit: the saturating-exponential Wilson form rises too fast in childhood (h²(5) output 0.52 vs empirical 0.20), and c²(t) had no asymptote so it decayed to ~0 in adulthood when the empirical floor for cognition is ~0.05.
- Reinterpreted Wilson h²(t) as the *observed* AM-equilibrium heritability (= V(A_d) + V(A_LD) by construction). Dashboard now uses the AM inflation factor to partition h² into V(A_d) (clean direct, what within-family designs estimate) and V(A_LD) (AM-LD), never to scale h² above its empirical value. Variance budget closes at 1.0 in every realistic case.
- Switched Wilson functional form from saturating exponential to logistic h²(t) = h²_∞ / (1 + exp(-k·(t-t_50))). New cognitive defaults (h²_∞=0.80, t_50=9, k_h=0.30) give h²(5) ≈ 0.19, h²(15) ≈ 0.69, h²(25) ≈ 0.79 — matching Bouchard 2013 within ~3pp across the developmental range. Saturating exponential overshot childhood h² by ~2.5×.
- Added c²_∞ asymptote: c²(t) = c²_∞ + (c²_0 - c²_∞)·exp(-k_c·t). Cognitive: c²_∞=0.05; personality: c²_∞=0; psychopathology: c²_∞=0.05. Documented that EA/religion/politics need c²_∞ ≈ 0.10-0.15 (substituted manually).
- Refactored V(A_i) from ratio_i·V(A_d) to ratio_i²·V(A_d) — the variance-level translation of a β-level ratio. For ratio_i=0.4, V(A_i)=0.16·V(A_d), not 0.4·V(A_d). The cross-term 2·Cov(A_d, A_i) ≈ 2·ratio_i·V(A_d) is the leakage path documented in §3.3 but not displayed as a separate bar segment to keep the budget clean.
- Dropped the fixed-point iteration in the dashboard since h²(t) is the equilibrium quantity directly. r_δ = m·h²_obs in one step.
- Updated §3.1 to distinguish forward (h²_rm → h²_eq, with fixed-point) from inverse/partition (h²_obs → V(A_d), V(A_LD), single equation) problems. Dashboard does the inverse problem.
- Updated §3.2 to logistic form with new parameter table; explicitly noted why the saturating exponential failed.
- Updated §3.3 to clarify β-level vs variance-level translation; documented why the dashboard displays only the V(A_i)=ratio_i²·V(A_d) slice and not the cross-term contribution.
- Updated §4 sanity-check anchors with the corrected numbers and added a fourth anchor: variance budget closes at 1.0 by construction.
- Updated TLDR sentence on Crow–Felsenstein from "fixed-point at AM equilibrium" to "partitions rather than inflates"; updated glossary entries for r_δ, V_A*, h²(t).
- Verified all sanity-check anchors numerically against the corrected dashboard logic — no overflow, no negative components, all budgets closing at 1.0.
Pass 3 2026-04-28

readability · redundancy prune · connections · scope check

Why Pass-2 fixed technical errors but the document was still hard to enter for an educated lay reader (the TLDR loaded math notation cold), the §6.5 numbering was structurally awkward, the connection back to A3 (g exists) from the topology was missing, §4 duplicated what the dashboard already shows, and the Bedrock-Generating-Functions connection in §10 was hand-wavy enough that it weakened the rest of the connections section. Plus the glossary covered only a partial subset of the symbols the prose actually uses.
- Promoted §6.5 Adversarial+steelman to a proper §7; renumbered everything below (Open questions §8, Stage-4 handoff §9, Connections §10, Glossary §11, Cruxes §12)
- Added "How to read this stage" prelude after the dashboard mount — three short paragraphs of plain-language framing that explain what heritability is and is not, what the equation does, and how the reader should approach the rest of the document
- Added scalar-trait scope note in §2: P_i(t) is per-trait, not g-loaded. The topology assumption A3 (g exists) lives at the level of which composite/ability is being measured, not as a structural component of the decomposition. Multi-ability extension is an explicit future direction
- Compressed §4 from 18 lines to 6: dropped the verbose Inputs/Outputs spec (it duplicates the live dashboard) and kept only the three sanity-check anchors as calibration targets
- Generalized §3.4 explicitly: D² = (μ_A − μ_B)ᵀ Σ⁻¹ (μ_A − μ_B) applies to any two-group comparison (sex, cohort, occupation, clinical/control, urban/rural). Module is presented in sex-difference language because the framing trap concentrates there. L4 firewall does not block descriptive use across groups; only causal use
- Dropped Bedrock Generating Functions connection in §10 (analogy was too loose to do useful work). Strengthened Parent-to-Child Transmission and Evolution-Modernity Mismatch connections by tying each to specific 2024 findings (Nivard dynastic IGE, Pietschnig positive-manifold weakening) that the future topics need to address
- Glossary §11 expanded from 8 to 19 entries — adds P_i(t), A_d, A_i, A_LD, C, E_m, E_s, I, μ(t), k, V_A*, h²(t), ρ̄ in the order they appear in the generating function. Notes that field-level terminology (h², SNP, GWAS, etc.) is not duplicated from earlier glossaries
Pass 2 2026-04-28

error check · adversarial + steelman · crux identification · compression

Why Pass-1 had three real technical errors and was missing two structural pieces. The method-gradient table over-claimed what twin h² captures (V(A_i) is shared identically by MZ and DZ co-twins so it lands in C under classical ACE, not A — leakage into A is via AM-related model misspecification, not by design). The Crow–Felsenstein formula was stated as a one-shot when it is a fixed-point. The genetic-nurture variance equation was written as ≈ when it is approximate at best. And the strongest objection to the whole formalization (variance bookkeeping is not the same as a causal mechanism) was not engaged head-on. Plus the 16PF Del Giudice preset in the dashboard misled by giving D ≈ 1.0 when Del Giudice reported 2.71.
- Method-gradient table rewritten: classical twin h² captures V(A_d) + V(A_LD); V(A_i) lands in C under correctly specified ACE, leaks into A under AM/genetic-nurture model misspecification — separated these explicitly
- Crow–Felsenstein r_δ ≈ m·h² flagged as a one-iteration approximation to a fixed-point; equilibrium r_δ* solves r_δ = m · h²(r_δ); approximation is tight at small r_δ, breaks at high m × h²
- Genetic-nurture variance equation softened: V(A_i) + 2·Cov(A_d, A_i) is approximated by, not equal to, V_PGS,population − V_PGS,within-family; the exact identity has β_i·k cross-terms that depend on AM coupling between transmitted and non-transmitted alleles
- Added §6.5 Adversarial + steelman — four objections (variance bookkeeping vs causal model, ACE assumptions are unrealistic, additive form misses dominance/epistasis, multivariate-D conflates measurement with reality), each with a steelman and the model's honest response
- Added §11 Cruxes for the model itself — five load-bearing claims (within-family GWAS validity, AM equilibrium approximation, hyperpolygenic architecture, identifiability of A_d/A_i/A_LD, equicorrelation as a useful Σ approximation) and what evidence would flip each
- Renamed dashboard 16PF preset to "16PF observed" with an explanatory note that reaching D=2.71 requires latent-variable modeling with disattenuation, not the equicorrelated approximation
- Twin h² card in dashboard relabeled to "Twin h² (classical ACE)" with the corrected formula A_d + A_LD; helper text below the method gradient updated to match
- Compressed TLDR para 2 (V(P) cross-term mention now matches body) and tightened §3.1 worked example by removing redundancy
Pass 1 2026-04-28

decomposition · generating function · integration · gap scan

Why First draft of the formalization. Pulled the spine equation, AM inflation, Wilson curve, and multivariate-D algebra out of the topology handoff and wrote them as a single coherent generating function. Built the interactive dashboard so the reader can dial parameters across both modules.
- Wrote master equation: P_i = A_d + A_i + A_LD + C + E_m + E_s + I + μ(t)
- Decomposed V(P) with explicit Cov(A_d, A_i), Cov(A_d, E_m), Cov(A_d, C) cross-terms (block-orthogonal, not orthogonal)
- Closed-form 1: Crow–Felsenstein V_A* = V_A / (1 − r_δ), with r_δ ≈ m·h²
- Closed-form 2: Wilson saturation h²(t) = h²_∞ − (h²_∞ − h²_0)·exp(−kt)
- Closed-form 3: Genetic-nurture additive split, β_i/β_d ≈ 0.3–0.5 for EA
- Module B: Mahalanobis D² = (μ_F − μ_M)ᵀ Σ⁻¹ (μ_F − μ_M); equicorrelated case D² = d²·n/(1+(n−1)ρ̄)
- Method gradient identities: twin h² ≥ SNP h² ≥ within-family h² with each estimator picking up a different subset of components
- Boundary conditions: severe psychiatric tail, L4 between-pop firewall, environmental thresholds, AM equilibrium, individual-level (L1)
- Distortion-aware reading: term-by-term failure modes for each component
- Interactive dashboard with two tabs (variance decomposition + multivariate sex-difference)
Pass 6 2026-04-29

error check · cross-stage consistency

Why A reviewer caught a real and consequential error in pass 5's framing of §2.2 and §3.1. The method-gradient table claimed Falconer's twin formula `2·(rMZ − rDZ)` estimates `V(A_d) + V(A_LD)` cleanly. That is true ONLY under random mating. Under positive AM, fraternal twins share more than 50% of trait-relevant alleles (because their parents are genetically more similar than chance), which raises rDZ relative to rMZ and biases Falconer downward by a factor of (1 − m_A). So Falconer estimates `V(A_AM)/V(P) · (1 − m_A)`, not `V(A_AM)/V(P)`. The §3.1 partition formula `V(A_LD) = h²_obs · r_δ` (with `r_δ = m·h²_obs`) is mathematically valid as a Crow–Felsenstein population-level decomposition of V(A) at AM equilibrium, but it was being applied as if h²_obs equaled the twin estimate, which conflates two quantities biased in opposite directions. The corrected reading: AM is real at the population level (V(A) inflation via LD; Yengo 2018 measures 14–23% V(A_LD)/V(A) for height empirically, matching the formula prediction) but it does NOT explain the gap between twin h² and within-family h² for socially-structured traits. That gap is dominated by genetic nurture and equal-environments-assumption violations, partially OFFSET by AM's downward bias on Falconer.
- Added clarifying note in §2.2 after the method-gradient table explaining Falconer's downward bias under AM and the corrected interpretation of the twin-vs-within-family gap (genetic nurture + EEA violations dominate, AM partially offsets)
- Added clarifying note in §3.1 after the partition formula explaining that h²_obs there represents AM-equilibrium V(A)/V(P), with different estimators recovering this with different biases (SNP-based unbiased, Falconer biased downward by AM with partial upward offset from EEA / genetic nurture)
- Updated the §3.1 worked anchors with caveats acknowledging the Falconer-vs-SNP discrepancy: for EA, applying the formula to twin h² gives a different absolute V(A_LD) than applying it to SNP h²; for height the discrepancy is smaller because EEA + genetic-nurture biases on Falconer are smaller for height than for socially-structured traits
- Preserved the cross-trait AM (Border 2022) result independently — that is about between-trait LD inflating reported cross-disorder rg, a separate and well-supported phenomenon
- Did not change the dashboard logic: re-deriving the bucket numbers under the corrected interpretation would require either (a) substituting SNP-based h² for Wilson-fit twin h² as input, or (b) explicitly modeling Falconer's AM-bias correction. Both are larger changes than this corrective pass aims for. The dashboard's outputs should now be read with the §2.2 / §3.1 caveats in mind. Same for the empirical numbers cited in the prose
Pass 7 2026-04-29

internal consistency check · error check

Why On a careful re-read after pass 6 added clarifying notes to §2.2 and §3.1 about the AM downward bias on Falconer, two more places in the model still carried the old wrong-direction framing or stale numbers. (a) The §6 distortion-aware reading's row for V(A_LD) said "Inflates V(A_d) by ~10–25% in twin studies", the same wrong-direction error the reviewer caught in pass 6. (b) §7 Objection 2's response cited SNP-h² recovery numbers ("for height ~85%, for cognition ~50–70%, for EA ~30–40%") that I had already corrected in the writeup at pass 3 (cognition is actually 25–40% recovery, height is 60% common-SNP / 80% WGS, EA is 30–50%) but never propagated back to the model. Both were left over from earlier passes.
- §6 distortion-aware reading V(A_LD) row rewritten: "Inflates V(A_d) by ~10–25% in twin studies; a chunk of 'genetic' effect is structural, not biological" → "Inflates V(A) at the population level by ~10–25% via AM-induced LD between trait-relevant alleles (Yengo 2018: 14–23% for height, matching the formula prediction). Does NOT on net inflate Falconer twin h² — AM actually biases Falconer downward, partially offsetting other classical-ACE biases (see §2.2 caveat)."
- §7 Objection 2 SNP-h² recovery numbers updated to match writeup pass 3 (and primary sources): "for height ~85%, for cognition ~50–70%, for EA ~30–40%" → "for height about 60% with common SNPs alone (rising to ~80% with whole-genome sequencing that captures rare variants), for cognitive ability about 25–40%, for educational attainment about 30–50%"
Pass 8 2026-04-29

internal consistency check

Why: Passes 6-7 added clarifying notes to §2.2, §3.1, §6, and §7 about the AM-direction error, but the dashboard React component (PsychVariationModel.tsx) still displayed prose alongside the variance partition that called V(A_LD) "structural inflation from non-random mating." This is the user-facing prose every visitor sees when interacting with the model dashboard, which makes it the most visible piece of model-stage content, and it propagated the wrong-direction framing. A final cross-pipeline grep caught it.
- Dashboard prose rewritten: was "Wilson h²(t) is the *observed* (AM-equilibrium) heritability twin studies estimate. AM-LD partitions it into V(A_d) (clean direct, what within-family designs estimate) and V(A_LD) (structural inflation from non-random mating). V(A_i) is added on top as the variance contribution of genetic nurture; classical ACE without AM correction tends to leak some of V(A_i) into A..."; now "Wilson h²(t) is the AM-equilibrium population heritability V(A_AM)/V(P). The Crow-Felsenstein partition splits V(A_AM) into V(A_d) (clean direct, what within-family designs estimate) and V(A_LD) (population-level linkage among trait-relevant alleles induced by non-random mating). Note that classical twin h² (Falconer) is *biased downward* relative to V(A_AM)/V(P) by factor (1 − m_A) ... but is typically inflated upward by EEA violations and genetic-nurture leakage, with the net effect for socially-structured traits being upward overall. ... the gap between empirical twin h² and within-family h² for socially-structured traits is dominated by genetic nurture and EEA, not AM (see the model stage §2.2 caveat)."
- Build clean. The model dashboard now displays prose consistent with the §2.2 / §3.1 / §6 / §7 corrections
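
A check like the final grep can be sketched as a small scan for the superseded phrasings quoted in passes 7 and 8. This is a hypothetical helper, not the command that was actually run: the name `find_stale` and the in-memory `text_by_name` mapping are assumptions for illustration. The changelog itself quotes the old wording, so it would need to be left out of any such scan.

```python
# Superseded phrasings, copied from the "was" side of the pass 7-8 entries.
STALE_PHRASES = [
    "structural inflation from non-random mating",
    "Inflates V(A_d) by ~10–25% in twin studies",
    "for height ~85%, for cognition ~50–70%, for EA ~30–40%",
]


def find_stale(text_by_name, phrases=STALE_PHRASES):
    """Return (name, line_number, phrase) for every stale phrase found.

    text_by_name maps a label (for example a file name) to its full text.
    Matching is plain substring search, line by line, so a phrase that is
    wrapped across two lines would be missed.
    """
    hits = []
    for name, text in text_by_name.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for phrase in phrases:
                if phrase in line:
                    hits.append((name, lineno, phrase))
    return hits


# Usage with a made-up snippet standing in for the dashboard source.
sample = {"PsychVariationModel.tsx": "intro\nV(A_LD) (structural inflation from non-random mating)."}
for name, lineno, phrase in find_stale(sample):
    print(f"{name}:{lineno}: stale phrase: {phrase!r}")
```

An empty result means only that these exact strings are absent; a paraphrase of the wrong-direction framing would still need a manual read.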