Model
Generating function for human psychological variation. One equation per person; variance decomposition follows. Closed-form pieces: Crow–Felsenstein AM inflation, Wilson-Effect saturation, genetic-nurture additive split, multivariate sex-difference Mahalanobis D. Twin / SNP / within-family heritability are projections of the same decomposition. Interactive dashboard included.
TLDR
The topology answered “what depends on what?”. The formalization answers a sharper question: given a person, where does their phenotype come from in expectation? The answer is a single generating function that, once written down, dissolves several apparent paradoxes in the field — most importantly the gap between twin heritability, SNP heritability, and within-family heritability (they estimate different sums of the same underlying components, and the differences are informative).
The spine of this stage is one equation. Phenotype P for a person in a population is P = A_d + A_i + A_LD + C + E_m + E_s + I, with each term a contribution from a distinct mechanism: direct genetic effects from the person’s own transmitted alleles, indirect genetic effects from parental (and broader-family) genomes operating through the environment they create, assortative-mating-induced linkage among causal variants, residual shared environment, measured non-shared environment, stochastic developmental noise, and gene-environment interaction terms. Variance decomposition follows directly, and is block-orthogonal rather than fully orthogonal: V(P) = ΣV(component) + 2·Cov(A_d, A_i) + 2·Cov(A_d, E_m) + 2·Cov(A_d, C) + V(I). The cross-terms are the formal home of every gene-environment correlation finding in the literature; pretending they are zero is the most common modeling error. Three closed-form pieces drop out — the Crow–Felsenstein assortative-mating partition V(A_LD) = h²_obs · r_δ with r_δ = m·h²_obs (the dashboard partitions h² rather than inflating it; the equilibrium is reached in 5–10 generations of stable assortment), the Wilson-Effect logistic curve h²(t) = h²_∞ / (1 + exp(−k·(t − t_50))), and the method gradient that says twin h² ≥ SNP h² ≥ within-family h² with the gaps decomposable into AM-LD, indirect-genetic, and rare-variant pieces.
A second module handles the multivariate sex-difference algebra, because the single largest framing trap in this field is the gap between univariate Cohen’s d (typically 0.2–0.6 across personality dimensions) and the multivariate Mahalanobis distance D, defined by D² = Δμᵀ·Σ⁻¹·Δμ (D reaches 2.71 in Del Giudice 2012, which stacks 15 weakly correlated traits). The same data, two numbers, opposite-sounding stories, and both are correct. The formalization makes the bridge explicit so the reader can dial univariate d’s and inter-trait correlations and watch D move.
What this stage does not formalize: the Plomin/Turkheimer interpretation of polygenic scores (verbal disagreement, no candidate equation), the mechanism behind the Gender Equality Paradox (three live hypotheses with no shared formalism), and the magnitude of AM-correction across the full cross-disorder genetic-correlation matrix (active research, methods just emerging). These remain at the observation stage; premature math here would mask uncertainty rather than reduce it. The L4 Lewontin firewall is preserved as a structural property of the model: the entire generating function is within-population, and nothing in it licenses between-population mean inference.
[Interactive dashboard. Panels: Inputs, Variance decomposition, Method gradient, Assortative-mating partition.]
Wilson h²(t) is the AM-equilibrium population heritability V(A_AM)/V(P). The Crow-Felsenstein partition splits V(A_AM) into V(A_d) (clean direct, what within-family designs estimate) and V(A_LD) (population-level linkage among trait-relevant alleles induced by non-random mating). Note that classical twin h² (Falconer) is *biased downward* relative to V(A_AM)/V(P) by factor (1 − m_A) — AM raises DZ correlation relative to MZ correlation — but is typically inflated upward by EEA violations and genetic-nurture leakage, with the net effect for socially-structured traits being upward overall. V(A_i) is added on top as the variance contribution of genetic nurture; the gap between empirical twin h² and within-family h² for socially-structured traits is dominated by genetic nurture and EEA, not AM (see the model stage §2.2 caveat).
How to read this stage
The dashboard above is the artifact. Everything below is the spec.
In plain language: when researchers report a “heritability of 0.50,” what is being claimed is that if you took the variance in a trait across a population and asked how much of it tracks genetic differences, half of it does. It is a statement about the population’s variance, not about any single person, and not about between-population differences. It says nothing causal beyond that — high heritability is fully compatible with large environmental effects (height is ~80% heritable and has risen ~10cm in a century). Different methods estimating “heritability” answer slightly different questions: twin studies pick up the broadest definition, within-family designs the narrowest. The gap between them is informative.
The stage formalizes that picture by writing one equation per person, decomposing it into named pieces, and showing how each measurement method projects onto a different subset of the pieces. Three closed-form sub-equations follow (assortative-mating inflation, Wilson-Effect age curve, genetic-nurture split). A second module addresses the same algebra applied to group differences — most prominently sex differences, but the framework is general. The dashboard lets you turn the knobs and watch the consequences.
You can read this top-down (TLDR → equation → closed forms → boundary conditions) or bottom-up (play with the dashboard, then come back to the equations when something surprises you). Either order works. The cruxes section at the end (§12) is where the load-bearing assumptions live; if any one fails, parts of the picture have to be rebuilt.
1. Move I’m making
This stage is a decomposition + generating function + integration, in that order:
- Decomposition: orthogonalize phenotypic variance into mechanism-specific components, with explicit non-orthogonal Cov(G,E) and interaction terms as the principled exceptions.
- Generating function: write the per-person phenotype as a deterministic function of those components plus stochastic noise. The variance decomposition follows by taking V(·) of the generating function.
- Integration: show that twin, SNP, and within-family heritability estimators are projections of the same underlying decomposition onto different observable subspaces. The Wilson Effect, AM inflation, and genetic-nurture findings then read as motion of those projections, not as separate phenomena.
What’s not ready: anything in the topology marked O (open), and the polygenic-score causal-vs-summary debate, where the underlying disagreement isn’t yet a formal one.
2. The generating function
For a single person i in a population at developmental time t, sampled from a stable mating regime:
P_i(t) = A_d,i + A_i,i + A_LD,i + C_i + E_m,i + E_s,i + I_i + μ(t)
| Term | Mechanism | Source identity |
|---|---|---|
| A_d | Direct genetic: additive effect of person’s own transmitted causal alleles, evaluated as if mating were random | Σ_k β_k · g_{ik} over causal SNPs k |
| A_i | Indirect genetic (genetic nurture): additive effect of parents’ (and extended-family) genotypes operating through the rearing environment | parents’ PGS × environmental transmission coefficient |
| A_LD | Assortative-mating LD inflation: additional additive variance induced by linkage among causal variants from non-random mating | At AM equilibrium, V(A_d) + V(A_LD) = h²_obs; the partition is V(A_LD)/h²_obs = r_δ. |
| C | Shared environment residual: environmental effects shared by siblings not already captured by A_i. Adult personality: ~0. Education / religiosity / politics: nonzero | |
| E_m | Measured non-shared environment: identifiable causes (lead, schooling, head injury, peer composition, nutrition) | each enters with a measured causal coefficient, e.g. lead: β ≈ −6.2 IQ pts per 1–10 µg/dL |
| E_s | Stochastic developmental noise: unmeasured non-shared variance (developmental contingencies, immune/microbial, microscale neural variation, measurement error) | the unmodeled residual; ~50% of personality variance |
| I | Interaction terms: G×E, G×G (epistasis), G×age. As of 2025 evidence, generally small at PGS-by-environment scale; large only at extreme environmental insults | residual non-additivity |
| μ(t) | Population mean at age t: not a person-level term but the developmental trajectory the person grows through | calibrated to age-norm tables |
Why this form: this is the additive-decomposition default of quantitative genetics extended with the two corrections that the 2018–2025 literature has installed into the field — separating A_d from A_i (Kong 2018, Young 2022) and separating A_d from A_LD (Border 2022, Yengo 2018, Wainschtein 2025). Earlier formulations folded A_i into A_d and A_LD into A_d and got the wrong answer about how much of the population-level genetic signal is direct biological causation. The within-family literature is what made these terms separately estimable.
Scope note — scalar trait, not g-loaded vector: P_i(t) is written as a scalar for one trait at a time. For cognitive ability, this collapses an underlying multi-ability structure (g + specific abilities, the CHC hierarchy) into a single phenotypic measure. The collapse is faithful when reporting g-loaded composite scores (e.g., full-scale IQ), and reasonable for any single primary ability. It is not faithful when the question is “how much of A_d for cognition is g versus specific abilities” — that requires a multivariate extension where each ability gets its own decomposition and g enters as a latent common factor across them. The topology’s foundational assumption A3 (g exists as a real dimension) lives at this level: the model below operates inside a single ability/composite and inherits g as a property of which ability is being measured rather than as a structural component. For sex differences (Module B, §3.4), the multivariate extension is necessary by construction; that’s why it appears as a separate module.
2.1 Variance decomposition
Taking variance of the generating function and tracking the cross-terms:
V(P) = V(A_d) + V(A_i) + V(A_LD)
+ V(C) + V(E_m) + V(E_s)
+ 2·Cov(A_d, A_i) ← genetic nurture is correlated with direct effects (parents pass both)
+ 2·Cov(A_d, E_m) ← active rGE: people select environments matching propensities
+ 2·Cov(A_d, C) ← passive rGE residual (small once A_i is split out)
+ V(I)
The off-diagonal Cov terms are why “orthogonal decomposition” is the wrong frame for this system. The system is block-orthogonal: the additive components are roughly orthogonal to the residual environment but not to each other, and the cross-terms are the formal home of every gene-environment correlation finding in the literature. Pretending they’re zero is the single most common modeling error.
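A small simulation makes the role of the cross-terms concrete. The sketch below is illustrative only: it simulates four of the components, and the coupling coefficients (0.3 and 0.2) and noise scales are assumptions chosen for the example, not calibrated values. It checks that the variance of the summed phenotype equals the sum of component variances plus twice every pairwise covariance, and that dropping the covariances understates V(P) when the couplings are positive.

```python
import random


def variance(xs):
    """Population variance (divide by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    """Population covariance (divide by n), matching variance() above."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simulate_components(n=20_000, seed=0):
    """Draw toy per-person components; every coefficient here is assumed."""
    rng = random.Random(seed)
    cols = {"A_d": [], "A_i": [], "E_m": [], "E_s": []}
    for _ in range(n):
        a_d = rng.gauss(0.0, 1.0)
        cols["A_d"].append(a_d)
        cols["A_i"].append(0.3 * a_d + rng.gauss(0.0, 0.5))  # correlated with A_d
        cols["E_m"].append(0.2 * a_d + rng.gauss(0.0, 0.8))  # active rGE
        cols["E_s"].append(rng.gauss(0.0, 1.0))              # independent noise
    return cols


cols = simulate_components()
names = list(cols)
phenotype = [sum(person) for person in zip(*cols.values())]

v_p = variance(phenotype)
sum_var = sum(variance(cols[name]) for name in names)
sum_cov = sum(
    2 * covariance(cols[a], cols[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
)

print(f"V(P)            = {v_p:.3f}")
print(f"sum of V(.)     = {sum_var:.3f}")
print(f"sum of 2*Cov(.) = {sum_cov:.3f}")
```

The identity holds exactly for sample moments when every pair is included. The equation in the text keeps only the three cross-terms expected to be non-negligible.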
2.2 Heritability identities
Three quantities are estimable from data; each picks up a different subset of the variance terms. The mapping is more subtle than a casual reading of the literature suggests, and it is worth getting right because the public-discourse confusion about “twin studies overestimate” turns on this exact algebra.
The non-obvious point about twin h²: V(A_i) (genetic nurture) is shared identically by MZ and DZ co-twins, because they share the same parents. Under a correctly specified ACE model, this variance lands in C, not A. So a faithful classical twin model does not count genetic nurture as heritability. The empirical observation that twin h² > within-family h² (e.g., for EA: 0.40 vs ~0.15) is therefore not due to twin h² capturing A_i directly. It arises from model misspecification. The ACE assumption rDZ_A = 0.5 fails under assortative mating (true sibling additive correlation under AM is 0.5·(1+r_δ)), which on its own pushes Falconer’s estimate down rather than up. The upward leakages are equal-environments violations and genetic nurture failing to land cleanly in C, for example if parents treat MZ and DZ pairs differently (see the caveat at the end of this section).
| Estimator | What it estimates (correctly specified) | Practical leakages |
|---|---|---|
| Twin h² (classical ACE: 2·(rMZ − rDZ)) | V(A_d) + V(A_LD) | Under unmodeled AM, some V(A_i) and V(C) bleed into A. Empirically, classical twin h² for EA exceeds within-family by ~0.20–0.25. |
| SNP h² (GREML, LDSC on population GWAS) | V(A_d, common) + V(A_LD, common) + V(A_i, common)·attenuated | Population GWAS effect sizes β_pop = β_d + k·β_i (where k is the AM coupling between transmitted and non-transmitted alleles), so SNP h² is inflated by some V(A_i), but attenuated relative to the full V(A_i) because k < 1. Excludes rare variants. |
| WGS h² (Wainschtein 2025) | SNP h² + V(A_d, rare) + V(A_LD, rare) | Closes the rare-variant gap; same A_i contamination as SNP h² unless within-family. |
| Within-family h² (sib-FE, MZ-discordant, parent-offspring trios) | V(A_d) | Removes A_i and A_LD cleanly; leaves direct additive only. With WGS: V(A_d) + V(A_d, rare). |
This is the method gradient (S2 in the topology):
twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h²
The gaps are not measurement error. They are the data’s way of telling you how much of “heritability” is structural (AM-LD), how much is environmental-via-parents (A_i), and how much depends on rare variants common-variant arrays cannot tag.
For educational attainment in 2025: classical twin h² ≈ 0.40, common-variant SNP h² ≈ 0.20–0.25, WGS h² ≈ 0.30 (with rare-variant contribution), within-family additive ≈ 0.15.
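The gaps along the gradient can be read off with simple arithmetic. The sketch below uses the EA figures quoted above; because the SNP estimate is given as a range, its midpoint (0.225) is used as an assumption. The helper name `adjacent_gaps` is hypothetical.

```python
# EA estimates quoted in the text. The SNP figure is a range (0.20-0.25);
# the midpoint is an assumption made only for this illustration.
ea_estimates = {
    "twin": 0.40,
    "wgs": 0.30,
    "snp": 0.225,
    "within_family": 0.15,
}
order = ["twin", "wgs", "snp", "within_family"]


def adjacent_gaps(estimates, order):
    """Difference between each estimator and the next one down the gradient."""
    return {
        f"{upper} - {lower}": estimates[upper] - estimates[lower]
        for upper, lower in zip(order, order[1:])
    }


gaps = adjacent_gaps(ea_estimates, order)
total_gap = ea_estimates["twin"] - ea_estimates["within_family"]

for label, gap in gaps.items():
    print(f"{label:>22}: {gap:.3f}")
print(f"{'twin - within_family':>22}: {total_gap:.3f}")
```

The adjacent gaps sum to the full twin-minus-within-family gap by construction. Which mechanism each gap reflects is an interpretive step the arithmetic does not settle.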
Important caveat on Falconer’s AM bias (added in pass 6 after a reviewer correction). The “What it estimates” column above is exact only under random mating. Under positive AM, Falconer’s formula 2·(rMZ − rDZ) is biased downward by factor (1 − m_A) where m_A ≈ m·h² — fraternal twins share more than 50% of trait-relevant alleles because their parents are genetically more similar than chance, raising rDZ relative to rMZ and shrinking the formula’s output. So Falconer estimates [V(A_d) + V(A_LD)] · (1 − m_A) / V(P), not V(A_AM)/V(P) directly.
This matters for interpreting the gap. When classical twin h² > within-family h² for socially-structured traits (EA: 0.40 vs 0.15), the gap is dominated by other classical-ACE biases — primarily the equal-environments assumption (MZ co-twins are treated more similarly than DZ co-twins, inflating MZ correlation) and genetic-nurture leakage (V(A_i) leaks into A under model misspecification rather than landing cleanly in C) — partially offset by the AM downward bias. AM is a real phenomenon at the population level (Crow-Felsenstein V(A_LD) inflation; see §3.1 below) but it does not, on net, drive the twin-vs-within-family gap. The dominant inflation source is genetic nurture and EEA, with direct empirical anchors in Kong 2018 (non-transmitted PGS effect = 29.9% of transmitted for EA) and Okbay 2022 EA4 (within-family direct ~50% of population PGI). Within-family designs control for AM, EEA, and genetic nurture simultaneously.
This is the single calculation a careful reader of “twin studies vs molecular studies” headlines should be able to do. The numbers don’t disagree; they answer different questions.
3. Closed-form pieces
Three components admit clean equations. The rest are calibrated empirically.
3.1 Assortative-mating inflation (Crow–Felsenstein)
There are two ways to use the AM-inflation formula, and they answer different questions.
Forward problem (rarely the relevant one): given the random-mating heritability h²_rm of a trait, what is the equilibrium heritability after stable AM? The answer is a fixed-point coupling r_δ = m · h²*, V_A* = V_A / (1 − r_δ), h²* = V_A* / (V_A* + V_E), reached in ~5–10 generations of stable assortment (Crow & Felsenstein 1968). One-iteration approximation: r_δ ≈ m · h²_rm, inflation ≈ 1 / (1 − m·h²_rm).
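The forward fixed point can be sketched numerically as follows. This is a minimal illustration under stated assumptions: phenotypic variance is normalized to 1 under random mating, so V_A = h²_rm and V_E = 1 − h²_rm, and the inputs h²_rm = 0.5 and m = 0.4 are arbitrary example values. The loop iterations are solver steps toward the fixed point, not a model of generations.

```python
def am_equilibrium(h2_rm, m, tol=1e-12, max_iter=10_000):
    """Solve r_delta = m*h2*, V_A* = V_A/(1 - r_delta), h2* = V_A*/(V_A* + V_E)."""
    v_a = h2_rm          # random-mating additive variance (V(P) normalized to 1)
    v_e = 1.0 - h2_rm    # everything else, held fixed
    h2 = h2_rm           # starting guess
    for _ in range(max_iter):
        r_delta = m * h2
        v_a_star = v_a / (1.0 - r_delta)
        h2_next = v_a_star / (v_a_star + v_e)
        converged = abs(h2_next - h2) < tol
        h2 = h2_next
        if converged:
            break
    r_delta = m * h2
    return h2, r_delta, v_a / (1.0 - r_delta)


h2_eq, r_delta_eq, v_a_eq = am_equilibrium(h2_rm=0.5, m=0.4)
one_step_inflation = 1.0 / (1.0 - 0.4 * 0.5)   # one-iteration approximation
equilibrium_inflation = v_a_eq / 0.5

print(f"h2* = {h2_eq:.4f}, r_delta = {r_delta_eq:.4f}")
print(f"inflation: one-step {one_step_inflation:.3f}, equilibrium {equilibrium_inflation:.3f}")
```

For these inputs the one-iteration approximation slightly understates the equilibrium inflation, because it evaluates r_δ at the smaller random-mating heritability.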
Inverse / partition problem (what the dashboard does): given the AM-equilibrium population additive variance V(A_AM)/V(P) = h²_obs, partition it into the random-mating-equivalent direct component V(A_d) and the AM-induced LD inflation V(A_LD):
r_δ ≈ m · h²_obs
V(A_d) = h²_obs / (1 + r_δ + r_δ² + …) = h²_obs · (1 − r_δ)
V(A_LD) = h²_obs − V(A_d) = h²_obs · r_δ
The Wilson curve gives h²_obs(t) directly, so the partition uses r_δ = m · h²_obs(t) with no iteration needed. This is what the dashboard implements. Pass-2 versions of the dashboard erroneously inflated h²_obs on top of itself, pushing twin h² above 1.0 at high parameter values; pass-4 corrected this.
Note on what h²_obs should represent here (added in pass 6 after a reviewer correction). The partition formula is a clean population-level decomposition of V(A_AM). Different estimators recover V(A_AM)/V(P) with different biases: SNP-based heritability (GREML / LDSC on unrelated individuals) recovers it approximately unbiased; classical twin h² (Falconer) recovers V(A_AM)/V(P) · (1 − m_A) — biased downward by AM, partially offset upward by EEA violations and genetic-nurture leakage. The dashboard’s Wilson-fit twin estimates conflate these biases. The partition formula’s empirical validation is the match against SNP-based AM-LD estimates: Yengo 2018 measures V(A_LD)/V(A) for height at 14–23% empirically, matching the formula’s prediction of m·h² = 20%. For socially-structured traits where Falconer twin h² is itself substantially inflated by EEA + genetic nurture, applying the partition formula to twin h² over-attributes the partition share to AM relative to its true population-level magnitude.
Worked anchors:
- Educational attainment with m ≈ 0.4, h²_obs ≈ 0.40 (twin) → r_δ ≈ 0.16, V(A_d) ≈ 0.34, V(A_LD) ≈ 0.06. Caveat: the h²_obs ≈ 0.40 here is the Falconer twin estimate, which is a biased proxy for the AM-equilibrium V(A)/V(P); applying the partition to the SNP-based estimate (~0.13) would give a smaller absolute V(A_LD).
- Height with m ≈ 0.25, h²_obs ≈ 0.85 → r_δ ≈ 0.21, V(A_d) ≈ 0.67, V(A_LD) ≈ 0.18, which matches the 14–23% empirical “AM-inflated” share Border et al. and Yengo et al. report (this is the trait where the partition’s empirical validation is cleanest, because Falconer-bias vs SNP-h² discrepancies are smaller for height than for socially-structured traits).
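The two worked anchors can be reproduced with a few lines. This is a minimal sketch of the partition formula as stated above, not the dashboard’s actual source; the function name `am_partition` is hypothetical.

```python
def am_partition(h2_obs, m):
    """Split AM-equilibrium h2_obs into V(A_d) and V(A_LD) using r_delta = m * h2_obs."""
    r_delta = m * h2_obs
    v_a_d = h2_obs * (1.0 - r_delta)
    v_a_ld = h2_obs * r_delta
    return r_delta, v_a_d, v_a_ld


anchors = {
    "educational attainment (twin)": (0.40, 0.40),  # (h2_obs, m)
    "height": (0.85, 0.25),
}

results = {name: am_partition(h2, m) for name, (h2, m) in anchors.items()}
for name, (r_delta, v_a_d, v_a_ld) in results.items():
    print(f"{name}: r_delta={r_delta:.3f}, V(A_d)={v_a_d:.3f}, V(A_LD)={v_a_ld:.3f}")
```

Because V(A_d) + V(A_LD) = h²_obs for any inputs, the partition cannot push the additive total above the observed heritability.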
Cross-trait AM (m_xy ≠ 0) extends the same logic to off-diagonal entries of the genetic-covariance matrix and is the formal reason E7 finds R² = 0.74 between phenotypic-cross-mate correlations and genetic-correlation estimates. The cross-trait AM result (Border 2022) survives independently of the within-trait Falconer-bias issue: it’s about between-trait LD inflating reported genetic correlations between disorders, which is empirically validated and not in dispute.
3.2 Wilson-Effect saturation curve
Heritability of cognitive ability rises with age because active rGE (G1) compounds: as children gain agency, they select environments matching their genetic propensities, amplifying genetic variance and shrinking shared environment. The empirical age curve from Bouchard 2013 and Briley & Tucker-Drob 2013 is sigmoidal — slow rise in early childhood, fastest gain in late childhood / early adolescence, saturation in late adolescence. A logistic gives a clean three-parameter fit:
h²(t) = h²_∞ / (1 + exp(−k_h · (t − t_50)))
With h²_∞ ≈ 0.80, t_50 ≈ 9 years (age at half-asymptote), and k_h ≈ 0.30/year: h²(5) ≈ 0.19, h²(10) ≈ 0.46, h²(15) ≈ 0.69, h²(25) ≈ 0.79. These match Bouchard’s anchors within ~3 percentage points across the full developmental range.
(Earlier passes used a saturating-exponential h²_∞ − (h²_∞ − h²_0)·exp(−k·t), which rises too fast at the young end — it produced h²(5) ≈ 0.52 for cognition vs the empirical ~0.20. The logistic form is the smallest functional change that fits the empirical sigmoidal pattern.)
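The logistic curve and its quoted anchor values can be checked directly. The sketch below uses the parameter values given above as defaults; the function name `wilson_h2` is hypothetical.

```python
import math


def wilson_h2(t, h2_inf=0.80, k_h=0.30, t50=9.0):
    """Logistic Wilson-Effect curve: h2(t) = h2_inf / (1 + exp(-k_h * (t - t50)))."""
    return h2_inf / (1.0 + math.exp(-k_h * (t - t50)))


ages = [5, 10, 15, 25]
curve = {age: wilson_h2(age) for age in ages}
for age, h2 in curve.items():
    print(f"h2({age:>2}) = {h2:.3f}")
```

At t = t_50 the curve sits at exactly half the asymptote, which is what makes t_50 directly interpretable as a calibration knob.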
The shared-environment trace runs an inverse path with a non-zero asymptote, since shared environment for cognition does not actually drop to zero in adulthood (~0.05 plateau is well-attested):
c²(t) = c²_∞ + (c²_0 − c²_∞) · exp(−k_c · t)
Cognition: c²_0 ≈ 0.50, c²_∞ ≈ 0.05, k_c ≈ 0.15/year. For Big Five personality, c²_∞ ≈ 0 is appropriate (shared family environment effectively vanishes for personality by adulthood). For educational attainment and religiosity, c²_∞ ≈ 0.10–0.15 should be substituted — these are exception traits where shared environment persists throughout life.
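The shared-environment trace is equally short to sketch. The defaults below are the cognition values from the text; the Big Five call sets c²_∞ = 0 as suggested above but reuses the cognition c²_0 and k_c, which is an assumption made only to show the effect of the asymptote.

```python
import math


def shared_env_c2(t, c2_0=0.50, c2_inf=0.05, k_c=0.15):
    """Exponential decay toward a non-zero asymptote: c2_inf + (c2_0 - c2_inf) * exp(-k_c * t)."""
    return c2_inf + (c2_0 - c2_inf) * math.exp(-k_c * t)


for age in (0, 5, 15, 25, 60):
    cognition = shared_env_c2(age)
    # Assumed: same c2_0 and k_c as cognition, only the asymptote changed.
    personality_like = shared_env_c2(age, c2_inf=0.0)
    print(f"age {age:>2}: c2 cognition = {cognition:.3f}, c2 with zero asymptote = {personality_like:.3f}")
```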
Both formulas are phenomenological — the parameters are not derived from a deeper model. They are calibration knobs for the dashboard.
3.3 Genetic-nurture decomposition (additive form)
Define g_T as the offspring’s transmitted-allele PGS and g_NT as the parental non-transmitted-allele PGS. Then:
A_d = β_d · g_T
A_i = β_i · g_NT
Empirically (Kong 2018, Wang 2021, Okbay 2022, Howe 2022):
β_i / β_d ≈ 0.3 – 0.5 (educational attainment)
β_i / β_d ≈ 0.0 – 0.1 (height, BMI)
β_i / β_d ≈ 0.4 – 0.6 (cognitive performance)
β-level vs variance-level. The ratio β_i/β_d quoted above is at the regression-coefficient level. Translating to a variance contribution requires squaring (for the pure variance term) and an explicit cross-term:
V(A_i) = β_i² · V(g) = (β_i/β_d)² · V(A_d)
2·Cov(A_d, A_i) = 2·k · β_d · β_i · V(g) = 2·k · (β_i/β_d) · V(A_d)
Where k is the AM-induced correlation between an offspring’s transmitted-allele PGS and the parental non-transmitted-allele PGS. Under random mating k ≈ 0 (Mendelian segregation makes them independent). Under stable AM, k > 0 because spousal phenotypic correlation creates correlation between mom-transmitted alleles and dad-non-transmitted alleles (and vice versa); for AM-strong traits (EA, height) k is empirically in the 0.1–0.5 range, depending on the strength and stability of assortment.
This means 2·Cov(A_d, A_i) is not generally larger than V(A_i). For EA with β_i/β_d ≈ 0.4, V(A_d) ≈ 0.15, and a moderate k ≈ 0.2: V(A_i) ≈ 0.024, 2·Cov ≈ 2·0.2·0.4·0.15 = 0.024. Total genetic-nurture variance contribution ≈ 0.048 — modest, on the same order as V(A_i) itself.
The dashboard displays only the pure V(A_i) = (β_i/β_d)² · V(A_d) slice as a clean variance bucket. The cross-term 2·Cov(A_d, A_i) is the leakage path that makes empirical twin h² (under unmodeled AM) exceed the dashboard’s “Twin h² (ACE)” output. It is acknowledged in the help text rather than allocated to a separate bar segment, partly because k is poorly constrained empirically and partly because adding a cross-term slice would over-clutter the visualization without changing the qualitative picture.
The relation V(A_i) + 2·Cov(A_d, A_i) ≈ V_PGS,population − V_PGS,within-family is approximate but useful: it turns “missing heritability after within-family correction” from a puzzle into an order-of-magnitude measurement. The left-hand side expands exactly to (β_i/β_d) · V(A_d) · (2k + (β_i/β_d)), which depends on k and shrinks toward the pure V(A_i) term (β_i/β_d)² · V(A_d) when AM is weak.
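The β-level to variance-level translation can be sketched as below, using the EA example values from the text (β_i/β_d ≈ 0.4, V(A_d) ≈ 0.15, k ≈ 0.2). The function name is hypothetical.

```python
def genetic_nurture_variance(beta_ratio, v_a_d, k):
    """Return (V(A_i), 2*Cov(A_d, A_i), total) from beta_i/beta_d, V(A_d), and coupling k."""
    v_a_i = beta_ratio ** 2 * v_a_d
    cross_term = 2.0 * k * beta_ratio * v_a_d
    return v_a_i, cross_term, v_a_i + cross_term


v_a_i, cross_term, total = genetic_nurture_variance(beta_ratio=0.4, v_a_d=0.15, k=0.2)
print(f"V(A_i) = {v_a_i:.3f}, 2*Cov = {cross_term:.3f}, total = {total:.3f}")

# Under random mating (k = 0) only the squared term survives.
_, cross_rm, total_rm = genetic_nurture_variance(beta_ratio=0.4, v_a_d=0.15, k=0.0)
```

The cross-term is linear in β_i/β_d while the pure term is quadratic, so their relative size depends on whether 2k is larger or smaller than β_i/β_d.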
3.4 Multivariate sex-difference algebra (Module B)
For a trait vector x with covariance matrix Σ and group means μ_F, μ_M, the multivariate effect size is the Mahalanobis distance:
D² = (μ_F − μ_M)ᵀ · Σ⁻¹ · (μ_F − μ_M)
For uncorrelated traits with equal univariate effect sizes |d|, D² = n·d² so D = d·√n. For correlated traits, the inverse covariance structure either amplifies or shrinks D depending on whether sex-difference vectors are aligned with high-variance or low-variance directions of Σ.
Worked example. Take 15 personality dimensions (16PF), univariate |d| ≈ 0.5 on average, with positive inter-trait correlations averaging ρ ≈ 0.20. Then approximately:
D² ≈ d² · 1ᵀ · Σ⁻¹ · 1
≈ d² · n / (1 + (n − 1)·ρ̄) if Σ has a constant-correlation structure
≈ 0.25 · 15 / (1 + 14·0.20)
≈ 0.25 · 3.95
≈ 0.99
D ≈ 1.0
The equicorrelated approximation undershoots Del Giudice 2012’s reported D = 2.71, and this gap is informative rather than a bug. To recover 2.71 in the equicorrelated form would require an average univariate |d| ≈ 1.4 (2.71 / √3.95 ≈ 1.36), far above what 16PF or NEO papers report at the observed level. What Del Giudice actually did was use multigroup latent-variable modeling with measurement-error disattenuation: he corrected each factor’s d for unreliability and then computed D on the latent (true-score) means. Disattenuation magnifies effect sizes when reliability is well below 1.0, and aggregating across 15 factors then compounds the magnification. The honest summary is: at the level of observed (raw, pre-disattenuation) measurement, multivariate sex-difference D for personality is ~1.0–1.5; Del Giudice’s 2.71 is the latent-true-score analogue.
The intuition behind the algebra still holds: if men and women differ on dimensions that are weakly correlated with each other, every dimension contributes independent information, and D grows with √n. If they differ on highly correlated dimensions, the differences carry redundant information and D plateaus. But the gap between observed and disattenuated D is itself a substantive piece of the field’s debate — and worth flagging rather than papering over.
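The equicorrelated worked example can be cross-checked against a direct matrix computation. The sketch below is pure Python with a small Gaussian-elimination solver so that it stays self-contained. It assumes standardized traits, so Σ is a correlation matrix and Δμ is the vector of univariate d values. All function names are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis_d2(delta, sigma):
    """D^2 = delta^T * sigma^-1 * delta, computed via a linear solve."""
    weights = solve(sigma, delta)
    return sum(d * w for d, w in zip(delta, weights))


def equicorrelated(n, rho):
    """Correlation matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


n, d, rho = 15, 0.5, 0.20
d2_matrix = mahalanobis_d2([d] * n, equicorrelated(n, rho))
d2_closed = d * d * n / (1 + (n - 1) * rho)
d2_uncorrelated = mahalanobis_d2([d] * n, equicorrelated(n, 0.0))

print(f"D^2 (matrix solve)  = {d2_matrix:.4f}")
print(f"D^2 (closed form)   = {d2_closed:.4f}")
print(f"D^2 (rho = 0)       = {d2_uncorrelated:.4f}")
```

Replacing the constant-correlation matrix with an empirical Σ and non-uniform d values is the natural next step; the closed form no longer applies there, but the linear solve does.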
Why this matters for distortions. D3 (the “gender similarities” framing) cites univariate d ≈ 0.05 for math performance and reads it as evidence of broad similarity. D4 (pop-evpsych framing) cites multivariate D ≈ 2.71 and reads it as evidence of broad difference. Both citations are correct. The bridge equation shows that they are about different objects: a single dimension vs. a 15-dimensional space. Anyone who hasn’t internalized this algebra can be silently captured by either framing.
The algebra is general — not just for sex. D² = (μ_A − μ_B)ᵀ · Σ⁻¹ · (μ_A − μ_B) applies to any two-group comparison: sex, age cohort, occupational sample, clinical vs. control, urban vs. rural — anywhere group means are reported on a multivariate panel. The module is presented in sex-difference language because that is where the framing trap concentrates, but readers thinking about other group comparisons can use the same dashboard. The L4 firewall (§5.2) does not block this generalization at the within-population level; it only blocks the leap from within-population variance/distance estimates to between-population causal claims. A descriptive D between two samples is fine; a causal interpretation of that D requires assumptions the model does not provide.
3.5 PGS portability decay (deferred)
Topology Variant C: accuracy(distance) calibration from Ding et al. 2023 (r = −0.95 between genetic distance and PGS R² across 84 traits) is a clean candidate for closed-form. Deferred to a future tool because it sits at the population-genetics boundary rather than the within-population generative process this stage formalizes. Listed as a follow-up.
4. Composing the parts: anchors the dashboard preserves
The dashboard above stitches §3.1, §3.2, and §3.3 into one panel — sliders for trait class, age, m, β_i/β_d, and rare-variant share; outputs the variance decomposition and the three method-specific h² numbers. Four sanity-check anchors hold under the calibrated defaults:
- IQ at age 5 (cognitive): h²(5) ≈ 0.18, V(C) ≈ 0.26, V(A_d) ≈ 0.17. Matches Bouchard 2013.
- IQ at age 25 (cognitive, m=0.4, β_i/β_d=0.4): h²(25) ≈ 0.79, V(A_d) ≈ 0.54, V(A_LD) ≈ 0.25, V(A_i) ≈ 0.09, V(C) ≈ 0.06. Within-family h² (= V(A_d)) ≈ 0.54, more than three times the often-quoted EA within-family figure of 0.15, because cognition is a higher-h² trait than education.
- Big Five across adulthood: h² ≈ 0.45, V(C) ≈ 0, V(E) ≈ 0.55. Effectively flat from age 5 onward.
- Variance budget closes: V(A_d) + V(A_LD) + V(A_i) + V(C) + V(E) = 1.0 by construction. Twin h² never exceeds the Wilson asymptote.
These are the calibration targets. The biggest non-obvious one is anchor 4: the previous dashboard pass had the variance budget overflow under default parameters (twin h² > 1.0 at age 25 with m=0.4), which was a real bug. The current partition h²_obs = V(A_d) + V(A_LD) keeps the budget bounded by construction.
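The composition of §3.1, §3.2, and §3.3 can be sketched end to end. This is a reconstruction from the formulas in this stage, not the dashboard’s actual source. In particular, treating V(E) as the residual that closes the budget is an assumption, as are the function names. Defaults are the cognitive-class values quoted above.

```python
import math


def wilson_h2(t, h2_inf=0.80, k_h=0.30, t50=9.0):
    """Logistic heritability curve from the Wilson-Effect section."""
    return h2_inf / (1.0 + math.exp(-k_h * (t - t50)))


def shared_env_c2(t, c2_0=0.50, c2_inf=0.05, k_c=0.15):
    """Shared-environment decay toward a non-zero asymptote."""
    return c2_inf + (c2_0 - c2_inf) * math.exp(-k_c * t)


def variance_budget(age, m=0.4, beta_ratio=0.4):
    """Compose the closed-form pieces into one variance budget for a given age."""
    h2_obs = wilson_h2(age)
    r_delta = m * h2_obs                   # AM partition coefficient
    v_a_d = h2_obs * (1.0 - r_delta)       # direct additive
    v_a_ld = h2_obs * r_delta              # AM-induced LD
    v_a_i = beta_ratio ** 2 * v_a_d        # pure genetic-nurture slice
    v_c = shared_env_c2(age)
    v_e = 1.0 - (v_a_d + v_a_ld + v_a_i + v_c)  # residual (assumed)
    return {
        "h2_obs": h2_obs,
        "V(A_d)": v_a_d,
        "V(A_LD)": v_a_ld,
        "V(A_i)": v_a_i,
        "V(C)": v_c,
        "V(E)": v_e,
    }


budget_5 = variance_budget(5)
budget_25 = variance_budget(25)
for label, budget in (("age 5", budget_5), ("age 25", budget_25)):
    summary = ", ".join(f"{key}={value:.2f}" for key, value in budget.items())
    print(f"{label}: {summary}")
```

Because V(E) is defined as the remainder, the budget closes at 1.0 for any slider setting. Whether V(E) stays non-negative at extreme parameter values is a separate check that this sketch does not enforce.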
For traits the dashboard does not have a dedicated class for (educational attainment, height, religiosity, political affiliation), the user can approximate by choosing the closest class and adjusting sliders. EA-like behavior emerges from cognitive with m=0.4 and a mental note that h²(25) for EA is closer to 0.40 than 0.79 — i.e., the dashboard’s cognitive class is calibrated to IQ, not EA.
5. Boundary conditions and where the model breaks
The generating function is correct only inside its scope. Five boundaries are explicit:
- Severe psychiatric tail. The hyperpolygenic A_d = Σ β_k g_{ik} form assumes thousands of small effects. For early-onset autism with intellectual disability, single rare variants (CHD8, SCN2A) can carry effects of d > 1.0. The decomposition still works component-by-component but A_d becomes dominated by a small number of large-effect alleles, effectively Mendelian rather than polygenic. The model should either widen its prior on individual β_k or hand off to a separate Mendelian module at the tail.
- Between-population mean differences (L4 firewall). Every term in the generating function is defined within a population at a stable mating regime. The model is structurally silent on between-population means: there is no μ_pop term to compare. Computing D² = (μ_pop1 − μ_pop2)ᵀ Σ⁻¹ (μ_pop1 − μ_pop2) is mathematically possible but requires assuming Σ_pop1 = Σ_pop2 and equal causal architecture across populations, neither of which is empirically supported (Ding 2023’s PGS-portability collapse is the empirical evidence that the assumption fails). This is the L4 / Lewontin firewall encoded directly into model scope.
- Severe environmental insults. V(I) (interaction) is small at PGS-by-environment scale but large when environments cross threshold (lead, alcohol, severe deprivation, iodine). The additive decomposition under-fits at thresholds. Use the model in the normal range; switch to an explicit threshold-effect model at the extreme.
- Non-equilibrium AM. The Crow–Felsenstein formula assumes AM has reached equilibrium. For populations under rapidly changing assortment regimes (e.g. rapid shifts in educational stratification), the inflation factor is en route to the equilibrium value, not at it. Use the formula as an upper bound under those conditions.
- Individual-level inference (L1). V(A_d) is a population variance. For a single person, A_d is a realization, not a partition. Statements like “70% of this individual’s intelligence is genetic” do not type-check against the model. The dashboard exposes population variance only.
6. Distortion-aware reading
Each component of the decomposition has a public-discourse failure mode. The model’s job is to make the failure visible, not to suppress it.
| Component | Common misreading | What the model says |
|---|---|---|
| V(A_d) (high) | “Genes determine outcomes” | Population variance. Says nothing about a specific person’s prospects. |
| V(A_i) (large) | “Family environment doesn’t matter” | The opposite: this term is family environment, mediated by parental genotypes that correlate with parental phenotypes. |
| V(A_LD) | Usually invisible to public discourse | Inflates V(A) at the population level by ~10–25% via AM-induced LD between trait-relevant alleles (Yengo 2018: 14–23% for height, matching the formula prediction). Does NOT on net inflate Falconer twin h²: AM actually biases Falconer downward, partially offsetting other classical-ACE biases (see §2.2 caveat). |
| Cov(A_d, E_m) (active rGE) | “People shape their environments” → therefore environments don’t matter | They matter: the covariance term is their effect, just non-orthogonal to genes. |
| Twin h² ≥ within-family h² | “Twin studies overestimate” | They estimate a different quantity (population additive variance vs. direct effect). Both are real. |
| Multivariate D large | “Sexes are categorically different” | D is a distribution distance; individuals across the distributions still overlap substantially. Dimensional, not taxonic. |
| Univariate d small | “Sexes are essentially the same” | True for the dimension cited, false in the multivariate space. |
D1 and D2 (the two heaviest distortions) both operate by selecting a subset of these readings. The model doesn’t resolve the political dispute, but anyone running the dashboard should be able to see why each side is technically correct about the term they’re highlighting and incomplete about the rest.
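The ~10–25% figure in the V(A_LD) row follows from the partition relations stated elsewhere in this document (r_δ = m·h²_obs and V(A_LD) = h²_obs·r_δ, with V_A* = V_A / (1 − r_δ)). A minimal Python sketch of that arithmetic, assuming AM equilibrium; the function name `am_partition`, the dictionary keys, and the input values are illustrative assumptions, not the dashboard's code or published estimates:

```python
def am_partition(h2_obs: float, m: float) -> dict:
    """Partition an observed (equilibrium) h² into a random-mating piece
    and an AM-induced LD piece.

    Relations taken from the document:
      r_delta = m * h2_obs        cross-spouse correlation in additive value
      V(A_LD) = h2_obs * r_delta  share of observed h² attributable to AM-LD
      V_A*    = V_A / (1 - r_delta)
    """
    if not (0.0 <= h2_obs <= 1.0):
        raise ValueError("h2_obs must lie in [0, 1]")
    if not (0.0 <= m < 1.0):
        raise ValueError("m must lie in [0, 1)")

    r_delta = m * h2_obs
    v_a_ld = h2_obs * r_delta
    # Remainder of the observed h²; the glossary identifies this with V(A_d).
    v_a_random_mating = h2_obs - v_a_ld
    # Proportional inflation of V_A* over the random-mating V_A:
    # V_A*/V_A - 1 = r_delta / (1 - r_delta).
    inflation = r_delta / (1.0 - r_delta)
    return {
        "r_delta": r_delta,
        "v_a_ld": v_a_ld,
        "v_a_random_mating": v_a_random_mating,
        "inflation": inflation,
    }


# Illustrative inputs only (hypothetical values for a highly heritable trait).
result = am_partition(h2_obs=0.8, m=0.23)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

Because the function partitions h²_obs rather than multiplying it, the two variance pieces always sum back to the observed value, which is the behavior the document attributes to the dashboard.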
7. Adversarial + steelman
Four objections to the formalization itself. The strongest version of each, then the model’s honest response.
Objection 1 — Variance bookkeeping is not a causal model
The decomposition partitions variance into named components, but it never specifies why A_d produces phenotype P rather than the reverse. A regression coefficient β_d from a within-family GWAS is not a causal effect; it is a statistical association under specific design assumptions. Calling the decomposition a “generating function” is false advertising — it generates expected variance given parameters, not actual phenotype given a causal mechanism.
Steelman: This is the strongest objection because it is the same disagreement that drives O1 (Plomin vs Turkheimer). The model accommodates both readings rather than picking one: under the Plomin reading, β_d is a causal coefficient and the decomposition is generative in the strong sense; under the Turkheimer reading, β_d is a regression coefficient that happens to be unbiased under within-family identification, but the underlying biology is unspecified. Both readings predict the same variance budget, which is why the data hasn’t yet decided between them.
Response: Acknowledged. The model is more accurately described as a conditional variance generating function — given parameters, it generates the expected variance pattern. The causal interpretation of those parameters is exactly what’s contested, and the decomposition’s value is precisely that it lets both interpretations be expressed in a shared language. Stage 4 (data) is where the disagreement gets sharper: the test is whether β_d, within-family moves under environmental intervention. Plomin predicts no, Turkheimer predicts yes, and the model can express both predictions cleanly.
Objection 2 — ACE assumptions are unrealistic enough that “twin h²” is not really estimating anything physical
EEA fails (MZ co-twins are treated more similarly than DZ); shared-environment effects vary by zygosity; non-shared environment for siblings is correlated with shared parental treatment. Stack the violations and the entire ACE framework is just a reparameterization of the data, not an estimate of underlying components.
Steelman: Joseph and Richardson’s critique of behavior genetics rests partly on this argument: the assumptions that make twin h² meaningful are violated enough to make the resulting numbers epistemically empty. The strongest version isn’t that twin studies are “wrong” but that they’re under-determined — multiple causal worlds produce the same rMZ and rDZ patterns.
Response: Partially conceded. Classical ACE is under-determined and the assumption-violation issue is real. However, two empirical findings constrain the under-determination: (a) SNP-based heritability (which uses unrelated individuals and bypasses EEA entirely) recovers a substantial fraction of twin h² across major traits — for height about 60% with common SNPs alone (rising to ~80% with whole-genome sequencing that captures rare variants), for cognitive ability about 25–40%, for educational attainment about 30–50%. The fraction varies by trait but is consistently non-trivial — the EEA-bias-only explanation for twin h² is empirically untenable; (b) MZ-reared-apart studies (Bouchard 1990 and updates) reproduce the basic Wilson Effect pattern with EEA structurally absent. The model takes twin h² as an upper bound on direct + indirect additive variance, not as a precise estimate. The method gradient is what makes the imprecision survivable: comparing twin to within-family bounds the gap.
Objection 3 — Additive form misses dominance and epistasis
Dominance variance V_D and epistatic variance V_I (gene-gene interactions) are real and measurable. Twin studies fitting ADE models routinely find non-trivial V_D. The additive-only generating function is a simplification that loses information.
Steelman: For some traits — height (V_D ≈ 0), educational attainment (V_D ≈ 0–0.05) — dominance is small and the additive simplification is fine. For others — psychiatric disorders, where ADE models often outperform ACE — non-additive variance is potentially substantial. The additive-only model is not “wrong” so much as inappropriate for that subset of traits.
Response: Concede the scope limit. The generating function as written is for traits where polygenic-additive architecture dominates (which is most psychological traits, per Hill, Goddard & Visscher 2008’s argument that even where dominance exists, additive variance often captures most of the genetic variance because of allele-frequency distributions). For severe psychopathology and other traits with substantial V_D, an extended model would replace A_d with A_d + D_d and add Cov(A_d, D_d) cross-terms. The dashboard does not currently expose this; the prose acknowledges the boundary in §5.
Objection 4 — Multivariate D conflates measurement structure with reality
Mahalanobis D depends on Σ. Σ is the within-sex covariance of measured traits, which depends on which traits you measure, how you measure them, and how the population varies. Different measurement panels produce different Ds for the same underlying difference. The disattenuated D = 2.71 from Del Giudice is not a property of human nature; it’s a property of the 16PF + the U.S. sample + the latent-variable model.
Steelman: This is correct and underappreciated. D is not a population parameter in the way that μ_F − μ_M is. It is a model-relative summary statistic. Two researchers using different but equally defensible measurement panels can produce D values that differ by a factor of two or more.
Response: Conceded fully. The multivariate-D module’s value is comparative, not absolute. Given a measurement structure, it tells you how multivariate aggregation magnifies the apparent sex difference relative to any single dimension. The module’s pedagogical purpose is to show why the same data (a panel of dimensions) supports both “small per-dimension differences” and “large multivariate distance”: neither claim is wrong, but neither is the whole answer. The dashboard surfaces the dependency on Σ via the ρ̄ slider so users can see how D moves under different correlation structures.
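The dependence of D on Σ can be made concrete with the equicorrelated approximation from the glossary, D² = d²·n / (1 + (n − 1)·ρ̄). The sketch below is a hedged illustration in plain Python: it evaluates the closed form and cross-checks it against the general definition √(Δμᵀ Σ⁻¹ Δμ) using a small Gaussian-elimination solver. The function names and the example values of d, n, and ρ̄ are assumptions for illustration, not the dashboard's implementation or any published panel.

```python
import math
from typing import List, Sequence


def equicorrelated_d(d: float, n: int, rho_bar: float) -> float:
    """Closed-form D when every trait has the same univariate d and every
    pair of traits has the same correlation rho_bar."""
    denom = 1.0 + (n - 1) * rho_bar
    if denom <= 0.0:
        raise ValueError("rho_bar too negative: Sigma is not positive definite")
    return math.sqrt(d * d * n / denom)


def mahalanobis_d(delta_mu: Sequence[float], sigma: Sequence[Sequence[float]]) -> float:
    """General D = sqrt(delta_mu^T Sigma^{-1} delta_mu).

    Solves Sigma x = delta_mu by Gaussian elimination with partial pivoting,
    then returns sqrt(delta_mu . x).
    """
    n = len(delta_mu)
    # Augmented matrix [Sigma | delta_mu], copied so inputs are not mutated.
    a: List[List[float]] = [
        [float(v) for v in row] + [float(b)] for row, b in zip(sigma, delta_mu)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("Sigma is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return math.sqrt(sum(dm * xi for dm, xi in zip(delta_mu, x)))


def equicorrelated_sigma(n: int, rho_bar: float) -> List[List[float]]:
    """Unit-variance covariance matrix with constant off-diagonal rho_bar."""
    return [[1.0 if i == j else rho_bar for j in range(n)] for i in range(n)]


# Hypothetical panel: 10 traits, each with a small univariate d of 0.3.
n_traits, d_each = 10, 0.3
for rho in (0.0, 0.2, 0.5):
    closed = equicorrelated_d(d_each, n_traits, rho)
    general = mahalanobis_d([d_each] * n_traits, equicorrelated_sigma(n_traits, rho))
    print(f"rho_bar={rho:.1f}  closed-form D={closed:.3f}  general D={general:.3f}")
```

Holding the per-trait d fixed and moving only ρ̄ changes D, which is the sense in which D is a model-relative summary rather than a population parameter.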
8. Open questions that the model exposes (Stage-4 inputs)
The formal apparatus makes four open questions sharper than verbal discussion alone:
- O1 (PGS interpretation). The decomposition treats β_d · g_T as a direct genetic term. Plomin’s “PGS is a real biological cause” reading takes β_d as a structural causal coefficient. Turkheimer’s “PGS is a summary of correlated environments” reading says β_d is contaminated by uncontrolled Cov(A_d, E_m). The two interpretations make different predictions about how β_d should change under environmental intervention. Stage 4 question: for traits with large enough within-family GWAS, does β_d, within-family move under intervention (schooling reform, nutrition shifts) the way Plomin predicts (it shouldn’t) or the way Turkheimer predicts (it should)?
- O3 (Gender Equality Paradox). The multivariate algebra in 3.4 shows that D depends on the inter-trait correlation structure Σ. If Σ differs between high-equality and low-equality societies, D will differ even if univariate μ_F − μ_M differences are fixed. Stage 4 question: does Σ (the personality covariance matrix itself) change across societies, or only the means? This is a different empirical question than “are the differences innate.”
- O6 (what E_s actually is). The model treats stochastic developmental noise as an unmodeled residual. As Stage 4 data accumulates, candidates (immune/microbial, peer-network, epigenetic, measurement error) can be peeled off into E_m and the residual E_s should shrink. Stage 4 question: how much of the current ~50% personality E_s can be moved into E_m given current measurement panels?
- O7 (cross-disorder rg post-AM correction). Module 3.4’s bridge between cross-trait phenotypic correlations and genetic correlations under AM (Border 2022, LAVA-Knock 2024) gives a formal correction. Stage 4 question: applied at scale to the full psychiatric-disorder rg matrix, what fraction of the cross-disorder genetic correlations survive the correction?
The two questions deferred from Section 1 (PGS portability and the GEP causal mechanism) are not sharpened by the model — they require new measurement, not new math.
9. Handoff to Stage 4 (data pipeline)
The model defines five parameter sets that Stage 4 needs to populate:
| Parameter | Source | Trait coverage |
|---|---|---|
| β_d, β_i | Within-family GWAS (Howe 2022, Okbay 2022) | EA, height, BMI, cognitive ability, depressive symptoms, smoking; extending |
| m (cross-spouse phenotypic correlation) | UK Biobank, HUNT, MoBa | EA, height, BMI, cognition, neuroticism; well-covered |
| h²(t) calibration | Bouchard 2013, Briley & Tucker-Drob 2013 longitudinal twin | Cognition (well-covered); personality (sparse); psychopathology (very sparse) |
| Σ for sex-difference module | Del Giudice 2012, Schmitt 2008, Kaiser 2020 | 16PF, NEO, Big Five |
| share_rare | Wainschtein 2025 | Height, EA, several psychiatric; extending |
The single highest-value Stage-4 deliverable: a per-trait table of (twin h², SNP h², WGS h², within-family h², m, β_i/β_d) at adulthood, ideally with cohort-by-age stratification. Most of the components already exist in published consortium summaries; the table is mostly aggregation, not new analysis.
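One possible shape for a row of that per-trait table is sketched below in Python. It is an assumption about how Stage 4 might organize the aggregation, not a schema defined by the model: the class name `TraitRow`, its field names, and the numbers in the example row are all hypothetical placeholders. The only logic taken from the document is the method gradient (twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h²), which the sketch uses as a consistency check on each row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TraitRow:
    """One adult-age row of the hypothetical Stage-4 aggregation table."""
    trait: str
    twin_h2: float
    wgs_h2: Optional[float]              # None when no WGS estimate exists yet
    snp_h2: float
    within_family_h2: float
    m: float                             # cross-spouse phenotypic correlation
    beta_i_over_beta_d: Optional[float]  # None when trio/sibship data are missing

    def _chain(self) -> List[Tuple[str, float]]:
        # Estimators in method-gradient order, skipping missing entries.
        ordered = [
            ("twin", self.twin_h2),
            ("wgs", self.wgs_h2),
            ("snp", self.snp_h2),
            ("within_family", self.within_family_h2),
        ]
        return [(name, value) for name, value in ordered if value is not None]

    def gradient_gaps(self) -> Dict[str, float]:
        """Differences between adjacent estimators along the gradient."""
        chain = self._chain()
        return {
            f"{hi}_minus_{lo}": hi_val - lo_val
            for (hi, hi_val), (lo, lo_val) in zip(chain, chain[1:])
        }

    def gradient_violations(self) -> List[str]:
        """Names of adjacent pairs whose ordering contradicts the gradient."""
        return [name for name, gap in self.gradient_gaps().items() if gap < 0]


# Placeholder numbers, made up for illustration only.
row = TraitRow("example_trait", twin_h2=0.60, wgs_h2=0.50, snp_h2=0.40,
               within_family_h2=0.30, m=0.20, beta_i_over_beta_d=None)
print(row.gradient_gaps())
print(row.gradient_violations())
```

Keeping the gaps as named differences rather than pre-attributing them to mechanisms leaves the decomposition into AM-LD, indirect-genetic, and rare-variant pieces as a separate, explicit modeling step.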
10. Connection to adjacent topics
- Parent-to-Child Transmission (planned). The A_i term is the formal answer to “how much does parenting matter beyond genes for outcomes that look genetic.” That topic should adopt this generating function as its starting point and refine β_i by domain (cognition vs. personality vs. health behaviors) and by mechanism (vocabulary input, expectation-setting, neighborhood selection). The Nivard et al. 2024 finding that indirect genetic effects extend beyond the nuclear family implies β_i should be further decomposed into a parent-level term and a dynastic/extended-family term.
- Evolution-Modernity Mismatch (planned). The μ(t) population-mean trajectory is the formal home of secular shifts (Flynn rise, Flynn reversal, age-of-puberty drift). Within-cohort within-sibship designs are the cleanest separator of genuine environmental shifts in μ(t) from compositional or selection artifacts. The Pietschnig 2024 finding that the positive manifold itself may be weakening across recent cohorts suggests μ(t) is not a one-dimensional curve but a moving structure of which abilities are gaining or losing, which the current scalar form does not capture.
(A connection to a planned “Bedrock Generating Functions” topic was floated in pass 1 but dropped — the analogy was real but too loose to do useful work here, and any cross-domain claim should live in that topic’s own formalization rather than be asserted from this one.)
11. Glossary (formalization-specific additions)
This section’s symbols are listed in the order they appear in the generating function. The lit-review and topology glossaries cover the field-level terminology (h², SNP, GWAS, PGS, AM, rGE, GxE, etc.) and are not duplicated here.
| Symbol / term | Meaning |
|---|---|
| P_i(t) | Phenotype of person i at developmental age t. Scalar (for one trait at a time); see §2 scope note for the multi-ability extension. |
| A_d | Direct additive genetic component: Σ_k β_k · g_{ik} over causal SNPs the person inherits, evaluated as if mating were random. |
| A_i | Indirect additive genetic component (genetic nurture): additive effect of parents’ (and extended-family) genotypes operating through the rearing environment. |
| A_LD | AM-induced LD inflation: additional additive variance from non-random mating creating linkage among causal variants. |
| C | Shared-environment residual not already absorbed by A_i. |
| E_m / E_s | Measured non-shared environment (lead, schooling, etc.) / stochastic developmental noise (the unmodeled residual). |
| I | Interaction terms: G×E, G×G (epistasis), G×age. |
| μ(t) | Population-mean trajectory at age t (developmental norm, not a person-level term). |
| β_d / β_i | Direct / indirect genetic regression coefficients on phenotype, estimated from within-family / parental-genotype designs. |
| g_T / g_NT | Polygenic score from offspring’s transmitted alleles / parents’ non-transmitted alleles. |
| m | Cross-spouse phenotypic correlation (assortative-mating strength on the measured trait). |
| r_δ | Cross-spouse correlation in additive genetic value; = m · h²_obs at AM equilibrium. The dashboard uses this directly (no fixed-point iteration) since Wilson h²(t) is already the equilibrium quantity. |
| k | AM-induced correlation between transmitted and non-transmitted alleles within parents; appears in the genetic-nurture variance identity (§3.3). |
| V_A* | Additive genetic variance at AM equilibrium; V_A* = V_A / (1 − r_δ) in the Crow–Felsenstein form. The dashboard observes V_A* directly via h²_obs and uses the formula to partition it into V(A_d) and V(A_LD). |
| h²(t) | Heritability as a function of age; logistic form h²_∞ / (1 + exp(−k·(t − t_50))) in §3.2. Earlier passes used a saturating exponential which fit the asymptote but rose too fast in childhood; the logistic is the smallest functional change that captures the empirical sigmoidal pattern. |
| Σ | Trait-level (within-sex or within-group) covariance matrix used in multivariate-D calculation. |
| Mahalanobis D | Multivariate generalization of Cohen’s d: √(Δμᵀ Σ⁻¹ Δμ). |
| ρ̄ | Average inter-trait correlation in Σ; the equicorrelated approximation collapses D² to d²·n/(1+(n−1)ρ̄) (§3.4). |
| block-orthogonal | Decomposition where major components are orthogonal to the residual environment but cross-terms within components (e.g. Cov(A_d, A_i)) are explicit, not zero. |
| method gradient | The relationship twin h² ≥ WGS h² ≥ SNP h² ≥ within-family h² driven by which components each estimator includes. |
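As a reading aid for the h²(t) entry, the logistic form h²_∞ / (1 + exp(−k·(t − t_50))) can be sketched in a few lines of Python. The parameter values below are hypothetical and chosen only to show the curve's shape; they are not calibrated to the sources named in this document. Note that `k` here is the logistic steepness in the h²(t) formula, not the transmitted/non-transmitted allele correlation that the glossary also labels k.

```python
import math


def wilson_h2(t: float, h2_inf: float, k: float, t50: float) -> float:
    """Logistic Wilson-Effect curve: h²(t) = h2_inf / (1 + exp(-k * (t - t50))).

    h2_inf : asymptotic (adult) heritability
    k      : steepness of the rise (must be positive for an increasing curve)
    t50    : age at which h²(t) reaches half of h2_inf
    """
    if k <= 0.0:
        raise ValueError("k must be positive")
    if not (0.0 <= h2_inf <= 1.0):
        raise ValueError("h2_inf must lie in [0, 1]")
    return h2_inf / (1.0 + math.exp(-k * (t - t50)))


# Hypothetical parameters, for illustration only.
params = {"h2_inf": 0.8, "k": 0.3, "t50": 8.0}
for age in (2, 5, 8, 12, 18, 40):
    print(f"age {age:>2}: h2 = {wilson_h2(age, **params):.3f}")
```

Two properties follow directly from the functional form and hold for any valid parameters: the curve passes through h²_∞ / 2 at t = t_50, and it approaches h²_∞ from below as t grows.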
12. Cruxes for this model
The topology had cruxes for the field. This stage’s cruxes are different — they are the load-bearing assumptions of the formalization itself. If any one flips, the model needs to be restructured.
| Crux | Load-bearing claim | What would flip it |
|---|---|---|
| C1 | Within-family GWAS effect estimates are an unbiased estimate of β_d. The whole A_d / A_i separation depends on this. | A demonstration that within-family designs have a systematic confound (e.g., differential parental treatment that correlates with offspring genotype) that biases β_d by more than ~10%. So far Howe 2022 / Okbay 2022 within-sibship GWAS are mutually consistent and consistent with trio-based estimates, suggesting the confound is bounded. |
| C2 | AM equilibrium has been reached or is close enough that the partition relation r_δ = m·h²_obs holds. | A demonstration that recent population-scale shifts in assortment (educational stratification expansion since 1970, online dating since 2010) have moved populations far from equilibrium for psychologically-relevant traits — at which point the observed r_δ would lag the formula’s prediction. Currently no direct evidence the partition is mis-calibrated; would require longitudinal m-by-cohort data. |
| C3 | Hyperpolygenic architecture: A_d = Σ β_k g_{ik} over thousands of small effects, no single locus dominates. | Discovery that for a major psychological trait class, ~5–10 large-effect variants account for >50% of V(A_d). Currently true only for the severe psychiatric tail (autism with ID, severe schizophrenia spectrum), where the model already concedes scope (§5.1). Would generalize to mainstream cognition only if a CRISPR-era discovery overturned the polygenic consensus. |
| C4 | A_d, A_i, A_LD are jointly identifiable given the available designs. | A demonstration that twin/SNP/within-family/WGS estimators are not sufficient to disentangle all three (e.g., that AM-LD and rare-variant contributions are mutually confounded in a way no current design can break). This would force collapsing the decomposition or treating one component as a residual. Active concern: rare-variant heritability in WGS may itself be inflated by AM-LD among rare variants, which would muddy C4. |
| C5 | Equicorrelated Σ is a useful approximation for the multivariate sex-difference module. | A demonstration that real personality covariance matrices have block-structured (or low-rank) Σ that produces qualitatively different D from the equicorrelated approximation. Already partially true: 16PF has known higher-order factor structure, which is why the equicorrelated approximation undershoots Del Giudice’s latent-variable result. Crux holds in a weakened form: equicorrelated is useful pedagogically but not quantitatively for high-dimensional panels. |
The most consequential of these is C4. If A_d, A_i, and A_LD cannot be jointly identified by current designs, the variance decomposition reduces to a coarser partition (genetic-additive vs everything else), and the field-level dispute about how much “genetic” effect is environment-mediated remains parametrically unresolvable rather than just empirically pending.