Psychology of Individual Differences
What the science actually says about psychological variation — heritability, environmental shaping, gene-environment interaction, sex differences, cognitive capacity. A minefield of motivated reasoning where the actual generating functions are obscured by politics.
The topic running through the LLM Iterate pipeline. The point is not to take a side but to map the dependency structure under the noise — what the field actually knows, what it doesn’t, and where motivated reasoning from each direction concentrates.
Stage 1 (lit review) is the landscape: ~50 years of twin studies, the post-2010 genomic era, the assortative-mating and genetic-nurture corrections that are actively rewriting older interpretations, sex differences univariate vs. multivariate, the neurodiversity / dimensional turn, and the open questions that aren’t currently answerable with the methods available.
Stage 2 (topology) is the dependency graph — three foundational assumptions (twin validity, GWAS signal, g exists) carry roughly 70% of the inferential weight. Six crux nodes are where collapse propagates farthest. The graph also encodes where each direction of motivated reasoning attacks, so the same evidence base can be read four ways without changing the underlying numbers.
Stages 3–5 will formalize the variance decomposition into equations, run it against available data, and ship a small interactive tool.
TLDR
Virtually every measured psychological trait is moderately to substantially heritable, hyper-polygenic, and shaped by environments that are themselves partly genetic in origin. The post-2010 genomic era confirmed the core findings of mid-20th-century behavior genetics while simultaneously demolishing the candidate-gene paradigm that dominated psychiatry from 1996–2010. Twin heritability for psychological traits averages ~49% (Polderman et al. 2015); molecular GWAS increasingly accounts for this through thousands of tiny-effect common variants plus rarer large-effect variants in neurodevelopmental conditions.
A crucial methodological development since 2018 is the recognition that assortative mating and gene-environment correlation systematically inflate GWAS-derived estimates. Border et al. (2022, Science) showed that cross-trait assortative mating alone can account for substantial fractions of reported genetic correlations — including some psychiatric cross-disorder correlations previously attributed to shared biology. Kong et al.’s (2018) “genetic nurture” finding demonstrated that roughly half of population-level polygenic score prediction for educational attainment reflects environmentally-mediated parental effects, not direct genetic causation. These corrections don’t eliminate genetic influence — they reframe it.
The field’s most contested findings are not the ones most disputed in public discourse: heritability is settled science, the “parenting wars” are largely resolved, and the candidate-gene-by-environment literature has collapsed. What remains genuinely open is mechanistic — how genes build minds, why the gender equality paradox exists, what drives the Flynn Effect’s reversal, and whether between-population mean differences have any genetic component (a question currently unanswerable with available methods, not “settled” in either direction). The generating function for psychological variation is not “genes vs. environment” but a tightly coupled developmental system in which genetic predispositions, environments created by genetically-similar parents, assortative mating patterns, stochastic noise, and cultural context are deeply entangled.
This document is structured for someone building a formal model of psychological variation. Each section flags effect sizes, replication status, consensus, live debate, and ideological distortion from any direction.
1. Heritability: The Foundation Finding
The Polderman meta-analysis
Polderman et al. (2015, Nature Genetics) meta-analyzed 50 years of twin research (17,804 traits, 14.5 million twin pairs, 2,748 publications) and reported a mean heritability across all human traits of 49%. Polderman et al., 2015. For ~69% of traits, the observed twin correlations were consistent with a simple model in which twin resemblance is due solely to additive genetic variation.
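Twin-based estimates of this kind start from the gap between MZ and DZ correlations. A minimal sketch using Falconer's classic approximations, assuming purely additive genetic effects, equal environments, and random mating; the function name and the input correlations are illustrative assumptions, not values from Polderman et al.:

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Rough A/C/E variance shares from MZ and DZ twin correlations."""
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (h^2)
    c2 = 2 * r_dz - r_mz    # shared (family) environment share
    e2 = 1 - r_mz           # non-shared environment plus measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations, chosen only for illustration.
estimates = falconer_ace(r_mz=0.60, r_dz=0.35)
for component, share in estimates.items():
    print(component, round(share, 2))
```

The three shares sum to 1 by construction. Full structural-equation ACE fits are more careful than this back-of-envelope version, but the logic is the same: the more MZ pairs exceed DZ pairs in similarity, the larger the estimated A.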
Turkheimer’s Laws and the Fourth Law
Turkheimer’s Three Laws (2000) — all human behavioral traits are heritable; shared family environment is smaller than genes; substantial variance is explained by neither — were extended by Chabris et al. (2015) with the Fourth Law: a typical behavioral trait is associated with very many genetic variants of tiny effect. This emerged from the failure of candidate-gene studies and the polygenic architecture revealed by GWAS.
What heritability actually means (and doesn’t)
Heritability is a population statistic, not an individual one. Saying IQ is 70% heritable does not mean 70% of any person’s IQ comes from genes. It is not deterministic (height is ~80% heritable yet rose ~10cm in 20th-century Europe through nutrition) and not immutable (h² changes with environment — if all environments became identical, h² would approach 1.0). The most common misinterpretation collapses statistical variance partitioning into causal mechanism.
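Because h² is a ratio of variances, it moves whenever environmental variance moves, even if nothing genetic changes. A toy illustration in arbitrary variance units (the function name and numbers are assumptions made for this sketch):

```python
def heritability(var_genetic: float, var_environment: float) -> float:
    """Share of phenotypic variance attributable to genetic variance."""
    return var_genetic / (var_genetic + var_environment)


var_genetic = 1.0  # held fixed throughout
results = {}
for var_environment in (1.0, 0.5, 0.1, 0.0):
    # Equalizing environments shrinks var_environment and pushes h^2 toward 1.
    results[var_environment] = heritability(var_genetic, var_environment)
    print(var_environment, round(results[var_environment], 2))
```

A uniform improvement that shifts everyone's mean (as with nutrition and height) changes neither variance term, which is why a trait can be highly heritable and still rise across a whole population.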
Twin and molecular estimates by domain
| Domain | Twin h² | SNP-h² | Largest GWAS | Loci | Best PGS R² |
|---|---|---|---|---|---|
| Adult IQ / g | 0.70–0.80 | ~0.20 | 269,867 (Savage 2018) | 205 | ~0.05 |
| Educational attainment | ~0.40 | ~0.13 | 3M (Okbay 2022) | 3,952 | 0.12–0.16 |
| Big Five (avg) | 0.40–0.60 | 0.05–0.18 | 449k (Nagel 2018) | 136 (N) | <0.05 |
| Political orientation | ~0.40 | — | — | — | — |
| Religiosity | 0.30–0.45 | — | — | — | — |
| Risk tolerance | ~0.30 | 0.05 | 1M (Karlsson Linnér) | 99 | <0.02 |
| Schizophrenia | 0.60–0.80 | 0.24 | 320k (Trubetskoy 2022) | 287 | 0.07–0.10 |
| Bipolar | 0.70–0.85 | 0.18–0.20 | 414k (Mullins 2021) | 64 | 0.04 |
| MDD | 0.35–0.40 | 0.09 | 807k (Howard 2019) | 102 | 0.02–0.03 |
| ADHD | 0.74 | 0.14 | 225k (Demontis 2023) | 27 | 0.04–0.06 |
| Autism | 0.80 | 0.12 | 46k (Grove 2019) | 5 | <0.03 |
Note: Political orientation and religiosity are included because they are among the few adult traits where shared family environment (C) remains substantial (~20–30%), unlike personality and cognition where C ≈ 0 by adulthood. See Alford, Funk & Hibbing (2005); Hatemi et al. (2014).
The Wilson Effect
Bouchard (2013) documented that IQ heritability rises with age — from ~20% at age 5 to ~80% by adulthood, with shared-environment effects dropping from ~55% in early childhood to roughly zero by adolescence. Bouchard 2013. Briley & Tucker-Drob (2013) explained the mechanism: early genetic effects are amplified across development through gene-environment correlation (niche-picking). Briley & Tucker-Drob 2013. This finding is robust, replicated, and counterintuitive — genetic differences become more expressed as people age into self-selected environments.
The “missing heritability” problem
The gap between twin h² (~0.70 for IQ) and SNP-h² (~0.20) launched a decade of debate. Wainschtein et al. (2022, Nature Genetics) essentially closed it for height: using whole-genome sequencing in 25,465 unrelated individuals, h² recovered to 0.68 when rare and low-LD variants were included. Wainschtein et al. 2022. For psychological traits the same pattern is emerging. The current synthesis: missing heritability is partly real (rare variants, dominance, GxE) and partly artifactual (twin overestimation from assortative mating and rGE, measurement noise).
Assortative mating: a pervasive inflation source
Assortative mating (AM) — the tendency for partners to resemble each other on traits — has emerged as a major methodological concern. People mate assortatively on education (spousal r ≈ 0.40–0.60), IQ (~0.40), personality (~0.10–0.20), height (~0.20), and psychiatric conditions. AM has three consequences for genetic estimates:
- Biased heritability: AM increases additive genetic variance across generations by creating linkage disequilibrium among causal variants. Most twin studies that ignore AM underestimate heritability (counterintuitively), because AM raises DZ genetic similarity above the assumed 0.5 and the excess resemblance is credited to shared environment; GWAS-based SNP-h² may be inflated by AM-induced LD. Border et al. 2022, Nat Commun.
- Inflated genetic correlations: Border et al. (2022, Science) introduced cross-trait assortative mating (xAM) and showed that phenotypic cross-mate correlations explain R² = 74% of the variance in reported genetic correlation estimates. Some psychiatric cross-disorder genetic correlations — previously interpreted as evidence of shared biology — may be largely or entirely attributable to xAM. Border et al. 2022.
- Inflated PGS prediction: Within-family PGS effects are roughly half of population-level effects for educational attainment (Okbay 2022), partly because AM and population stratification inflate between-family comparisons.
Plomin (2022, Behav Genet) argues this is a prediction-vs-explanation distinction: AM inflates causal genetic estimates but doesn’t invalidate PGS as predictors, since AM-induced variance is real population variance. Plomin 2022. This is technically correct but sidesteps the question of why PGS predict — whether through direct genetic causation or through correlated environments created by assortatively-mating parents.
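The direction of the twin-study bias can be seen in a small worked example. This is a sketch under textbook assumptions (purely additive effects, mating on the phenotype itself, population at equilibrium), in which the DZ additive genetic correlation becomes (1 + m)/2, with m the correlation between mates' additive genetic values. The function names and parameter values are illustrative assumptions:

```python
def twin_correlations_under_am(h2: float, c2: float, mate_r: float):
    """Expected MZ and DZ phenotypic correlations at AM equilibrium."""
    m = mate_r * h2                  # correlation between mates' genetic values
    r_mz = h2 + c2
    r_dz = 0.5 * (1 + m) * h2 + c2   # DZ pairs share more than half under AM
    return r_mz, r_dz


def falconer(r_mz: float, r_dz: float):
    """Naive estimates that assume random mating (DZ sharing of exactly 0.5)."""
    return 2 * (r_mz - r_dz), 2 * r_dz - r_mz


true_h2, true_c2 = 0.70, 0.0  # hypothetical "true" values
r_mz, r_dz = twin_correlations_under_am(true_h2, true_c2, mate_r=0.40)
h2_hat, c2_hat = falconer(r_mz, r_dz)
print(round(h2_hat, 3), round(c2_hat, 3))
```

With these inputs the naive estimate of h² comes out near 0.50 rather than 0.70, and about 0.20 of variance is misattributed to shared environment even though the true C is zero.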
The genetic nurture revolution
Kong et al. (2018, Science) used 21,637 Icelandic probands with parental genotypes to compute polygenic scores from non-transmitted parental alleles. The non-transmitted PGS predicted offspring educational attainment at ~30% the magnitude of transmitted PGS, meaning parental genotypes shape children via the environments they create, even for alleles never inherited. Kong et al. 2018. Okbay et al. (2022, EA4, Nature Genetics) confirmed this in 3 million people: within-family direct effects are roughly half the population-level PGS magnitude. Okbay et al. 2022. The implication: GWAS effect sizes for socially-valued traits are inflated by indirect/dynastic effects, and roughly half of the population-level signal once read as direct “genetic transmission” instead reflects environments created by genetically related parents, together with assortative mating and population stratification.
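The logic of the non-transmitted-allele design can be sketched with a toy simulation. The assumptions are deliberately simple and are not Kong et al.'s actual model: standardized haplotype scores, random mating, no population stratification, and made-up effect sizes (direct = 0.7, nurture = 0.3). Regressing the outcome on the transmitted score recovers direct + nurture; regressing on the non-transmitted score recovers nurture alone.

```python
import random


def ols_slope(x, y):
    """Simple-regression slope: cov(x, y) / var(x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def simulate(n_families=20000, direct=0.7, nurture=0.3, seed=1):
    rng = random.Random(seed)
    transmitted, non_transmitted, outcome = [], [], []
    for _ in range(n_families):
        t = rng.gauss(0, 1) + rng.gauss(0, 1)   # alleles passed on by each parent
        nt = rng.gauss(0, 1) + rng.gauss(0, 1)  # alleles carried but not passed on
        # Direct effects act only through the child's own (transmitted) alleles;
        # nurture acts through the parents' whole genotype (t + nt).
        y = direct * t + nurture * (t + nt) + rng.gauss(0, 1)
        transmitted.append(t)
        non_transmitted.append(nt)
        outcome.append(y)
    return ols_slope(transmitted, outcome), ols_slope(non_transmitted, outcome)


slope_t, slope_nt = simulate()
ratio = slope_nt / slope_t
print(round(slope_t, 2), round(slope_nt, 2), round(ratio, 2))
```

The ratio should land near 0.3, mirroring the ~30% figure. Under this toy model the direct path carries about 70% of the transmitted effect, so about (0.7)² ≈ 0.49 of the variance the transmitted score explains, which is one way to arrive at "roughly half."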
2. Environmental Shaping: Real, But Smaller and Weirder Than Common Sense Suggests
The shared-vs-non-shared distinction is the single most disorienting finding for laypeople. Across hundreds of twin and adoption studies, the shared family environment (C) accounts for ~0% of variance in adult personality and most adult cognition (Plomin & Daniels 1987; Bouchard & McGue 2003). Parental warmth, parenting style, dinner conversations, books in the home — once genetic transmission is controlled, almost none of this leaves a measurable trace on adult personality. Important exceptions where C remains substantial: educational attainment (~20%), antisocial behavior, religiosity (~25%), political orientation (~20–30%), and childhood (but not adult) externalizing.
What “non-shared environment” actually is
Turkheimer & Waldron’s (2000) meta-analysis of measured non-shared environmental predictors found these accounted for only ~2% of variance in outcomes. Plomin’s recent verdict: non-shared environment is “real but largely random,” more akin to stochastic developmental noise — differential peer experiences, illness, idiosyncratic events, measurement error — than systematic experience.
The Equal Environments Assumption
Critics (Joseph, Charney; Fosse et al. 2015) note that MZ twins are treated more similarly than DZ twins. The central empirical defense: Kendler et al. (1993) showed misperceived-zygosity twins had phenotypic similarity tracking true zygosity. Kendler et al. 1993. MZ-reared-apart correlations (Bouchard’s Minnesota study) closely match MZ-reared-together correlations. Felson (2014) reanalysis: EEA is “not strictly valid, but bias is modest.” Modern SNP-based heritability estimates entirely bypass EEA and give somewhat lower but still substantial h². Verdict: EEA is approximately valid; bias is modest (~10–20% inflation) for most traits.
Environmental factors with robust causal effects on cognition
A small number of environmental insults have large, replicated, causal effects — typically asymmetric (removing severe deficits matters more than enrichment above normal):
- Lead exposure: Lanphear et al. (2005) pooled 7 prospective cohorts: blood lead 1→10 µg/dL → −6.2 IQ points, with steeper slope at low concentrations. Causal status: strong.
- Severe iodine deficiency: 8–12 IQ point cost; supplementation recovers ~8.7 points (Bougma 2013; Qian 2005). RCT-supported, globally replicated.
- Heavy prenatal alcohol: FAS produces mean IQ ~70; Mendelian-randomization confirms causation.
- Schooling: Ritchie & Tucker-Drob (2018) meta-analyzed 142 effect sizes / 600,000 participants across three quasi-experimental designs. Each year of education raises IQ by 1–5 points (mean ~3.4), persisting into old age. The most consistent durable IQ-raising intervention identified. Ritchie & Tucker-Drob 2018.
- Air pollution (PM2.5): ~−0.27 IQ points per 1 µg/m³ (Aghaei 2024); smaller per-unit than lead but exposure is widespread.
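The "steeper slope at low concentrations" pattern for lead is what a log-linear dose-response curve produces. A rough sketch, assuming IQ loss is proportional to log10 of blood lead and calibrating the slope to the single pooled figure quoted above (6.2 points from 1 to 10 µg/dL); the constant and function name are assumptions for illustration, not the published model specification:

```python
import math

# Assumption: calibrated so that a tenfold rise (1 -> 10 µg/dL) costs 6.2 points.
POINTS_PER_LOG10_UNIT = 6.2


def iq_loss(lead_from: float, lead_to: float) -> float:
    """Approximate IQ points lost when blood lead rises from lead_from to lead_to."""
    return POINTS_PER_LOG10_UNIT * math.log10(lead_to / lead_from)


for low, high in [(1, 10), (10, 20), (20, 30)]:
    print(f"{low} -> {high} µg/dL: {iq_loss(low, high):.1f} points")
```

Under this assumption the first 9 µg/dL cost 6.2 points, while the next 10 cost about 1.9 and the 10 after that about 1.1, so most of the damage per unit occurs at the lowest exposures.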
The Scarr-Rowe interaction (SES × heritability)
Turkheimer et al. (2003) reported that in impoverished families, IQ heritability was ~10% with shared environment ~60%; in affluent families this reversed. Turkheimer et al. 2003. Tucker-Drob & Bates (2016) meta-analysis: replicated in U.S. samples but absent in Western European/Australian samples — likely because more universal healthcare/education reduces environmental variance at the bottom. Status: real but context-dependent and U.S.-specific.
Parenting effects: the Harris correction, partially reversed
Judith Rich Harris (1995; The Nurture Assumption 1998) argued that within-normal-range parenting has minimal long-term effects on adult personality. The empirical core was correct: C ≈ 0 for adult personality. But Harris overstated her case. Korean-American adoption studies (Sacerdote 2007; Beauchamp et al. 2023) show real but modest causal effects of family environment on educational attainment, BMI, drinking, smoking — transmission coefficients ~25% of biological-family magnitude. Severe deprivation/abuse causes clear damage. The accurate position: within the normal Western range, parenting style has small effects on adult personality; family environment has measurable but modest effects on attainment outcomes; severe parenting variation matters substantially.
Neighborhood and peer effects
Chetty & Hendren (2018, QJE) used 5+ million U.S. cross-county movers with sibling fixed effects: each year of childhood exposure to a 1-SD better county raises adult income by ~0.5–0.7%. Chetty & Hendren 2018. Moving to Opportunity reanalysis (Chetty, Hendren & Katz 2016): children moving before age 13 had adult earnings 31% higher than controls. Place matters, but accumulates slowly across many years of exposure.
The Flynn Effect and its reversal
Flynn (1984, 1987) documented ~3 IQ points/decade gains across the 20th century. Causes contested: nutrition, schooling, infectious-disease reduction, test sophistication, smaller families — no single mechanism established. Bratsberg & Rogeberg (2018, PNAS) used Norwegian within-family conscript data to demonstrate that both the Flynn Effect and its post-1990s reversal are environmentally driven (visible within sibships, ruling out dysgenic/compositional explanations). Bratsberg & Rogeberg 2018. Similar declines now reported in Denmark, Finland, the Netherlands, France, the UK, and Germany. The reversal’s cause is unknown — this is one of the field’s most important open questions.
3. Gene-Environment Interplay: rGE Wins, Candidate-GxE Collapsed
Three types of gene-environment correlation
The Plomin/DeFries/Loehlin (1977) framework distinguishes passive rGE (parents transmit both genes and correlated rearing environment), evocative rGE (heritable child traits elicit specific responses), and active rGE / niche-picking (individuals select environments matching genetic propensities). Kendler & Baker’s (2007) systematic review shows essentially every measured environment is itself heritable (15–35%) — meaning observational claims like “parental warmth causes child outcomes” are confounded by passive rGE. The genetic nurture and within-family PGS results (Section 1) quantify this: population-level “genetic” prediction is roughly half indirect environmental effects of genetically-similar parents.
The candidate gene × environment collapse
Caspi et al. (2003, Science) reported that 5-HTTLPR short-allele carriers showed elevated depression risk under stress. The paper became one of the most cited in psychiatry (>9000 citations). It collapsed:
- Risch et al. (2009, JAMA): meta-analysis of 14 studies, N=14,250 — no evidence. Risch et al. 2009.
- Culverhouse et al. (2018): pre-registered collaborative meta-analysis, 31 datasets, N=38,802 — definitively no evidence.
- Border et al. (2019, Am J Psychiatry): examined 18 most-studied depression candidate genes in N up to 443,264. No clear evidence for any candidate gene polymorphism on depression. As a set, candidate genes were no more associated with depression than non-candidate genes. Border et al. 2019.
- Duncan & Keller (2011): 96% of novel candidate-GxE studies were significant; only 27% of replication attempts were. Duncan & Keller 2011.
MAOA × maltreatment (Caspi et al. 2002) is the partial exception that survived meta-analysis (Byrd & Manuck 2014) — modest male-specific interaction, but smaller than originally reported.
Differential susceptibility / orchid-dandelion
Belsky & Pluess (2009) reframed “risk alleles” as “plasticity alleles” — some individuals are more reactive to environments “for better and for worse.” Belsky & Pluess 2009. The theory is generative; the empirical record is mixed. Recent systematic reviews find that interactions between child characteristics and parenting rarely replicate across cohorts and developmental domains. Distinguishing differential susceptibility from diathesis-stress requires very large, preregistered samples. de Villiers et al. 2018.
Epigenetics: real biology, oversold psychology
DNA methylation, histone modifications, and non-coding RNA regulation are real, well-characterized mechanisms important in development. The controversy concerns whether environmentally-induced epigenetic marks are faithfully transmitted across generations in humans. They generally are not.
- Heard & Martienssen (2014, Cell): in mammals, two waves of near-complete epigenetic reprogramming erase most acquired methylation marks. Robust transgenerational epigenetic inheritance occurs in plants and C. elegans; in humans it remains largely speculative. Heard & Martienssen 2014.
- Dutch Hunger Winter (Heijmans et al. 2008): real within-individual epigenetic effect persisting decades, not evidence of transmission to grandchildren.
- Yehuda’s Holocaust FKBP5 study (2016): tiny sample (n=8 control parents), opposite-direction effects in parents vs. offspring, no germline measurement. Yehuda’s own group failed to replicate. The “trauma is inherited epigenetically” narrative is not supported by current evidence.
Critical periods: solid developmental neuroscience
Hensch (2005, Nat Rev Neurosci) provides a mechanistically rigorous account of cortical critical-period plasticity. Hensch 2005. GABAergic maturation (parvalbumin-positive interneurons) gates onset; perineuronal nets and myelin-associated inhibitors close periods. This represents the high end of how environmental experience shapes brain structure — genuine, replicated, and mechanistically understood.
4. Sex and Gender Differences: Large Where You’re Not Told They Are
Sex differences are one of psychology’s most ideologically distorted areas — distorted by both minimization and overstatement. The actual picture: small differences in average cognitive ability, large differences in interests and physical aggression, moderate-to-large multivariate personality differences, and a robust but mechanistically contested gender equality paradox.
Cognitive abilities
Mental rotation shows d ≈ 0.56–0.73 male advantage (Voyer et al. 1995), among the largest cognitive sex differences documented. Mean math performance: d ≈ 0.05–0.10 (Lindberg et al. 2010) — essentially no average difference. Writing: substantial female advantage. School grades favor girls overall (Voyer & Voyer 2014). At extreme tails (95th–99th percentile) males outnumber females ~2:1 in many countries — driven by slightly greater male variance (~3–15% higher) compounding at extremes.
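How a negligible mean difference and a modest variance difference compound at the tails can be checked directly. A sketch assuming normal distributions, with scores expressed in female-SD units and the cutoff measured from the midpoint of the two means; the parameter values (d = 0.10, variance ratio 1.15, cutoff z ≈ 2.33) are illustrative picks from the ranges quoted above:

```python
from statistics import NormalDist


def tail_ratio(d: float, variance_ratio: float, cutoff_z: float) -> float:
    """Male:female ratio above cutoff_z for two normal distributions."""
    female = NormalDist(mu=-d / 2, sigma=1.0)
    male = NormalDist(mu=d / 2, sigma=variance_ratio ** 0.5)
    p_male = 1 - male.cdf(cutoff_z)
    p_female = 1 - female.cdf(cutoff_z)
    return p_male / p_female


ratio_at_median = tail_ratio(d=0.10, variance_ratio=1.15, cutoff_z=0.0)
ratio_at_top = tail_ratio(d=0.10, variance_ratio=1.15, cutoff_z=2.326)
print(round(ratio_at_median, 2), round(ratio_at_top, 2))
```

With these inputs the ratio is close to 1:1 around the middle of the distribution and a little under 2:1 near the 99th percentile, which is the compounding described above. Real score distributions are not exactly normal, so this only shows the mechanism.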
Personality: univariate vs. multivariate framing
Univariate Big Five differences are moderate: women higher on Neuroticism (d ≈ 0.40) and Agreeableness (d ≈ 0.40). Del Giudice, Booth & Irwing (2012) computed multivariate Mahalanobis D = 2.71 on 16PF data from 10,261 Americans, implying ~10% overlap between male and female personality profiles. Del Giudice et al. 2012. Hyde’s (2005) “Gender Similarities Hypothesis” — most differences trivial or small — is mathematically compatible but tells a very different qualitative story. Both univariate and multivariate framings should be reported jointly; selective use is ideological.
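The link between a standardized distance (d or Mahalanobis D) and "percent overlap" depends on which overlap index is used. A sketch assuming normal distributions with equal (co)variances; the two function names are assumptions for illustration. The first index is the shared area under the two curves; the second is 1 − Cohen's U1, the shared area as a fraction of the joint distribution, which appears to be the index behind the ~10% figure:

```python
from statistics import NormalDist


def overlap_coefficient(distance: float) -> float:
    """Shared area under two equal-variance normal curves separated by `distance`."""
    return 2 * NormalDist().cdf(-abs(distance) / 2)


def overlap_joint(distance: float) -> float:
    """1 - Cohen's U1: overlap as a share of the combined (joint) distribution."""
    ovl = overlap_coefficient(distance)
    return ovl / (2 - ovl)


for label, distance in [("univariate d = 0.40", 0.40), ("multivariate D = 2.71", 2.71)]:
    print(label, round(overlap_coefficient(distance), 2), round(overlap_joint(distance), 2))
```

For D = 2.71 the first index gives roughly 0.18 and the second roughly 0.10; for a single trait at d = 0.40 both indices show heavy overlap (above 0.7). The same data therefore support both the "small differences" and the "distinct profiles" readings, depending on whether traits are viewed one at a time or jointly.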
Interests: the largest sex difference in psychology
Su, Rounds & Armstrong (2009, Psych Bulletin) meta-analyzed 503,188 people: the People-Things dimension d = 0.93, with engineering interest d = 1.11. Su et al. 2009. These are very large by psychological standards and among the largest univariate effects in the literature on psychological sex differences.
Aggression
Archer (2004): physical aggression d ≈ 0.40–0.60 male; trait anger near zero. Males commit ~95% of homicides globally. Archer 2004. Indirect/relational aggression: Card et al. (2008) found differences trivial (d < 0.10), challenging the narrative that indirect aggression is a distinctly female counterpart to male physical aggression.
The Gender Equality Paradox (replicated; mechanism contested)
A robust empirical pattern across at least four domains: personality, preference, interest, and depression-rate differences are larger in more gender-equal and wealthier countries.
- Schmitt et al. (2008): 55-nation Big Five study — differences largest in egalitarian Western cultures. Schmitt et al. 2008.
- Falk & Hermle (2018, Science): 80,000 adults, 76 countries — sex differences in 6 economic preferences positively related to GDP and gender equality.
- Stoet & Geary (2018): STEM Gender-Equality Paradox — more gender-equal countries had smaller female share of STEM graduates. A corrigendum addressed methods; the core correlation remained robust.
The correlation is robust. The causal mechanism — innate-expression release in wealthy environments vs. measurement artifacts vs. ecological confounds — is genuinely contested.
Mental health asymmetries
Depression female:male ≈ 2:1; anorexia ~10:1 female; ADHD diagnosis ~2–3:1 male; antisocial personality, substance use, completed suicide all male-skewed; autism ~3–4:1 male; schizophrenia roughly equal but more severe early-onset in males.
Biological mechanisms
CAH girls (prenatally elevated androgens) show masculinized toy preferences and play patterns (Kung et al. 2024 meta-analysis). Same-sex-typed toy preferences in vervet and rhesus monkeys parallel human findings, supporting partial biological mediation. Wood & Eagly’s social role theory faces empirical challenge from the gender equality paradox.
5. Cognitive Ability and Intelligence
The g-factor
Spearman’s 1904 finding of a positive manifold — every cognitive test correlates positively with every other — is arguably the most replicated finding in psychology. A first unrotated principal factor captures 40–50% of variance in any sufficiently broad battery. van der Maas et al. (2006) mutualism model offers an alternative: g may be an emergent network property of reciprocally beneficial cognitive processes during development, not a unitary biological cause. van der Maas et al. 2006. Most working researchers treat g as a robust statistical regularity whose causal architecture is unsettled.
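The 40–50% figure follows almost mechanically from a positive manifold. A sketch using a hypothetical battery of 10 tests that all intercorrelate at r = 0.40 (both numbers are assumptions), with the first principal component standing in for the first unrotated factor; the largest eigenvalue is found by plain power iteration:

```python
def first_component_share(corr, iterations=200):
    """Share of total variance captured by the first principal component."""
    k = len(corr)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector = largest eigenvalue.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue / k  # total variance of a correlation matrix is k


n_tests, r = 10, 0.40  # hypothetical battery
corr = [[1.0 if i == j else r for j in range(n_tests)] for i in range(n_tests)]
share = first_component_share(corr)
print(round(share, 2))
```

For a uniform correlation matrix the largest eigenvalue is 1 + (k − 1)r, so the share here is 4.6 / 10 = 0.46. The calculation shows only that a strong first factor is a statistical consequence of uniformly positive correlations; it is silent on whether a unitary cause or a mutualism-style network produces those correlations.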
Structure: CHC theory
Carroll’s (1993) three-stratum theory — g at top, ~8–10 broad abilities (Gf, Gc, Gv, Ga, Gs, Gsm, Glr, Gq, Grw), ~70+ narrow abilities — was integrated with Cattell-Horn into the Cattell-Horn-Carroll (CHC) framework, which underlies modern IQ tests.
Predictive validity
Schmidt & Hunter (1998): corrected GMA validity for job performance r ≈ 0.51. Sackett et al. (2022) argued corrections were too aggressive; re-estimate: r ≈ 0.31 uncorrected / ~0.42 corrected. GMA remains among the most predictive selection tools. Childhood IQ predicts educational attainment at r ≈ 0.50–0.70. Calvin et al. (2011) meta-analysis (1.1M, 22,453 deaths): each 1-SD higher childhood IQ → ~24% lower all-cause mortality. Calvin et al. 2011.
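The dispute over "corrections" concerns two standard psychometric adjustments, which can move a validity coefficient substantially. A sketch of the textbook formulas (correction for criterion unreliability and Thorndike's Case II correction for direct range restriction); the input values are hypothetical and are not the figures used by Schmidt & Hunter or Sackett et al.:

```python
def correct_for_attenuation(r_observed: float, criterion_reliability: float) -> float:
    """Adjust for measurement error in the criterion (e.g. performance ratings)."""
    return r_observed / criterion_reliability ** 0.5


def correct_for_range_restriction(r_observed: float, u: float) -> float:
    """Thorndike Case II; u = unrestricted SD / restricted SD of the predictor."""
    return r_observed * u / (1 + r_observed ** 2 * (u ** 2 - 1)) ** 0.5


r_obs = 0.25  # hypothetical observed validity among hired employees
r_step1 = correct_for_attenuation(r_obs, criterion_reliability=0.60)  # assumed reliability
r_step2 = correct_for_range_restriction(r_step1, u=1.5)               # assumed SD ratio
print(round(r_obs, 2), round(r_step1, 2), round(r_step2, 2))
```

Each step raises the estimate, and the size of the rise depends heavily on the assumed reliability and SD ratio, which is why the choice of correction inputs can shift the headline validity by a wide margin.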
Lifespan stability
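Lothian Birth Cohort: age 11 → age 90 correlation r ≈ 0.54 raw, ≈ 0.67 corrected for range restriction. Deary et al. 2013. The raw correlation implies that about one-third of the variance in mental ability at 90 is accounted for by ability at 11 (0.54² ≈ 0.29); the corrected value implies closer to 45% (0.67² ≈ 0.45).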
Group differences in test scores: the most distorted area
Roth et al. (2001) meta-analysis (N=6.2M): U.S. Black-White cognitive ability gap d ≈ 1.0 (~15 IQ points). Dickens & Flynn (2006): Black IQ rose 4–7 points relative to whites between 1972 and 2002 (about one-third of the gap). Dickens & Flynn 2006. The gap exists, has narrowed somewhat, and has not closed.
The mainstream contemporary position (Nisbett et al. 2012; Turkheimer, Harden & Nisbett 2017): within-group heritability does not license between-group inferences (Lewontin’s point); Martin et al. (2019, Nature Genetics) demonstrated PGS lose ~4.5x prediction accuracy in African-ancestry individuals due to differential LD and allele frequencies, meaning current PGS cannot validly compare mean genetic predisposition across continental ancestry groups. Mostafavi et al. (2020) showed PGS portability also breaks down within Europeans across SES strata.
The honest scientific position: gaps in test scores are real, partly narrowing, and their causes are not currently identifiable as genetic, environmental, or both — direct evidence is absent and mainstream geneticists treat the question as not currently answerable.
Distortion from the hereditarian direction: treating g-loadedness as evidence of genetic etiology (environmental causes can also be g-loaded); citing fringe admixture studies published in weak-peer-review venues; conflating absence of evidence with agnosticism. Distortion from the environmentalist direction: claiming gaps have closed when they only partly narrowed; dismissing IQ as “culturally biased” despite measurement-invariance evidence; overstating stereotype threat (Flore & Wicherts 2015 meta-analysis showed publication-biased modest effects).
Brain correlates
Brain volume × IQ: r ≈ 0.24 (Pietschnig et al. 2015, 2022). P-FIT theory (Jung & Haier 2007): intelligence supported by parieto-frontal network. Jung & Haier 2007.
Creativity and intelligence
The “IQ ≈ 120 threshold” hypothesis is largely disconfirmed (Weiss et al. 2020). Intelligence and creativity correlate ~r = 0.20–0.30 across the range. Openness to Experience is the personality trait most reliably correlated with creative achievement (~0.30–0.40).
6. Personality and Temperament
The Big Five (OCEAN) and HEXACO
The Big Five emerged from the lexical hypothesis. Heritability is ~40–60% per twin studies; SNP-h² is 8–18%. Nagel et al. (2018) identified 136 loci for neuroticism in 449,484 people. Nagel et al. 2018. The ReGPC consortium (2025) reports 703 loci for neuroticism in 1M+ participants. ReGPC 2025.
Roberts & DelVecchio (2000): rank-order stability rises from ~0.31 in childhood to ~0.74 by midlife (cumulative continuity). Roberts & DelVecchio 2000. Roberts et al. (2006) documented the maturity principle: mean-level increases in Conscientiousness, Agreeableness, and Emotional Stability with age, especially in young adulthood. Bleidorn et al. 2022 update.
HEXACO (Ashton & Lee): lexical studies in 12+ languages consistently yield six factors, the sixth being Honesty-Humility. H predicts integrity-related criteria incrementally over Big Five. Ashton & Lee 2008.
Temperament: the developmental foundation
Temperament research constitutes a parallel tradition to adult personality, focused on biologically-grounded individual differences emerging in infancy.
Rothbart’s model identifies three overarching dimensions: Surgency/Extraversion (activity, positive affect, approach), Negative Affectivity (fear, anger, sadness, discomfort), and Effortful Control (attentional regulation, inhibitory control, low-intensity pleasure). Effortful Control is particularly important — it is the self-regulatory component of temperament, developing primarily during ages 2–7 as the anterior attention network matures, and is a strong predictor of later externalizing problems, academic success, and conscience development.
Kagan’s Behavioral Inhibition (BI) framework focuses on extreme phenotypes: ~15–20% of infants show high-reactive patterns (vigorous motor activity and distress to novel stimuli at 4 months), and these infants tend to become behaviorally inhibited toddlers who are cautious and avoidant with unfamiliar people and situations. BI maps approximately onto low Surgency + high Negative Affectivity (especially fear). Kagan’s longitudinal studies showed BI is moderately heritable (~50%), associated with higher resting heart rate and amygdala excitability, and predicts elevated risk for social anxiety disorder in adolescence (OR ~2–4). However, ~60% of high-reactive infants do not become clinically anxious adults: biology is a foundation, not a constraint.
Thomas & Chess’s (1977) “goodness of fit” model — later empirically supported — emphasized that temperamental difficulty per se doesn’t predict poor outcomes; the match between child temperament and environmental demands does.
The temperament → personality continuity is increasingly well-documented: infant Surgency maps onto adult Extraversion; infant Negative Affectivity onto Neuroticism; infant Effortful Control onto Conscientiousness. The mapping is imperfect — adult personality includes social-cognitive layers (identity, values, narrative) absent in temperament.
Cross-cultural universality
McCrae & Terracciano (2005): clean Big Five replication in 50 cultures. McCrae & Terracciano 2005. Gurven et al. (2013) challenged this with the Tsimane forager-horticulturalists, where the full Big Five did not robustly emerge. Gurven et al. 2013. Consensus: 3 factors (E, A, C) replicate cross-linguistically; the full Big Five replicates well in Indo-European languages; non-WEIRD samples sometimes show structural deviations.
Dark traits and the D factor
Paulhus & Williams (2002): the Dark Triad (Machiavellianism, narcissism, psychopathy). Buckels, Jones & Paulhus (2013) added everyday sadism. Moshagen, Hilbig & Zettler (2018) proposed the D factor — a general tendency to maximize individual utility while disregarding others — as the common core, mapping strongly onto low Honesty-Humility. Moshagen et al. 2018.
Person-situation debate: resolved
The Mischel (1968) critique — cross-situational consistency rarely exceeds r ≈ 0.30 — was resolved through aggregation, interactionism (Mischel & Shoda’s CAPS model), and Fleeson’s within-person variability framework. The modern consensus: persons, situations, and their interactions all matter.
Personality predicts outcomes as strongly as IQ and SES
Roberts et al. (2007) “The Power of Personality”: meta-analytic comparison shows personality effects on mortality, divorce, and occupational attainment are indistinguishable in magnitude from SES and cognitive ability effects. Roberts et al. 2007. Conscientiousness predicts mortality through health behaviors with large effect size. Bogg & Roberts 2004.
Recent theoretical developments
DeYoung’s Cybernetic Big Five Theory (2015): traits as parameters of a cybernetic goal-pursuit system. DeYoung 2015. Mõttus et al. (2017): “personality nuances” research argues item-level traits capture incremental valid variance below the facet level. Mõttus et al. 2017.
7. Neurodiversity and Psychopathology: Dimensional, Polygenic, Transdiagnostic
The genomic era has produced three conclusions that fundamentally reshape psychiatric nosology: all major psychiatric conditions are highly heritable, hyper-polygenic, and substantially genetically overlapping across diagnostic categories.
Headline findings by disorder
- Schizophrenia: twin h² ~80%; Trubetskoy et al. (2022, Nature): 287 loci; SNP-h² ~24%. Trubetskoy et al. 2022. Environmental risk factors: urban birth (~2× risk), high-potency cannabis (OR ~3.9), migration, obstetric complications.
- Bipolar: twin h² ~70–85%; Mullins et al. (2021): 64 loci; rg(SCZ,BD) ~0.7. Mullins et al. 2021.
- Major Depression: h² ~37%; Howard et al. (2019): 102 loci; SNP-h² ~9%. Howard et al. 2019. Strong rg with neuroticism (~0.7).
- ADHD: twin h² ~74%; Demontis et al. (2023): 27 loci. Demontis et al. 2023. Negative rg with educational attainment and IQ.
- Autism: twin h² ~80%; Grove et al. (2019): 5 common-variant loci plus substantial rare/de novo variants of large effect (CHD8, SCN2A, SYNGAP1). Grove et al. 2019. Common-variant PGS positively correlated with IQ and education; ID-comorbid autism (rare-variant-driven) negatively correlated.
Cross-disorder pleiotropy (with assortative mating caveat)
Brainstorm Consortium (2018, Science): substantial genetic correlations among psychiatric disorders. Cross-Disorder PGC (Lee et al. 2019, Cell): across 8 disorders, 109 pleiotropic loci, three clusters — compulsive, mood/psychotic, early-onset neurodevelopmental. Lee et al. 2019.
Critical caveat: Border et al. (2022, Science) showed that cross-trait assortative mating can generate spurious genetic correlations between phenotypes with entirely distinct genetic bases. Some fraction of reported psychiatric cross-disorder genetic correlations may reflect xAM rather than shared biology. The magnitude of this artifact is actively being quantified and represents a major revision in progress.
The p-factor
Caspi et al. (2014): a single p (general psychopathology) factor fit Dunedin cohort data better than three-factor models — analogous to g for cognitive ability. Caspi et al. 2014. Higher p associated with greater impairment, familiality, worse developmental histories. Replicated in dozens of samples. Interpretations contested: genuine common liability, statistical artifact of bifactor over-extraction, or a reflection of impairment/distress per se.
Dimensional alternatives: HiTOP and RDoC
HiTOP (Kotov et al. 2017): a quantitatively-derived dimensional alternative to DSM organized hierarchically. RDoC (NIMH 2009–): six dimensional neurobiologically-grounded research domains. Both converge with taxometric evidence (most psychopathology is dimensional, not taxonic) on the dimensional turn in psychiatric science.
Polygenic scores in clinics: not yet
Best PGS R² ~7–10% for schizophrenia. PGS alone does not outperform family history. PGS performance drops 50–70% in non-European-ancestry populations — a major equity and portability problem.
The neurodiversity framework: scientific–identity tensions
Coined by Singer (1998), the neurodiversity paradigm reframes autism, ADHD, dyslexia as natural variation rather than pathology. The framework has legitimate ethical force but operates in tension with deficit-oriented findings for severe presentations (profound autism with ID, epilepsy, self-injury). A defensible position recognizes both the reality of impairment at the severe end and the population-level continuous variation that grades into normality.
8. Key Researchers and Labs
| Researcher | Affiliation | Central contribution |
|---|---|---|
| Robert Plomin | King’s College London | Behavioral genetics synthesis; Blueprint; GPS |
| Eric Turkheimer | University of Virginia | Three Laws; Scarr-Rowe; philosophical foundations |
| K. Paige Harden | UT Austin | Genetic Lottery; causal inference with PGS |
| Avshalom Caspi / Terrie Moffitt | Duke / King’s | p-factor; Dunedin cohort; candidate-GxE studies |
| Ian Deary | Edinburgh | Lothian Birth Cohorts; cognitive epidemiology |
| Elliot Tucker-Drob | UT Austin | Education-IQ meta-analysis; Wilson Effect mechanisms |
| Daniel Benjamin / SSGAC | UCLA | EA GWAS consortium; social-science genomics |
| Colin DeYoung | Minnesota | Cybernetic Big Five Theory; personality neuroscience |
| Jay Belsky | UC Davis | Differential susceptibility |
| Marco Del Giudice | UNM | Multivariate sex differences |
| Janet Hyde | Wisconsin | Gender similarities hypothesis |
| David Geary | Missouri | Sex differences in math/STEM |
| Brent Roberts | UIUC | Personality development; maturity principle |
| Alexander Young | UCLA | Genetic nurture; within-family methods |
| Richard Border | Harvard/UCLA | Candidate gene demolition; xAM |
| Peter Hatemi / John Hibbing | Penn State / Nebraska | Genopolitics; heritability of political attitudes |
9. The Integrated Picture: What Generates Psychological Variation
The model
A formal model of individual psychological variation should treat the person as the joint product of:
(a) A hyper-polygenic genome encoding thousands of small-effect predispositions (plus some rare large-effect variants in neurodevelopmental conditions). Twin h² for most traits falls in 0.40–0.80.
(b) Substantial gene-environment correlation through passive (parents transmit genes + correlated environments), evocative (child traits elicit responses), and active (niche-picking) channels. Roughly half of population-level PGS prediction reflects indirect/environmental mediation by genetically-similar parents, not direct genetic causation.
(c) Assortative mating inflating additive genetic variance, genetic correlations between traits, and PGS prediction accuracy. This is a recently-quantified source of systematic bias in nearly all genetic estimates.
(d) A small set of large-effect environmental insults — lead, severe iodine deficiency, heavy prenatal alcohol, severe deprivation — plus schooling (~3.4 IQ points/year). Effects are typically asymmetric: removing severe deficits matters more than enriching above-normal environments.
(e) Substantial stochastic developmental noise — the dominant source of the non-shared environment, which accounts for ~50% of personality variance and is not yet well-characterized mechanistically.
(f) Cultural/institutional contexts that modulate which genetic predispositions are expressed and rewarded (WEIRD effects, gender equality paradox, Scarr-Rowe interaction, Flynn Effect).
(g) Developmental unfolding across time — temperament in infancy (biologically grounded reactivity and regulation) becomes personality in adulthood (adding social-cognitive layers), with heritability increasing across the lifespan (Wilson Effect) and rank-order stability rising to ~0.74 by midlife.
Where political distortion is strongest, by direction
From the environmentalist/blank-slate direction: dismissing twin study validity wholesale; overstating Scarr-Rowe; promoting transgenerational epigenetic narratives that exceed evidence; dismissing IQ as culturally biased despite measurement-invariance findings; overstating stereotype-threat magnitudes; minimizing the gender equality paradox.
From the hereditarian direction: citing within-population heritability to license between-population genetic inferences; citing fringe admixture studies as if mainstream; treating g-loadedness of gaps as evidence of genetic etiology when environmental causes can also be g-loaded; ignoring the assortative mating and genetic nurture corrections to PGS.
From the “gender similarities” direction: selective citation of d ≈ 0.05 for math to imply no differences anywhere; obscuring multivariate D ≈ 2.71 with univariate framing; minimizing d ≈ 0.93 people-things interest differences.
From popular evolutionary psychology: treating dimensional differences as taxonic; extrapolating from small ds to categorical claims; overgeneralizing from specific tasks to broad domain claims.
Open questions worth modeling
- Mechanistic interpretation of PGS: Plomin’s “causal genetic” view vs. Turkheimer’s “weak genetic explanation” — genuinely open.
- Flynn Effect reversal: cause unknown; one of the most important open questions in differential psychology.
- Gender equality paradox mechanism: innate-expression release vs. measurement artifacts vs. wealth confounds — unsettled.
- Between-population cognitive differences: currently scientifically unanswerable (PGS portability too poor; cross-ancestry GWAS at scale don’t exist). Honest position: unresolved, not settled in either direction.
- The causal architecture of g: latent common cause vs. emergent network property (mutualism) — the positive manifold is not in dispute; what generates it is.
- What non-shared environment actually is: stochastic noise, epigenetic variation, immune/microbial variation, differential peer networks — largely uncharacterized despite accounting for ~50% of personality variance.
- Assortative mating correction magnitudes: how much do AM and xAM corrections change the substantive picture of genetic architecture and cross-trait pleiotropy? Active area of revision.
10. Load-Bearing Assumptions and Falsification Conditions
This section makes explicit which conclusions in this review depend on which assumptions, and what evidence would substantially revise or flip them. Ordered roughly by how much of the document’s picture collapses if the assumption fails.
Assumption 1: The twin method provides approximately valid variance decomposition
What depends on it: Nearly all h² estimates in Section 1’s table, the C ≈ 0 finding for adult personality, the Wilson Effect, the Scarr-Rowe interaction.
Status: Approximately valid. SNP-h² estimates (which bypass EEA entirely) give lower but still substantial heritability for every trait measured. MZ-reared-apart designs converge with MZ-reared-together. Felson (2014) estimates ~10–20% EEA-induced inflation, not enough to eliminate the core finding.
What would flip it: SNP-h² for psychological traits systematically converging on <0.05 (would suggest twin h² is mostly EEA artifact). Or: a large, well-powered MZ-reared-apart study finding IQ correlations <0.40 (current estimates ~0.70). Neither has occurred.
Robustness verdict: HIGH. The convergence of twin, adoption, and molecular methods on moderate-to-substantial heritability is the most replicated finding in the field.
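For readers who want the arithmetic behind a twin-based decomposition, Falconer's approximations are the simplest version. This is a sketch under the classical assumptions (additive effects, equal environments, random mating), not the model-fitting procedure used in the cited meta-analyses; the function name `falconer_ace` and the input correlations are illustrative.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximations from MZ and DZ twin correlations."""
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (h²)
    c2 = 2 * r_dz - r_mz    # shared-environment share (C)
    e2 = 1 - r_mz           # non-shared environment plus measurement error (E)
    return {"A": a2, "C": c2, "E": e2}


# Illustrative correlations only, not taken from any study cited here.
print(falconer_ace(r_mz=0.50, r_dz=0.25))
```

An EEA violation would show up here as an MZ correlation that is too high for non-genetic reasons, which inflates A and deflates C. This is the bias the Felson estimate tries to bound.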
Assumption 2: GWAS identifies real genetic signal (not just population structure and AM artifacts)
What depends on it: The entire PGS enterprise, genetic nurture estimates, cross-disorder pleiotropy findings, the “missing heritability” narrative.
Status: Substantially valid but with known inflation. Within-family PGS effects are non-zero for educational attainment (~half of population effects), meaning direct genetic signal exists. But the magnitude of AM and stratification inflation is still being quantified.
What would flip it: Within-family PGS effects for most traits converging on ~zero (would mean population-level PGS prediction is entirely indirect/environmental). Current evidence: within-family effects are reduced but clearly non-zero for EA, BMI, height; less well-characterized for personality and psychiatric traits.
Robustness verdict: MODERATE-HIGH for the existence of direct genetic effects; MODERATE for their precise magnitude, which is actively being revised downward.
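One consequence of the within-family attenuation is worth making explicit with arithmetic. The sketch below assumes that the within-family to population ratio applies to standardized regression coefficients and that PGS variance is unchanged, so variance explained scales with the square of the ratio. The function name `direct_share_of_r2` is hypothetical, and the inputs are the rough figures quoted in this review (ratio of about one half, best EA PGS R² of about 0.16).

```python
def direct_share_of_r2(population_r2: float, coefficient_ratio: float) -> float:
    """Variance attributable to the direct effect, given the population R² and the
    ratio of within-family to population standardized coefficients."""
    if not 0 <= coefficient_ratio <= 1:
        raise ValueError("coefficient_ratio is expected to lie in [0, 1]")
    # R² of a linear predictor scales with the square of its standardized coefficient.
    return population_r2 * coefficient_ratio ** 2


print(round(direct_share_of_r2(0.16, 0.5), 3))
```

Under these assumptions, a coefficient ratio of one half leaves about one quarter of the predicted variance attributable to direct effects.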
Assumption 3: g is a real dimension of individual variation (not a measurement artifact)
What depends on it: The entire intelligence section (Section 5), predictive validity claims, group-difference discussions, the CHC structure.
Status: The positive manifold is among the most replicated findings in psychology. Whether g is a latent common cause or an emergent network property (mutualism) is unsettled, but both interpretations preserve g’s predictive validity and the meaningfulness of individual differences in general cognitive ability.
What would flip it: A sufficiently broad, well-constructed cognitive battery where the first principal component explains <15% of variance (would undermine the positive manifold). Or: successful interventions that consistently raise one cognitive ability while lowering others (would violate the manifold’s structure). Neither has been demonstrated.
Robustness verdict: HIGH for g as a statistical regularity with predictive validity. MODERATE for g as a unitary biological mechanism (mutualism remains a viable alternative).
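The "<15% of variance" falsification threshold can be made concrete by computing the first principal component's share from a correlation matrix. This is a minimal sketch using power iteration in plain Python; the helper names `first_pc_share` and `equicorrelated` are hypothetical, and the equal-correlation battery is a toy assumption rather than real test data.

```python
def first_pc_share(corr: list[list[float]], iters: int = 500) -> float:
    """Share of total variance carried by the first principal component of a
    correlation matrix, using power iteration for the largest eigenvalue.
    Starting from an all-ones vector suits matrices with positive entries."""
    k = len(corr)
    v = [1.0] * k
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
        eigenvalue = norm  # converges to the largest eigenvalue once v has unit length
    return eigenvalue / k  # the trace of a k x k correlation matrix is k


def equicorrelated(k: int, r: float) -> list[list[float]]:
    """Toy battery: k tests that all intercorrelate at r."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]


print(round(first_pc_share(equicorrelated(10, 0.30)), 2))
```

In this toy case the share equals (1 + (k − 1)·r) / k, so ten tests intercorrelating at 0.30 give 37%. For a ten-test battery the share falls below 15% only when the average correlation drops to roughly 0.05.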
Assumption 4: Sex-difference effect sizes from meta-analyses are not primarily measurement artifacts
What depends on it: The gender equality paradox, the claim that interest differences (d = 0.93) are among psychology’s largest, the multivariate personality finding (D = 2.71).
Status: Interest measures (Su et al. 2009) use well-validated instruments; the d = 0.93 holds across inventories and cultures. The Del Giudice multivariate D is sensitive to the number of variables included and the specific battery, though the qualitative finding (large multivariate difference despite moderate univariate ds) is robust across datasets. CAH and non-human primate evidence provides independent convergent support for biological mediation of interest differences.
What would flip it: A large cross-cultural study using behavioral (not self-report) interest measures finding d < 0.30 for people-things. Or: evidence that the gender equality paradox disappears when using non-self-report personality measures (reference-group effects could inflate self-report differences in egalitarian countries). Current evidence: Falk & Hermle (2018) used incentivized behavioral measures for some preferences and found the paradox held, but full behavioral replication across all domains is incomplete.
Robustness verdict: HIGH for the existence of substantial sex differences in interests and aggression. MODERATE for the precise magnitude of multivariate personality differences. MODERATE for the gender equality paradox’s causal interpretation.
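Because univariate d values and the multivariate D are easy to misread, it helps to translate them into distribution overlap. The sketch below assumes equal-variance normal distributions (multivariate normal with a common covariance matrix for D) and shows two overlap conventions: the plain overlapping coefficient, and overlap as a share of the area covered by both distributions together. Which convention a given paper reports should be checked against that paper. The function names are hypothetical.

```python
from statistics import NormalDist


def overlap_coefficient(d: float) -> float:
    """Overlapping area of two equal-variance normal distributions whose
    means differ by d standard deviations."""
    return 2 * NormalDist().cdf(-abs(d) / 2)


def joint_overlap(d: float) -> float:
    """Overlap as a share of the total area covered by both distributions."""
    ovl = overlap_coefficient(d)
    return ovl / (2 - ovl)


for label, d in [
    ("math performance", 0.05),
    ("people-things interests", 0.93),
    ("multivariate Big Five", 2.71),
]:
    print(f"{label}: OVL={overlap_coefficient(d):.2f}, joint={joint_overlap(d):.2f}")
```

For D = 2.71 the joint convention gives roughly 10% overlap, matching the ~10% figure quoted for that estimate in the node catalog, while the plain overlapping coefficient gives about 18%. The choice of convention therefore changes the headline number even when D itself is not in dispute.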
Assumption 5: The candidate-GxE collapse generalizes — specific gene × environment interactions are mostly small or nonexistent
What depends on it: Section 3’s dismissal of 5-HTTLPR and similar findings, the shift toward polygenic approaches.
Status: For candidate genes, the collapse is definitive (Border et al. 2019). But this does not necessarily mean polygenic-score × environment interactions are also null. PGS × environment work is younger, uses better methods, and could in principle yield robust results.
What would flip it: Multiple large, pre-registered PGS × measured-environment studies showing robust, replicable interactions explaining >5% of variance. Current evidence: a few suggestive findings (PGS-for-education × compulsory schooling reforms) but nothing approaching the scale or replication needed for confidence.
Robustness verdict: HIGH for the candidate-gene collapse. LOW-MODERATE confidence in the broader claim that specific GxE interactions are generally small — this is an extrapolation from the candidate-gene failure, and the polygenic GxE literature is too young to draw strong conclusions.
Assumption 6: Cross-disorder genetic correlations reflect shared biology (pleiotropy)
What depends on it: The p-factor interpretation, HiTOP structure, the “dimensional turn” in psychiatry, transdiagnostic treatment rationales.
Status: Substantially challenged by Border et al. (2022). Cross-trait assortative mating can generate spurious genetic correlations between traits with entirely distinct genetic bases. The R² = 74% finding means most of the variance in genetic correlation estimates tracks spousal phenotypic correlations — though this does not prove all genetic correlations are spurious (some genuine pleiotropy surely exists).
What would flip it: Within-family designs showing that cross-disorder genetic correlations survive AM correction at >50% of current estimates. Or: identification of specific shared biological pathways (e.g., synaptic pruning variants affecting both SCZ and BD) that don’t depend on LD induced by AM.
Robustness verdict: MODERATE. The dimensional/transdiagnostic pattern is likely real but inflated. The magnitude of genuine pleiotropy vs. AM artifact is one of the field’s most active methodological debates.
11. Toward Topology: Structure for the Next Phase
This section identifies the natural graph/network structure embedded in this literature, to facilitate the transition from landscape analysis to formal topology mapping.
Natural node types
- Trait nodes: Cognitive abilities (g, Gf, Gc, Gv, Gs…), personality dimensions (Big Five/HEXACO factors and facets), temperament dimensions (Surgency, Negative Affectivity, Effortful Control), psychopathology spectra (internalizing, externalizing, thought disorder), interests (people-things, RIASEC), political/moral attitudes
- Mechanism nodes: Genetic architecture (common polygenic, rare large-effect, de novo), environmental factors (lead, iodine, schooling, deprivation, neighborhoods), developmental processes (critical periods, niche-picking, genetic nurture, AM), stochastic noise
- Method nodes: Twin studies, adoption studies, GWAS, PGS, within-family designs, Mendelian randomization, meta-analysis
- Population-level modifier nodes: SES (Scarr-Rowe), culture (WEIRD), gender equality index, historical period (Flynn Effect)
Natural edge types
- Genetic correlations (with AM caveat): e.g., rg(SCZ, BD) ≈ 0.7; rg(EA, IQ) ≈ 0.7; rg(neuroticism, MDD) ≈ 0.7
- Developmental continuity: temperament → personality (Surgency → Extraversion; Effortful Control → Conscientiousness)
- Causal environmental effects: lead → IQ (−6.2 pts per 10 µg/dL); schooling → IQ (+3.4 pts/year)
- Predictive validity edges: g → job performance (r ≈ 0.42); Conscientiousness → mortality; EA PGS → income
- Methodological dependency: twin h² → SNP-h² → PGS R² (each constraining the next)
- Taxonomic hierarchy: g → broad abilities → narrow abilities (CHC); p → spectra → subfactors → syndromes (HiTOP)
- Moderation edges: SES × heritability (Scarr-Rowe); gender equality × sex differences (GEP); age × heritability (Wilson Effect)
Key structural features for the graph
- Two parallel hierarchies (CHC for cognition, HiTOP for psychopathology) that share genetic correlations at the top level (g correlates with p inversely)
- A developmental cascade from temperament (infancy) through personality (adulthood) through outcomes (mortality, income, relationships), with heritability increasing and shared-environment decreasing across the lifespan
- A methodological funnel from twin estimates (broadest, highest h²) through molecular estimates (narrower, lower h²) through within-family estimates (narrowest, lowest but most causally clean)
- Cross-domain genetic correlations that form a web connecting cognition, personality, and psychopathology — but with the critical caveat that an unknown fraction may be AM artifact rather than biological pleiotropy
Highest-leverage next steps for topology phase
- Build the trait correlation matrix: Assemble published genetic correlations (from LD Score regression / GWAS) among the ~20–30 most well-characterized traits spanning cognition, personality, and psychopathology. Annotate each with AM-corrected estimates where available. This matrix is the empirical backbone of the topology.
- Map the developmental cascade: Create a directed graph from temperament → personality → outcomes with age-indexed heritability and stability coefficients as edge weights. This captures the time dimension that a static correlation matrix misses.
- Formalize the variance decomposition: For each major trait, create a standardized decomposition: [direct genetic] + [genetic nurture/indirect] + [AM-induced] + [shared environment] + [measured non-shared environment] + [stochastic residual]. Where values are unknown, flag them explicitly. This is the generating function skeleton that the formalization phase will flesh out.
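The standardized decomposition in the last step could be recorded with a small data structure that keeps unknown terms explicit instead of silently treating them as zero. This is a sketch of one possible representation, not a settled schema; the class and field names are hypothetical, and the numbers in the example are placeholders.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class VarianceDecomposition:
    """Shares of phenotypic variance for one trait; None marks an unknown value."""
    trait: str
    direct_genetic: Optional[float] = None
    genetic_nurture: Optional[float] = None
    am_induced: Optional[float] = None
    shared_environment: Optional[float] = None
    measured_nonshared: Optional[float] = None
    stochastic_residual: Optional[float] = None

    def _components(self) -> dict:
        return {f.name: getattr(self, f.name) for f in fields(self) if f.name != "trait"}

    def unknown_terms(self) -> List[str]:
        """Names of the terms that still need to be flagged as unknown."""
        return [name for name, value in self._components().items() if value is None]

    def known_total(self) -> float:
        """Sum of the shares that have been filled in so far."""
        return sum(value for value in self._components().values() if value is not None)


# Placeholder numbers for illustration only.
example = VarianceDecomposition("hypothetical trait", shared_environment=0.0, measured_nonshared=0.1)
print(example.unknown_terms())
print(example.known_total())
```

Keeping `None` distinct from `0.0` matters here: a shared-environment share of zero is a finding, while a missing AM-induced share is a gap in the literature.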
Dependency graph of the lit review. Three categories of high-stakes node (foundational cruxes / reframer nodes / logical guardrails) plus weakest links, four variant views, three Stage-3 options, an objections section, and a glossary. Updated through 2024-2025 literature on AM correction, within-family GWAS at scale, PGS portability, Scarr-Rowe collapse, GEP replication, and missing-heritability closure.
TLDR
The lit review documents what the science says about psychological variation. This topology asks a sharper question: what depends on what? Strip the field down to its load-bearing structure and the picture is surprisingly clean. Three foundational assumptions — that twin/adoption methods give approximately valid variance decomposition (A1), that GWAS signal reflects real genetic effects rather than population structure or assortative-mating artifact (A2), and that a general factor of cognitive ability g exists as a real dimension of individual variation (A3) — sit upstream of most of the empirical and synthesis nodes in the graph; if any one of them flipped, large regions would have to be rebuilt. Everything else is either an empirical claim resting on these foundations, a methodological prerequisite that lets the foundations be tested, a logical necessity that constrains how the empirical claims can be interpreted, or a generating mechanism that explains why the empirical pattern looks the way it does.
The high-stakes nodes split into three categories — keeping them separate is the single most useful conceptual move in this topology. Foundational cruxes (A1 twin validity, A2 GWAS signal real, A3 g exists) are the assumptions that, if falsified, force rebuilding regions of the picture. Reframer nodes (G2/E6 passive rGE / genetic nurture; G6/E7 cross-trait assortative mating) don’t break the picture if reversed — they change what it means; their magnitudes are being actively quantified and their precise share of population-level “genetic” effects is the field’s most consequential open quantity. Logical guardrails (L1 variance-ratio definition; L4 Lewontin firewall) cannot be falsified — they can only be ignored, which is exactly how most public-discourse misuse of the field proceeds. Conflating these three types under a single label of “important findings” is a major source of bad-faith debate.
The field’s weakest links are not where public discourse focuses heat. Mainstream contests over “is heritability real” target settled findings (A1+E1 are robust); the 2025 whole-genome-sequencing work (Wainschtein et al. 2025, Nature) now recovers ~88% of pedigree-based narrow-sense heritability, so the “missing heritability” critique is also substantially answered. The actual fragile zones in 2026 are: (a) the generalization from candidate-GxE failure to all-GxE-is-small, where the null is partially holding (Allegrini 2020 for education, a 2025 systematic review for depression) but the literature is still too young for confidence; (b) Scarr-Rowe, which has weakened further: Ghirardi et al. 2024 found 39/42 PGI×SES interactions in the opposite (compensatory) direction, so “deprivation suppresses heritability” is now evidence-thin; (c) the polygenic-score → mechanism inference (Plomin’s “causal” view vs. Turkheimer’s “weak explanation” view), which remains genuinely undecided; (d) the magnitude of AM correction across psychiatric cross-disorder rg estimates, which is now being actively addressed: Ma, Wang, Border et al. 2024 (LAVA-Knock) is the first method to systematically reduce xAM-induced bias, with the field-wide answer likely in 2–3 years. The Flynn-reversal cause and the Gender Equality Paradox mechanism remain open mechanistic questions, but they are open in a different way: the empirical patterns themselves are robust; only the explanation is contested. The 2025 GEP systematic review (Herlitz et al.) strengthened the pattern across personality, verbal abilities, episodic memory, and negative emotions.
This topology is the input to model formalization (Stage 3). The cleanest formalization target is the variance decomposition equation: V(trait) = V(direct genetic) + V(genetic nurture / indirect) + V(AM-induced LD) + V(shared environment) + V(measured non-shared) + V(stochastic) + 2·Cov(genes, environment) + interaction terms — with each term parameterized by trait, age, and population context, and with the AM and rGE terms being where current methodological revision is concentrated. The four variant views below (Vulnerability / Flow / Minimal / Politicization) read the same graph through different lenses to make the formalization choices easier.
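Written out in symbols, the decomposition above might look like the following. The notation is an assumption introduced here for readability (the source states the equation only in words): $t$, $a$, and $p$ index trait, age, and population context, and $V_{G\times E}$ stands in for all interaction terms.

```latex
V_P(t,a,p) \;=\; V_{G_d} + V_{G_n} + V_{AM} + V_C + V_{E_m} + V_{\varepsilon}
          \;+\; 2\,\mathrm{Cov}(G,E) \;+\; V_{G\times E}
```

Here $V_{G_d}$ is direct genetic variance, $V_{G_n}$ is genetic nurture (indirect) variance, $V_{AM}$ is AM-induced LD variance, $V_C$ is shared environment, $V_{E_m}$ is measured non-shared environment, and $V_{\varepsilon}$ is the stochastic term. Every term on the right is itself a function of $(t,a,p)$.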
The graph
All ~50 nodes and their dependencies.
Click a node for its claim and load-bearing weight; hover an edge for the relation type; drag to rearrange. The variant toggles read the same graph through different lenses (vulnerability, flow, minimal claim set, politicization).
How to read this graph
Every node in the lit review collapses to one of eight types. Edges between them carry one of seven relations. Together they make the structure inspectable.
Node types
| Code | Type | What it is |
|---|---|---|
| A | Foundational assumption | A claim the field cannot operate without; if false, large downstream regions collapse |
| M | Methodological prerequisite | A study design or estimation tool that must work for the empirical claims to be testable |
| E | Empirical claim | A specific measured finding with an effect size and replication status |
| L | Logical necessity | Follows from definitions or algebra; not empirically refutable |
| G | Generating mechanism | A causal process that explains a pattern (rGE, AM, niche-picking, critical periods) |
| S | Synthesis claim | An integrative statement combining multiple lower-level claims |
| O | Open question | Genuinely undecided with current methods or evidence |
| D | Distortion vector | Where motivated reasoning concentrates (typed by direction) |
Edge types
| Code | Edge | Meaning |
|---|---|---|
| dep | depends-on | If target collapses, source collapses |
| imp | implies | Logical implication |
| sup | empirically-supports | Evidence relation |
| conf | confounds / inflates | Artifact relationship (e.g., AM inflates rg) |
| mod | moderates | Changes magnitude (e.g., SES × heritability) |
| dev | develops-into | Temporal/developmental successor (temperament → personality) |
| corr | corrects | Within-family corrects between-family bias |
Weight scale (load-bearing weight, 1–5)
- 5 — crux node; collapse propagates across multiple sections of the lit review
- 4 — load-bearing within a section
- 3 — important but local
- 2 — corroborating
- 1 — decorative; could be removed without changing the picture
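The dep relation ("if target collapses, source collapses") can be traversed mechanically. This is a minimal sketch, assuming the graph is stored as (source, target) pairs. The edge list is only an illustrative subset inferred from the "what depends on it" lists in Section 10 of the lit review, and the function name `collapsed_by` is hypothetical.

```python
from collections import defaultdict

# Each dep edge is (source, target): if the target collapses, the source collapses.
# Illustrative subset only; the full graph has ~50 nodes and seven edge types.
DEP_EDGES = [
    ("E1", "A1"), ("E2", "A1"), ("E3", "A1"), ("E25", "A1"),
    ("E6", "A2"), ("E16", "A2"),
    ("E17", "E16"),
]


def collapsed_by(node: str, dep_edges) -> set:
    """Every node that falls if `node` collapses, following dep edges transitively."""
    dependents = defaultdict(list)
    for source, target in dep_edges:
        dependents[target].append(source)
    fallen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for source in dependents[current]:
            if source not in fallen:
                fallen.add(source)
                stack.append(source)
    return fallen


print(sorted(collapsed_by("A2", DEP_EDGES)))
```

Counting the size of each node's collapse set is one simple way to sanity-check the hand-assigned 1–5 weights: a node rated 5 should take out more of the graph than a node rated 2.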
1. Node catalog
Each node carries: type code · weight · short claim · key citation · status. Status flags: ✓ (robust/replicated), ~ (partial/qualified), ? (contested), ✗ (refuted, kept as historical reference).
A — Foundational assumptions
| ID | Wt | Claim | Status |
|---|---|---|---|
| A1 | 5 | Twin/adoption methods provide approximately valid variance decomposition (EEA modestly violated but not fatally) | ✓ |
| A2 | 5 | GWAS signal reflects real genetic effects, not (only) population stratification or AM artifact | ✓ partial |
| A3 | 5 | A general factor of cognitive ability g is a real dimension of individual variation (positive manifold) | ✓ statistical / ? mechanism |
| A4 | 3 | Heritability findings apply to the population sampled, not to individuals or other populations (scope) | ✓ |
| A5 | 3 | Phenotypes are reliably and validly measurable across cultures and time | ~ |
| A6 | 3 | Most psychological variation is dimensional, not taxonic | ✓ |
M — Methodological prerequisites
| ID | Wt | Tool | Notes |
|---|---|---|---|
| M1 | 4 | Twin studies (MZ/DZ) at scale | Polderman 2015 meta: 14.5M pairs |
| M2 | 4 | Adoption studies, especially cross-cultural (Korean-American) | Sacerdote 2007; Beauchamp 2023 |
| M3 | 5 | GWAS at N ≥ 100k (ideally ≥ 1M for personality/EA) | Okbay 2022 (3M for EA) |
| M4 | 5 | Within-family designs (sibling FE, MZ-discordant, parent-offspring trios) | ✓ Kong 2018; Okbay 2022 (EA, N=3M). Howe et al. 2022 (Nature Genetics) extended this to 178k siblings × 25 phenotypes: within-sibship estimates were systematically smaller than population estimates for height, EA, cognitive ability, depressive symptoms, and smoking. The within-family approach is now mature beyond educational attainment |
| M5 | 4 | Polygenic scores (PGS) | Best R² ~0.16 for EA, ~0.10 for SCZ |
| M6 | 3 | Cross-trait LD-score regression for genetic correlations | Brainstorm 2018 |
| M7 | 3 | Mendelian randomization | For causal inference from observational data |
| M8 | 3 | Pre-registration & collaborative meta-analysis | Demolished candidate-GxE |
| M9 | 4 | Whole-genome sequencing (rare-variant capture) | ✓ 2025 follow-up (UK Biobank ~500k, Wainschtein et al. 2025, Nature) captures ~88% of pedigree-based narrow-sense heritability across many traits (20% rare + 68% common variants). The “missing heritability” problem is now substantially resolved for many phenotypes |
E — Empirical claims
Cognition / IQ:
| ID | Wt | Claim | Status |
|---|---|---|---|
| E1 | 5 | Mean trait heritability ≈ 0.49 across 17,804 traits (Polderman 2015) | ✓ |
| E2 | 5 | Shared environment C ≈ 0 for adult personality and most adult cognition | ✓ with exceptions (EA, religiosity, politics) |
| E3 | 4 | Wilson Effect: IQ heritability rises from ~0.20 (age 5) to ~0.80 (adulthood) | ✓ |
| E4 | 5 | Hyper-polygenic architecture: thousands of small-effect variants | ✓ (Turkheimer’s 4th Law, Chabris 2015) |
| E5 | 4 | Candidate-gene approach for psychiatric/personality traits failed (5-HTTLPR etc.) | ✗ original claims; ✓ collapse finding |
| E6 | 5 | Within-family PGS effects are ~½ population-level effects (genetic nurture) | ✓ for EA, BMI, height |
| E7 | 5 | Cross-trait assortative mating explains R²≈74% of variance in genetic correlation estimates | ✓ (Border 2022, Science) |
| E8 | 4 | Lead exposure 1–10 µg/dL → −6.2 IQ pts (Lanphear 2005) | ✓ |
| E9 | 4 | Each year of schooling adds ~3.4 IQ points, persisting into old age | ✓ (Ritchie & Tucker-Drob 2018) |
| E22 | 4 | Within-population heritability does not license between-population inference | ✓ logical |
| E23 | 4 | PGS prediction accuracy decays continuously along the genetic-distance continuum from training population (Pearson r = −0.95 between genetic distance and PGS accuracy across 84 traits, Ding et al. 2023, Nature). Reframes the older “discrete ancestry-group drop” picture (Martin 2019; Mostafavi 2020) | ✓ |
| E24 | 3 | Flynn Effect and its post-1990s reversal are both environmentally driven (within-sibship evidence) | ✓ pattern; ? cause |
| E25 | 2 | Scarr-Rowe: SES × heritability hypothesis (more genetic expression in higher-SES). Weakening further in 2024 — Ghirardi et al. 2024 found 39/42 PGI×SES interactions in education NEGATIVE (compensatory direction); only 1 significant positive. Pattern is now closer to “compensatory hypothesis holds, Scarr-Rowe fails” than to “context-dependent” | ✗ |
| E18 | 4 | Positive manifold: every cognitive test correlates positively with every other | ✓ |
| E26 | 3 | Childhood IQ → all-cause mortality: each 1-SD ≈ 24% lower mortality | ✓ (Calvin 2011) |
| E27 | 3 | Lifespan IQ stability: Lothian Birth Cohort age-11 → age-90 r ≈ 0.67 | ✓ |
| E28 | 3 | Severe iodine/alcohol/deprivation cause large asymmetric IQ effects | ✓ |
Personality / temperament:
| ID | Wt | Claim | Status |
|---|---|---|---|
| E29 | 4 | Big Five h² ≈ 0.40–0.60; cross-cultural replication for E/A/C | ✓ partial (Tsimane qualifier) |
| E30 | 4 | Cumulative continuity: rank-order stability rises to ~0.74 by midlife | ✓ |
| E31 | 4 | Maturity principle: mean-level ↑ in C, A, ES with age | ✓ |
| E32 | 3 | Temperament dimensions (Surgency, Negative Affectivity, Effortful Control) → adult personality | ✓ |
| E33 | 3 | Personality predicts mortality/divorce/income at magnitudes ≈ SES & cognition | ✓ (Roberts 2007) |
Sex differences:
| ID | Wt | Claim | Status |
|---|---|---|---|
| E10 | 4 | Multivariate Big Five sex difference: D ≈ 2.71 (~10% overlap) | ~ method-sensitive |
| E11 | 4 | People-things interest difference d ≈ 0.93 (largest in psychology) | ✓ |
| E12 | 3 | Mental rotation d ≈ 0.56–0.73 male advantage | ✓ |
| E13 | 3 | Math performance d ≈ 0.05–0.10 (essentially equal) | ✓ |
| E14 | 4 | Gender Equality Paradox: differences larger in egalitarian/wealthier countries. Herlitz et al. 2025 systematic review (54 articles, 27 meta-analyses) confirmed the pattern across personality, verbal abilities, episodic memory, and negative emotions — pattern replication has strengthened, not weakened | ✓ pattern; ? mechanism |
| E34 | 3 | Physical aggression d ≈ 0.40–0.60 male; ~95% of homicides male | ✓ |
| E35 | 2 | CAH girls show masculinized toy preferences; primate parallels | ✓ |
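The D ≈ 2.71 and ~10% overlap figures in E10 can be related to each other with a short calculation. This is a minimal sketch under an assumption the table does not state: both groups are multivariate normal with a shared covariance matrix. The function names are illustrative. It compares two common overlap definitions, since the number reported depends on which one is used.

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def overlap_coefficient(D: float) -> float:
    """Shared area under two equal-covariance normal densities separated by D."""
    return 2.0 * normal_cdf(-D / 2.0)

def joint_overlap(D: float) -> float:
    """Overlapping share of the combined distribution: OVL / (2 - OVL)."""
    ovl = overlap_coefficient(D)
    return ovl / (2.0 - ovl)

D = 2.71  # value quoted in E10
print(f"overlap coefficient = {overlap_coefficient(D):.3f}")
print(f"joint overlap       = {joint_overlap(D):.3f}")
```

Under these assumptions the overlap coefficient comes out near 0.18 and the joint-distribution overlap near 0.10, so the ~10% in the table appears to correspond to the second definition. That is one concrete way the headline number is sensitive to method.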
Psychopathology:
| ID | Wt | Claim | Status |
|---|---|---|---|
| E15 | 4 | All major psychiatric disorders highly heritable (h² 0.35–0.85) and hyper-polygenic | ✓ |
| E16 | 4 | Cross-disorder genetic correlations exist | ✓ existence; ? magnitude post-AM |
| E17 | 3 | A p factor (general psychopathology) fits cross-syndrome data | ✓ statistical; ? interpretation |
| E36 | 3 | Autism: common-PGS positively correlated with IQ; rare/de-novo drives ID-comorbid cases | ✓ |
| E37 | 2 | Critical-period plasticity (GABAergic, perineuronal nets) is mechanistically real | ✓ |
L — Logical necessities
| ID | Wt | Claim |
|---|---|---|
| L1 | 5 | Heritability is a population variance ratio; it does not partition individual phenotypes (mathematical form — A4 is the scope-of-claim sibling) |
| L2 | 4 | h² changes with environmental variance: hold genetic variance constant, equalize environments → h² → 1 |
| L3 | 5 | Within-family designs control for between-family confounds (rGE, stratification, AM) |
| L4 | 5 | Within-population heritability provides no information about between-population mean differences (Lewontin). E22 in the empirical column is the applied form of this same point |
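| L5 | 3 | Multivariate D ≥ the largest absolute univariate d for any valid correlation structure; how far D exceeds it depends on how the differences line up with the correlations among dimensions |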
| L6 | 3 | Positive manifold permits both unitary-cause and emergent-network interpretations of g |
| L7 | 3 | Effect-size interpretation is scale-dependent (d=0.10 trivial in trait psychology, large in clinical) |
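L1 and L2 can be made concrete with the simplest possible model, h² = V_G / (V_G + V_E). This is a sketch only: it ignores the covariance and interaction terms that the full decomposition carries, and the variances are hypothetical numbers chosen for illustration.

```python
def heritability(v_genetic: float, v_environment: float) -> float:
    """Population variance ratio h² = V_G / (V_G + V_E)."""
    total = v_genetic + v_environment
    if total <= 0:
        raise ValueError("total phenotypic variance must be positive")
    return v_genetic / total

V_G = 1.0  # hypothetical genetic variance, held fixed throughout
for v_e in (3.0, 1.0, 0.25, 0.0):  # progressively equalized environments
    print(f"V_E = {v_e:4.2f} -> h² = {heritability(V_G, v_e):.2f}")
```

Nothing about any individual's genome changes across the four rows. Only the population's environmental variance shrinks, and the ratio climbs to 1 as a result.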
G — Generating mechanisms
| ID | Wt | Mechanism | Drives |
|---|---|---|---|
| G1 | 4 | Active rGE / niche-picking | E3 (Wilson Effect amplification) |
| G2 | 5 | Passive rGE. Wang 2021 / Isungset 2022 confirm indirect ≈ ½ direct genetic effect for EA. Nivard et al. 2024 found indirect genetic effects on offspring achievement extend beyond the nuclear family — dynastic / extended-family / community processes contribute, so the “parents transmit gene + correlated environment” framing understates the spread | E6 (genetic nurture); inflation of population-level h² |
| G3 | 3 | Evocative rGE | Heritability of “environments” (Kendler & Baker 2007) |
| G4 | 3 | Critical-period plasticity (GABAergic maturation) | Asymmetric environmental effects on early development |
| G5 | 4 | Assortative mating → LD induction | Inflates additive genetic variance, h² |
| G6 | 5 | Cross-trait AM → spurious genetic correlations | Confounds E16, E17 (p-factor) |
| G7 | 4 | Stochastic developmental noise | Dominant source of non-shared environment |
| G8 | 3 | Selection / niche construction across the lifespan | Bridges temperament → personality → outcome cascade |
S — Synthesis claims
| ID | Wt | Claim |
|---|---|---|
| S1 | 5 | “Genes vs. environment” is the wrong frame; the system is tightly coupled (genome × rGE × AM × few large environmental insults × stochastic noise × culture × developmental unfolding) |
| S2 | 5 | Twin h² ≥ SNP h² ≥ within-family h² gradient quantifies AM/rGE/measurement inflation across estimation methods |
| S3 | 4 | Heritability ≠ destiny; high h² is compatible with large environmental shifts (height: h² ≈ 0.80, +10cm in a century) |
| S4 | 4 | Most “non-shared environment” is stochastic, not systematic — it accounts for ~50% of personality variance and is poorly characterized |
| S5 | 4 | Two parallel hierarchies (CHC for cognition, HiTOP for psychopathology) connected at the top by inverse g↔p genetic correlation |
| S6 | 4 | Developmental cascade: temperament (infant biological reactivity) → personality (adult social-cognitive layer added) → outcomes (mortality, attainment, relationships) with h² ↑ and shared-env ↓ across the lifespan |
O — Open questions
| ID | Wt | Question | Why it matters |
|---|---|---|---|
| O1 | 5 | Mechanistic interpretation of PGS: “causal genetic” (Plomin) vs. “weak explanation” (Turkheimer) | Determines what PGS prediction means |
| O2 | 3 | Cause of Flynn-effect reversal post-1990s | Empirical pattern robust (Bratsberg & Rogeberg 2018 within-sibship Norway). Mechanism still unsettled. Pietschnig et al. 2024 (Vienna 2005–2018 cohort) added a wrinkle: the positive manifold itself may be weakening — gains in some abilities aren’t tracking gains in others, suggesting the g-loading of the rise/fall is not constant. Hypothesized mechanisms (screens, reduced long-form reading, attention) circulate without empirical pinning |
| O3 | 4 | Causal mechanism behind Gender Equality Paradox | Innate-expression release vs. measurement artifact vs. confound — selection of explanation has political stakes |
| O4 | 5 | Between-population mean differences: any genetic component? | Currently scientifically unanswerable with available methods (PGS portability too poor, cross-ancestry GWAS at scale don’t exist). Honest position: unresolved, not settled in either direction |
| O5 | 3 | g architecture: latent common cause vs. emergent network (mutualism, van der Maas 2006) | Affects how interventions could in principle move g |
| O6 | 4 | What “non-shared environment” actually is: stochastic noise, immune/microbial, peer networks, epigenetic, measurement error | Largest unmodeled variance component in personality |
| O7 | 5 | Magnitude of AM-correction across the cross-disorder genetic correlation matrix | Active revision. Ma, Wang, Border et al. 2024 AJHG introduced LAVA-Knock — a local-genetic-correlation method that reduces xAM-induced bias. Methods to give the answer are now emerging, not just to flag the problem |
D — Distortion vectors (where motivated reasoning concentrates)
| ID | Direction | Targets | Failure mode |
|---|---|---|---|
| D1 | Blank-slate / environmentalist | A1, E1, E10–E14 | Dismiss twin studies wholesale; oversell transgenerational epigenetics; overstate stereotype threat; minimize sex differences via univariate-only framing |
| D2 | Hereditarian | L4, E22, E23, O4 | Ignore Lewontin; treat g-loadedness of gaps as evidence of genetic etiology; cite fringe admixture studies; ignore AM/rGE corrections to PGS |
| D3 | “Gender similarities” minimization | E10, E11, E14 | Selective citation (math d=0.05) to imply no differences anywhere; obscure D=2.71 multivariate; minimize d=0.93 interest gap |
| D4 | Pop evpsych overgeneralization | E10–E14, A6 | Treat dimensional ds as taxonic; extrapolate small ds to categorical claims; overgeneralize from specific tasks to broad-domain claims |
2. Dependency cascade
The cascade reads from foundations up to synthesis, and from corrections back down to corrected claims.
Forward cascade (foundations → empirical claims → synthesis)
A1 ──dep──> M1, M2 ──sup──> E1, E2, E3, E25, E29
A2 ──dep──> M3, M5 ──sup──> E4, E6, E7, E15–E17, E22–E23
A3 ──dep──> E18 ──sup──> E26, E27 ──imp──> S5
A4 (scope) + L1 (form) ──guards──> interpretation of E1, E2, E3 and S3
A6 ──imp──> S5 (dimensional turn in psychiatry)
M9 ──corr──> E1 (closes missing-heritability gap)
M3 + M4 ──sup──> E6 (genetic nurture), E7 (xAM)
E5 (candidate-gene collapse) ──imp──> E4 (polygenic architecture confirmed by absence of large hits)
E1 + E2 + E3 + G1 ──imp──> S6 (developmental cascade)
E1 + E4 + G2 + G5 ──imp──> S2 (h² gradient by method)
E10 + E11 + E12 + E13 + L5 ──imp──> "small univariate, large multivariate" sex-difference picture
E14 + O3 ──imp──> mechanism-pending GEP
E15 + E16 + E17 ──imp──> S5 (HiTOP/p)
E22 + E23 + L4 ──imp──> O4 (between-pop unanswerable currently)
E1 + E2 + E4 + E6 + E7 + E8 + E9 + G1–G7 ──imp──> S1, S2 (integrated picture)
Backward / corrective cascade (newer evidence revises older claims)
G2 (passive rGE) ──corr──> E1 estimates (population-level overstates direct genetic)
G6 (cross-trait AM) ──corr──> E16 (some psychiatric rg's may be xAM artifact)
M4 (within-family) ──corr──> E6 magnitude (~½ of population PGS)
M8 (preregistration) ──corr──> E5 (collapsed candidate-GxE)
M9 (WGS) ──corr──> "missing heritability" interpretation
Distortion → target edges
D1 ──attacks──> A1, E1, E10, E11, E12, E14
D2 ──attacks──> L4, E23 (ignores), exploits A2 absent corrections from G2/G6
D3 ──attacks──> E10, E11 (selective univariate framing)
D4 ──attacks──> A6, L7
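The cascade can be held as a plain adjacency map, which makes "how far does collapse propagate" a reachability query. A minimal sketch, assuming a hand-copied subset of the forward edges above and treating the typed edges (dep, sup, imp) as a single untyped "propagates to" relation. That flattening is a simplification of mine, not part of the topology.

```python
from collections import deque

def add_edges(graph: dict, sources: list, targets: list) -> None:
    """Add an edge from every source to every target."""
    for src in sources:
        graph.setdefault(src, set()).update(targets)

GRAPH: dict = {}
# Foundations -> methods -> empirical claims (subset of the forward cascade).
add_edges(GRAPH, ["A1"], ["M1", "M2"])
add_edges(GRAPH, ["M1", "M2"], ["E1", "E2", "E3", "E25", "E29"])
add_edges(GRAPH, ["A2"], ["M3", "M5"])
add_edges(GRAPH, ["M3", "M5"], ["E4", "E6", "E7", "E15", "E16", "E17", "E22", "E23"])
add_edges(GRAPH, ["A3"], ["E18"])
add_edges(GRAPH, ["E18"], ["E26", "E27"])
add_edges(GRAPH, ["E26", "E27"], ["S5"])
# Empirical claims -> synthesis claims and open questions.
add_edges(GRAPH, ["E1", "E2", "E3", "G1"], ["S6"])
add_edges(GRAPH, ["E1", "E4", "G2", "G5"], ["S2"])
add_edges(GRAPH, ["E15", "E16", "E17"], ["S5"])
add_edges(GRAPH, ["E22", "E23", "L4"], ["O4"])

def downstream(graph: dict, start: str) -> set:
    """All nodes reachable from `start` by breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

for root in ("A1", "A2", "A3"):
    reach = downstream(GRAPH, root)
    print(root, len(reach), sorted(reach))
```

On this subset A2 reaches the most nodes and A3 the fewest. The counts are a property of the edges copied here and would shift if the corrective and distortion edges were added.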
3. Where pressure concentrates
A common failure mode in this literature is to treat all high-stakes nodes as the same type of thing. They are not. The graph has three distinct categories of high-stakes node and one category of fragile claim — keeping these separate sharpens what the field actually needs to resolve.
3a. Foundational cruxes — falsification breaks regions of the picture
These are the empirical-or-methodological assumptions that, if wrong, force rebuilding large parts of the lit review.
A1 — Twin/adoption method validity. Carries Section 1 of the lit review; heritability-by-domain table; Wilson Effect. Robustness: HIGH (MZ-reared-apart, SNP-h² bypassing EEA, misperceived-zygosity all converge). Would flip if SNP-h² for psychological traits systematically converged on <0.05 — has not occurred.
A2 — GWAS signal is real (not artifact). Carries the PGS enterprise; genetic nurture estimates; cross-disorder pleiotropy; modern psychiatric genetics. Robustness: MODERATE-HIGH (within-family PGS effects are non-zero for EA, BMI, height — direct signal exists; AM/stratification inflation magnitudes still being quantified). Would flip if within-family PGS effects converged on zero across most traits.
A3 — g is a real dimension of cognitive variation. Carries Section 5 of lit review; predictive-validity claims; CHC structure; mortality/income predictions. Robustness: HIGH for g as a statistical regularity; MODERATE for g as unitary biological mechanism. Would flip if a broad cognitive battery had first-PC <15% or if interventions reliably moved one ability while lowering others. 2024 wrinkle: Pietschnig et al. 2024 reported the positive manifold may be weakening across recent cohorts — softly pressures A3 in a new way without refuting it.
3b. Reframer nodes — the answer is open and reshapes interpretation
These don’t break the picture if reversed; they change what the picture means. Their magnitudes are being actively quantified in 2024–2026 work. Conflating reframers with foundational cruxes is the most common conceptual error in pop-science treatments of this field.
G2 / E6 — Passive rGE / genetic nurture. Reframes the meaning of every population-level genetic estimate. Without G2, “genetic transmission” reads as direct biological causation; with G2, ~half is environmentally mediated by genetically-similar parents (Wang 2021 / Isungset 2022). Nivard et al. 2024 (Nat Hum Behav) showed indirect genetic effects extend beyond the nuclear family to dynastic / extended-family processes. The existence is robust; precise magnitude across all traits is still being quantified.
G6 / E7 — Cross-trait assortative mating. Reframes the cross-disorder genetic-correlation matrix and the p-factor’s interpretation. Border 2022 (Science) showed phenotypic cross-mate correlations explain R²=74% of variance in genetic-correlation estimates. Ma, Wang, Border et al. 2024 (LAVA-Knock) is the first method to systematically reduce xAM-induced bias. The share of any specific rg that is artifact vs. genuine pleiotropy is still pending.
3c. Logical guardrails — unfalsifiable but load-bearing for interpretation
These cannot be falsified — they are algebraic / definitional truths. They can be ignored, which is how most public-discourse misuse of the field happens.
L1 — Heritability is a population variance ratio, not an individual partition. Cannot be falsified. Public misreading of “70% heritable IQ” as “70% of any individual’s IQ comes from genes” is a failure to apply L1, not a failure of the science.
L4 — Within-population heritability does not license between-population mean inference (Lewontin firewall). Cannot be falsified — it is a logical/algebraic point. Can only be ignored. The empirical buttress today is E23 (PGS portability collapse along genetic-distance continuum, Ding 2023): even if you wanted to use within-pop methods to speak to between-pop differences, the methods don’t currently work.
3d. Decorative material (safe to compress)
Removable from the topology without changing the qualitative picture:
- E35 (CAH / primate toy preferences) — convergent evidence, not necessary
- E37 (specific GABAergic critical-period mechanisms) — biologically real, not load-bearing for the variation argument
- HEXACO Honesty-Humility specifics — incremental over Big Five
- Specific Dark-Triad subdimensions — D-factor synthesis (Moshagen 2018) carries more weight
- P-FIT brain network specifics — corroborate g but don’t establish it
- Yehuda Holocaust FKBP5 transgenerational findings — refuted/non-replicated; kept only as historical anchor for D1 distortion
- Specific candidate-gene findings (5-HTTLPR depression) — refuted; kept as historical anchor for the field’s methodological turn (M8)
4. Weakest links
These are the load-bearing pieces with the lowest current confidence. Targeted attack on any one would do the most damage to the integrated picture.
W1: Generalization from candidate-GxE failure to “all GxE is small” (E5 → broader claim)
Why fragile: The candidate-gene collapse is definitive. The extrapolation that polygenic-score × environment interactions are also small began as an inductive leap, not a result. As of 2025, the evidence leans toward the null without yet establishing it. A 2025 systematic review of 56 PGS×E studies for depression found mostly null or small effects. A multivariable PGS×E study of educational achievement (Allegrini et al. 2020) found “no evidence that GxE effects significantly contributed to multivariable prediction.” UK Biobank work (2024) on distinct explanations of GxE shows that many apparent GxE signals are confounded by scale, ascertainment, or population structure. The extrapolation is therefore looking less like an inductive leap and more like a substantive empirical pattern, but the literature is still too young for a strong null.
Pressure test: Several large preregistered PGS×E studies finding interactions explaining >5% variance would substantially revise this corner of the picture.
W2: Scarr-Rowe (E25) — has substantially weakened since pass-0
Why fragile: The original meta-analytic picture was “replicates in US, fails in W. Europe / Australia” (Tucker-Drob & Bates 2016). Ghirardi et al. 2024 (Netherlands Twin Register, polygenic-index design across 42 PGI×SES interaction tests for educational outcomes) found 39/42 negative, 0 significant positive, 1 marginally significant positive — i.e., the opposite sign from Scarr-Rowe in most cases. The picture in 2026 is closer to “the compensatory hypothesis (more genetic expression in low-SES because constrained environments suppress non-genetic variance) is the better-supported pattern, at least for educational outcomes.” E25’s weight has been downgraded from 3 → 2 to reflect this. The narrative “deprivation suppresses heritability” — popular in policy discourse — is now evidence-thin.
W3: Plomin-vs-Turkheimer interpretation of PGS (O1)
Why fragile: Both views are compatible with current data. Determines what PGS means — direct biology vs. summary statistic of correlated environments. The field publishes ambiguously across both interpretations. Will likely be settled only by within-family-only PGS that are still well-powered.
W4: Magnitude of AM-correction across psychiatric cross-disorder rg matrix (O7) — methods now emerging
Why still fragile but improving: Border 2022 showed xAM explains R²=74% of variance in genetic-correlation estimates but didn’t prove all rg’s are spurious — some genuine pleiotropy surely exists. As of 2024, the field is moving from flagging the problem to building correction methods. Ma, Wang, Border et al. 2024 (American Journal of Human Genetics) introduced LAVA-Knock, a local-genetic-correlation method using knockoff inference to reduce xAM-induced bias; tested across 630 trait pairs in simulation and real GWAS, it substantially reduces but does not eliminate the bias. A 2024 study found AM genetic signatures across SCZ, BD, MDD, alcohol phenotypes, and Tourette syndrome — confirming xAM is not selective. What’s still pending: how much of the cross-disorder rg matrix and the p-factor genetic signal survives systematic application of AM-correction methods at scale. Likely answer in 2–3 years.
W5: Gender Equality Paradox mechanism (O3, E14) — pattern strengthened, mechanism still contested
Empirical pattern: more robust as of 2025. Herlitz et al. 2025 systematic review (54 articles, 27 meta-analyses, Perspectives on Psychological Science) found the paradox replicates across personality, verbal abilities, episodic memory, and negative emotions. Balducci et al. 2024 extended it to within-individual academic strengths cross-temporally. The “this won’t replicate” objection has weakened.
Mechanism: still contested. Three live candidates: (a) innate-expression release in resource-rich environments, (b) reference-group / self-anchoring artifacts in self-report (people compare to their gender peers, not to humans-in-general), (c) wealth/freedom confounds with gender equality. Behavioral / incentivized-measure replications (Falk & Hermle 2018 for economic preferences) cover only part of the domain. The decisive test — non-self-report behavioral replication across personality and interests — is still incomplete. Each candidate mechanism implies different normative conclusions, which is part of why this remains contested rather than resolved.
W6: Flynn-reversal cause (O2) — and a new wrinkle on the positive manifold
Why fragile: The pattern is environmentally driven (within-sibship, Bratsberg & Rogeberg 2018), so “dysgenic” explanations are out. But no mechanism (screen time, education quality, attention, nutrition, lead, microplastics) has been pinned down with within-cohort empirical work. Pietschnig et al. 2024 (Vienna 2005–2018 cohort) added a structural twist: the positive manifold itself may be weakening across cohorts — meaning the recent rise/fall is not uniformly g-loaded. If confirmed broadly, this softly pressures A3 (g exists as a stable dimension) — not refuting it, but suggesting its strength may be cohort-dependent. Still not load-bearing for the integrated picture, but interacts with A3 in a new way.
W7: A6 (dimensional vs. taxonic) at psychiatric extremes
Why fragile: Most psychopathology is dimensional (taxometric evidence is robust), but for severe early-onset autism with intellectual disability, rare large-effect variants (CHD8, SCN2A, SYNGAP1) drive a partly taxonic picture. The “all dimensional” framing oversells continuity at the severe tail.
5. Variant views
The same graph, read four ways.
Variant A: Vulnerability map — where does this break?
The vulnerability map is the union of the three foundational cruxes (§3a), two reframer nodes (§3b), two logical guardrails (§3c), and seven weakest links (§4). Together they describe the smallest set of pressure points whose movement would force restructuring of the integrated picture:
- Falsify A1: SNP-h² systematically <0.05 → twin-method discredited → Section 1 collapses
- Falsify A2: within-family PGS → 0 → modern psychiatric genetics collapses
- Falsify A3: positive manifold dissolves → Section 5 collapses
- Falsify G2: within-family PGS = population PGS → genetic nurture is null → Plomin direct-causal view wins (O1 resolves)
- Falsify G6 fully: AM correction barely changes rg matrix → cross-disorder pleiotropy is real
- Violate L4: cannot be falsified, only ignored — but its violation in public discourse is the largest single source of public confusion
If exactly one of these were to flip, the rebuild would be: A1 → rebuild Section 1 only; A2 → rebuild Sections 1, 3, 7 (~40% of lit review); A3 → rebuild Section 5 (~25%); G2/G6 → keep numbers, rewrite causal interpretation throughout.
Variant B: Flow map — how does causation propagate?
Causation in this system runs in two directions, both important.
Forward developmental flow (genome → outcomes):
Genome (polygenic + few rare large-effect)
│
├──> Temperament (infant biological reactivity: Surgency / NA / EC)
│ │
│ ├──> Active rGE / niche-picking ─────────┐
│ │ │
│ └──> Evocative rGE (eliciting responses) ┤
│ │
└──> Direct expression in brain development ─────┤
▼
Personality (adult)
│
├──> Attainment
├──> Relationships
├──> Health behaviors
└──> Mortality
Indirect / dynastic flow (parents’ genome → offspring environment → offspring outcome):
Parents' genome
│
├──> Parents' phenotype (income, vocabulary, parenting style, neighborhood choice)
│ │
│ └──> Offspring's rearing environment ────────┐
│ ▼
│ Offspring outcomes
│ ▲
└──> Transmitted alleles ───────────────────────────┘
The genetic-nurture finding (E6) says these two pathways have roughly equal magnitude for educational attainment. They are partially separable only via within-family designs (M4) or non-transmitted-allele PGS.
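A toy simulation shows why a sibling-difference design separates the two pathways. This is a sketch under stated assumptions, not a model of any real dataset: scores are standardized, there is no assortative mating, the indirect path acts through a family environment shared equally by siblings, and `delta` and `eta` are hypothetical effect sizes set equal so the pathways match in magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)
n_families = 100_000
delta = 0.2  # hypothetical direct effect of the child's own score
eta = 0.2    # hypothetical indirect effect per parental score (rearing environment)

# Parental scores: standardized, independent (no assortative mating assumed).
g_mother = rng.standard_normal(n_families)
g_father = rng.standard_normal(n_families)
midparent = (g_mother + g_father) / 2.0

def child_score() -> np.ndarray:
    """Midparent value plus segregation noise (variance 0.5 keeps total variance at 1)."""
    return midparent + rng.normal(0.0, np.sqrt(0.5), n_families)

def phenotype(g_child: np.ndarray) -> np.ndarray:
    """Direct effect + family environment shared by siblings + individual noise."""
    family_env = eta * (g_mother + g_father)
    return delta * g_child + family_env + rng.standard_normal(n_families)

def slope(x: np.ndarray, y: np.ndarray) -> float:
    """OLS slope of y on x."""
    return float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

g1, g2 = child_score(), child_score()
y1, y2 = phenotype(g1), phenotype(g2)

beta_population = slope(g1, y1)        # picks up direct + indirect paths
beta_within = slope(g1 - g2, y1 - y2)  # sibling differences cancel the shared environment

print(f"population slope    = {beta_population:.3f}")
print(f"within-family slope = {beta_within:.3f}")
```

With these settings the population slope estimates delta + eta and the within-family slope estimates delta alone, so their ratio lands near one half. Assortative mating or sibling-specific parental treatment would break the clean cancellation assumed here.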
Cross-generational drift via assortative mating:
Mating choice (correlated on phenotype)
│
└──> LD induction among causal variants (G5)
│
├──> Inflated additive genetic variance
├──> Inflated h²
├──> Inflated cross-trait genetic correlations (G6)
└──> Inflated PGS prediction accuracy
Variant C: Minimal claim set — smallest set supporting the conclusion
The smallest collection of claims that yields the integrated picture (S1) is eight nodes:
- E1 — Mean trait h² ≈ 0.49 (heritability is real and substantial)
- E4 — Polygenic architecture (no master genes)
- E6 — Within-family PGS ≈ ½ population PGS (genetic nurture is real)
- E7 — Cross-trait AM is a major source of inflated genetic correlations
- E8 — A small set of large environmental insults have causal effects (lead, iodine, alcohol, deprivation, schooling)
- L4 — Within-pop ≠ between-pop (Lewontin)
- G7 — Stochastic developmental noise is the dominant source of non-shared environment
- A6 — Most psychological variation is dimensional, not taxonic
These eight together generate the qualitative integrated picture without requiring detailed effect-size tables, cross-cultural caveats, or specific candidate-gene history. The remaining ~50 nodes refine and corroborate but do not change the shape.
Variant D: Politicization map — where does motivated reasoning concentrate?
This is the variant most relevant to the topic framing (“a minefield of motivated reasoning on all sides”).
Distortion-to-target matrix:
| Distortion | Targets | Move | Counter-evidence |
|---|---|---|---|
| D1 Blank-slate | A1, E1, E10–E14 | “Twin studies are flawed; differences are socialization” | SNP-h² (bypasses EEA), MZ-reared-apart, Su 2009 (d=0.93), CAH/primate convergence |
| D2 Hereditarian | L4, E22, E23, O4 | “Group differences are genetic” | PGS portability collapse (E23); Lewontin (L4); cross-ancestry GWAS at scale don’t exist |
| D3 Gender-similarities | E10, E11, E14 | “All differences are tiny (cite math d=0.05)” | Multivariate D=2.71 (E10); people-things d=0.93 (E11); GEP (E14) |
| D4 Pop-evpsych | A6, L7, E10–E14 | “Men are X, women are Y” (categorical from dimensional) | A6 (dimensional); L7 (effect-size context) |
Why all four distortions can target the same evidence base: the evidence base contains large differences (people-things d=0.93), trivial ones (math d=0.05), strong heritability (h²=0.49), large environmental effects (lead, schooling), and logical guardrails against between-group inference (L4). Any single-direction narrative requires selective citation. The integrated picture (S1) requires holding all of it at once.
Operational implication for the formalization stage: any model that only parameterizes the variance components without parameterizing the interpretation of those components will be silently captured by whichever distortion the reader is most prone to. The formal model needs to make L4, G2, G6, and the dimensional/taxonic distinction (A6) structurally visible, not just numerically present.
6. Topology → formalization handoff
What the next stage (model formalization) should pick up.
Ready for equations
- Variance decomposition (fully specifiable now):
  V(P) = V(A_direct) + V(A_indirect) + V(A_AM-LD) + V(C_residual) + V(E_measured) + V(E_stochastic) + 2·Cov(G,E) + V(GxE)
  With each V parameterized by trait, age, population (US vs. Europe for E25), and method (twin / SNP / within-family). Cov(G,E) captures rGE; V(A_AM-LD) captures G5/G6; V(A_indirect) captures G2.
- Method gradient (S2): twin h² ≥ SNP h² ≥ within-family h², with the gaps decomposable into AM, rGE, and rare-variant contributions. Parameterize as a function of estimation method.
- Wilson-effect curve: h²(age) = a + b·log(age) or similar saturation form, with the slope driven by G1 (active rGE). Calibratable from Bouchard 2013 and Briley & Tucker-Drob 2013.
- Multivariate sex-difference algebra: D² = (μ₁ - μ₂)ᵀ Σ⁻¹ (μ₁ - μ₂), with a worked example showing how D = 2.71 follows from moderate univariate ds and a positive-correlation covariance structure.
- PGS-portability decay function: prediction accuracy as a continuous function of genetic distance from training population (Ding et al. 2023, Nature: r = −0.95 between genetic distance and accuracy across 84 traits).
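The Wilson-effect item in the list above proposes h²(age) = a + b·log(age). A minimal sketch of that functional form follows. The coefficients are placeholders of mine, not calibrated to Bouchard 2013 or Briley & Tucker-Drob 2013, and the clipping is an added assumption because a log curve is unbounded while h² must stay in [0, 1].

```python
import math

def wilson_h2(age_years: float, a: float, b: float) -> float:
    """h²(age) = a + b*log(age), clipped to [0, 1] because the log form is unbounded."""
    if age_years <= 0:
        raise ValueError("age must be positive")
    return min(1.0, max(0.0, a + b * math.log(age_years)))

A, B = 0.20, 0.18  # placeholder coefficients, NOT fitted to any dataset
for age in (5, 10, 18, 40):
    print(f"age {age:2d}: h² = {wilson_h2(age, A, B):.2f}")
```

A true saturation form (for example a logistic or exponential approach to an asymptote) would avoid the clipping. Which form fits better is an empirical question for Stage 4.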
Still at observation stage (formalization premature)
- O1 — Plomin/Turkheimer interpretation of PGS: not yet a formal disagreement, just a verbal one
- O3 — GEP mechanism: the algebra of “innate expression release” is not yet specified
- O6 — what non-shared environment is: no candidate decomposition
- O7 — share of cross-disorder rg that survives AM correction: empirical question pending — but methods are now emerging (LAVA-Knock); answer likely in 2–3 years, at which point this moves to “ready for equations”
Connection to adjacent topics in the LLM-iterate pipeline
This topology is the natural input to Parent-to-Child Transmission (planned topic). The genetic-nurture finding (G2/E6) and the dynastic-extension finding (Nivard 2024) are the empirical answers to “how much does parenting matter beyond genes” that the parent-child topic will need to build on. When that topic spins up, the variance decomposition equation here should be its starting point.
Less directly: the Evolution-Modernity Mismatch topic will lean on the GEP (E14/O3) and Flynn-reversal (O2) findings as evidence of environment-driven shifts in expressed psychological variation. Bedrock Generating Functions can read the variance decomposition itself as one such bedrock function.
7. Next moves — three options for Stage 3
The user picks one of these as the primary formalization target. Each leaves the others viable as later modules but shapes Stage 4 (data) differently.
Option A — Variance decomposition + method gradient (most central)
Build the central equation V(P) = V(A_direct) + V(A_indirect) + V(A_AM-LD) + V(C) + V(E_meas) + V(E_stoch) + 2·Cov(G,E) + V(GxE) parameterized by trait, age, population, and estimation method. Build a tool that takes a published h² estimate (twin, SNP, or within-family) and outputs a method-corrected estimate with explicit AM/rGE adjustment.
Pros: most central to the topic; directly answers “what generates psychological variation”; feeds Stage 4 cleanly (every term has published estimates somewhere). Cons: many parameters; risk of producing a calculator nobody uses without strong UI judgment. Stage 4 implication: pull h² estimates from PGC, SSGAC, GIANT consortia; calibrate the method-gradient term per trait class.
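To make the parameter count in Option A tangible, here is a minimal sketch of the central equation as a data structure. The class and field names are hypothetical, and the example values are arbitrary placeholders chosen to sum to 1, not estimates for any trait.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class VarianceComponents:
    a_direct: float      # V(A_direct)
    a_indirect: float    # V(A_indirect), genetic nurture (G2)
    a_am_ld: float       # V(A_AM-LD), assortative-mating LD (G5/G6)
    c_residual: float    # V(C)
    e_measured: float    # V(E_meas)
    e_stochastic: float  # V(E_stoch)
    cov_ge: float        # Cov(G,E); enters the total with a factor of 2
    gxe: float           # V(GxE)

    def total(self) -> float:
        """Phenotypic variance V(P) as the sum of all terms."""
        return (self.a_direct + self.a_indirect + self.a_am_ld + self.c_residual
                + self.e_measured + self.e_stochastic + 2.0 * self.cov_ge + self.gxe)

    def shares(self) -> dict:
        """Each term's share of V(P); the covariance share includes its factor of 2."""
        vp = self.total()
        out = {f.name: getattr(self, f.name) / vp for f in fields(self)}
        out["cov_ge"] = 2.0 * self.cov_ge / vp
        return out

# Arbitrary placeholder values for illustration only.
example = VarianceComponents(0.20, 0.10, 0.05, 0.10, 0.10, 0.35, 0.03, 0.04)
print(f"V(P) = {example.total():.2f}")
for name, share in example.shares().items():
    print(f"{name:13s} {share:.2f}")
```

A real tool would carry one such record per trait, age band, population, and estimation method, which is where the parameter burden noted in the cons comes from.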
Option B — Multivariate sex-difference algebra (most pedagogically clean)
Formalize how moderate univariate Cohen’s ds combine into a large multivariate Mahalanobis D, with a worked Big-Five example showing how D ≈ 2.71 emerges from |d| ≈ 0.4 ds and a positive-correlation Σ. Build a dashboard letting the user dial univariate ds and the correlation matrix to see D move.
Pros: tightly scoped; resolves the single biggest framing trap in the GEP debate (univariate vs. multivariate framings of the same data); high pedagogical leverage. Cons: narrower than A; doesn’t engage the heritability core. Stage 4 implication: pull effect-size matrices from Del Giudice 2012 and Schmitt 2008 cross-cultural data; replicate D under different correlation structures.
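The core of the Option B dashboard is one line of linear algebra. The sketch below uses five hypothetical ds of magnitude 0.4 and a uniform correlation, both invented for illustration. It does not reproduce D ≈ 2.71, which would need the published effect-size and correlation matrices.

```python
import numpy as np

def mahalanobis_d(d, corr) -> float:
    """D = sqrt(d' R^{-1} d) for standardized mean differences d and correlation matrix R."""
    d = np.asarray(d, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(corr, d)))

def equicorrelation(k: int, rho: float) -> np.ndarray:
    """k x k correlation matrix with every off-diagonal entry equal to rho."""
    return np.full((k, k), rho) + (1.0 - rho) * np.eye(k)

k = 5
same_sign = np.full(k, 0.4)                         # hypothetical ds, all in one direction
mixed_sign = np.array([0.4, -0.4, 0.4, -0.4, 0.4])  # hypothetical ds, alternating direction

for rho in (0.0, 0.3):
    R = equicorrelation(k, rho)
    print(f"rho = {rho:.1f}: "
          f"same-sign D = {mahalanobis_d(same_sign, R):.3f}, "
          f"mixed-sign D = {mahalanobis_d(mixed_sign, R):.3f}")
```

In this toy setup D never falls below the largest single |d|, but positive correlation shrinks D when all differences point the same way and enlarges it when they alternate. The worked example will therefore need the actual sign pattern and Σ, not the magnitudes alone.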
Option C — PGS-portability calibration (most practically useful)
Turn Ding et al. 2023’s continuous decay finding into a usable accuracy estimator: enter an individual’s genetic distance from the PGS training population and get an accuracy-decay multiplier. Apply across the major trait PGSs (EA, SCZ, BMI, etc.).
Pros: directly addresses a real-world bias; smallest scope; ships fastest; useful even outside this project’s domain. Cons: less central to the heritability question; might fit better as a tool than a topic-stage. Stage 4 implication: pull cross-ancestry GWAS validation data from the All of Us / GenomeAsia / H3Africa consortia.
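A first cut of the Option C estimator could be as small as a fitted line. The sketch below assumes a linear decay, which is my simplification: the source finding is a strong negative correlation between genetic distance and accuracy, not a specific functional form. The calibration points are synthetic and the function names are hypothetical.

```python
import numpy as np

def fit_linear_decay(distance, accuracy):
    """Least-squares line: accuracy ~ intercept + slope * distance."""
    slope, intercept = np.polyfit(np.asarray(distance, dtype=float),
                                  np.asarray(accuracy, dtype=float), 1)
    return float(slope), float(intercept)

def decay_multiplier(d: float, slope: float, intercept: float) -> float:
    """Predicted accuracy at distance d relative to distance 0, floored at 0."""
    if intercept <= 0:
        raise ValueError("fitted accuracy at distance 0 must be positive")
    return max(0.0, (intercept + slope * d) / intercept)

# Synthetic calibration points, made up for illustration (not from Ding et al. 2023).
dist = [0.0, 1.0, 2.0, 3.0, 4.0]
acc = [0.10, 0.08, 0.06, 0.04, 0.02]

slope, intercept = fit_linear_decay(dist, acc)
print(f"multiplier at distance 2.5 = {decay_multiplier(2.5, slope, intercept):.2f}")
```

The units of genetic distance and the per-trait slopes would have to come from the validation data named in the Stage 4 implication. Nothing here is calibrated.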
My recommendation: A as primary (most central to the topic’s stated purpose), with B as a stretch module if scope allows. C is high-value but might better live as a standalone tool promoted to /models later.
8. Objections to this topology (adversarial + steelman)
Four ways a careful reader could push back. The strongest version of each, then my response.
Objection 1 — Discrete typed edges falsify a continuous, magnitude-weighted, context-conditional system
Heritability is not “supported by” a twin study in the same binary way that a logical implication holds. The system is a tightly coupled developmental process; flattening it into nodes-and-arrows with discrete edge types loses information about magnitude, conditional dependence, and gradient relationships.
Response: Acknowledged, and intentional. The topology is the qualitative skeleton; edge weights and conditional dependencies are the job of Stage 3 (formalization), where each edge will be turned into a parameterized function. The graph’s value is not that it stands in for the full system but that it makes the structure visible cheaply enough that the formalization knows where to put the parameters.
Objection 2 — The crux/decorative split is editorial, not empirical
There is no algorithm that picks crux nodes; the choice depends on which failure modes you are worried about. A 1990s topology of this field would have crowned candidate-gene findings as cruxes. Naming A2 (GWAS signal real) a crux today is a judgment call about the field’s current methodological commitments — not an objective feature of the science.
Response: Correct. Cruxes are time-stamped. This topology is a 2026 snapshot. If the field shifts (post-AM-correction era, post-within-family-PGS-at-scale era) the crux set will shift — that is what the refinement passes are for. Use this as a current map, not an immutable structural claim.
Objection 3 — Calling L4 a “logical firewall” overstates the case
Lewontin’s 1970 argument has been challenged. Edwards (2003) “Lewontin’s Fallacy” showed that Lewontin’s specific quantitative point — that ~85% of human genetic variance is within rather than between populations — does not preclude reliable population-classification from genetic markers. Modern population genetics treats between-population genetic inference as more nuanced than the firewall framing suggests.
Response: The Edwards critique is real, but it addresses a different claim. Edwards refuted “you cannot reliably classify individuals into populations from genetic data.” The L4 firewall as I formulate it says “within-population heritability provides no information about between-population mean differences without strong auxiliary assumptions about shared causal architecture and equal environments.” Those are different propositions. PGS portability collapse (E23) is the contemporary empirical evidence that the auxiliary assumptions are not currently being met for psychological traits. The firewall framing survives the Edwards critique; its strength rests on the empirical PGS-portability finding, not on the original Lewontin variance argument alone.
Objection 4 — The Politicization variant is meta-commentary, not topology
The D nodes and attacks edges describe how people misuse the evidence base. That is epistemics or sociology of science, not structural topology of the field. A pure topology should omit them.
Response: Fair, and the inclusion is unorthodox. It is justified here only by the topic framing: the user’s prompt explicitly described the field as “a minefield of motivated reasoning … where the actual generating functions are obscured by politics.” A topology of just the science would omit the D nodes; a topology that helps a reader navigate the field as it is actually encountered should include them. The D nodes will not be carried into Stage 3 formalization; they exist for navigation, not for downstream computation.
9. Glossary
For readers approaching this from outside the field. Terms appear throughout the lit review and topology; this is the lookup table.
| Term | Meaning |
|---|---|
| h² | Heritability — fraction of trait variance in a population attributable to genetic variation. A population statistic, not an individual one. |
| SNP | Single-nucleotide polymorphism — a single-base difference at a position in the genome where multiple variants exist in the population. |
| GWAS | Genome-wide association study — scans hundreds of thousands of SNPs against a measured trait, looking for statistical association. |
| PGS | Polygenic score — a per-individual sum of trait-associated SNPs weighted by their GWAS effect sizes. Used as a predictor. |
| LD | Linkage disequilibrium — non-random association between alleles at nearby loci, typically because they are inherited together. |
| AM | Assortative mating — partners resemble each other on a trait above chance. xAM = cross-trait AM (e.g., taller-than-average people pairing with more-educated-than-average partners). |
| rGE | Gene-environment correlation. Passive (parents transmit genes + correlated environment), evocative (heritable traits elicit responses), active (people select environments matching propensities). |
| GxE | Gene-environment interaction — the same genotype produces different phenotypes in different environments. |
| EEA | Equal environments assumption — the twin-method assumption that MZ and DZ twins are treated similarly enough that any extra MZ phenotypic resemblance reflects genetics, not differential treatment. |
| MZ / DZ | Monozygotic (identical, ~100% shared DNA) / dizygotic (fraternal, ~50% of segregating DNA shared on average) twins. |
| rg | Genetic correlation between two traits — how much the same genetic variants influence both. |
| WGS | Whole-genome sequencing — capturing every base in the genome, including rare variants GWAS misses. |
| g-factor | General factor of cognitive ability — the latent dimension behind the positive manifold (every cognitive test correlates positively with every other). |
| p-factor | Proposed general factor of psychopathology — analogous to g, derived from cross-syndrome correlations. |
| CHC / HiTOP | Cattell-Horn-Carroll cognitive-ability hierarchy / Hierarchical Taxonomy of Psychopathology (a dimensional alternative to DSM). |
| d (Cohen’s d) | Standardized mean difference between two groups, in standard-deviation units. Effect-size labels (small / medium / large) are scale-dependent — see L7 in the node catalog. |
| Mahalanobis D | Multivariate generalization of Cohen’s d — distance between two group means in the geometry of the trait space, accounting for correlation between traits. |
| Within-family design | Comparing siblings or MZ-discordant twins or parent-offspring trios within the same family — controls for between-family confounds (population stratification, AM, passive rGE). |
| Genetic nurture | Effect of parents’ genotype on offspring outcomes via the environment the parents create — including alleles the parent did not transmit. |
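Two of the glossary entries above, PGS and Cohen’s d, are simple enough to state as arithmetic. What follows is a minimal Python sketch rather than code from the pipeline: the function names, the SNP identifiers, and the choice to skip SNPs missing from a genotype are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def polygenic_score(weights, dosages):
    """PGS as a weighted sum: GWAS effect size times effect-allele dosage (0, 1, or 2).

    `weights` maps SNP id -> effect size; `dosages` maps SNP id -> allele count.
    SNPs absent from `dosages` are skipped (an illustrative simplification).
    """
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical inputs, chosen only to exercise the functions.
example_weights = {"rs_a": 0.10, "rs_b": -0.05}
example_dosages = {"rs_a": 2, "rs_b": 1}
example_score = polygenic_score(example_weights, example_dosages)
example_d = cohens_d([2, 4, 6], [1, 3, 5])
```

Both quantities depend on the reference sample: the PGS inherits its weights from whichever GWAS produced them, and d depends on the standard deviation of the groups being compared.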
10. Stage_outputs convention reference
Raw working drafts from each LLM-iterate stage live at:
stage_outputs/<topic>/<stage>.md
Where <topic> is kebab-case (e.g., human-psych-variation) and <stage> is one of: lit-review, topology, model, data, build. Polished versions move into src/content/ai_research/<topic>/<stage>.mdx with proper frontmatter (title, description, date, status, refinementPass, refinementLog) once ready to publish on the site.
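The path convention above can be written down as a small helper. This is a sketch under assumptions: the function names are hypothetical, and "kebab-case" is taken to mean lowercase letters and digits joined by single hyphens.

```python
import re

# Stage names as listed in the convention above.
STAGES = ("lit-review", "topology", "model", "data", "build")

# Assumed definition of kebab-case: lowercase alphanumerics joined by single hyphens.
_KEBAB = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def _validate(topic, stage):
    if not _KEBAB.match(topic):
        raise ValueError(f"topic must be kebab-case, got {topic!r}")
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}, got {stage!r}")


def draft_path(topic, stage):
    """Location of the raw working draft for a stage."""
    _validate(topic, stage)
    return f"stage_outputs/{topic}/{stage}.md"


def published_path(topic, stage):
    """Location of the polished MDX version once it is ready to publish."""
    _validate(topic, stage)
    return f"src/content/ai_research/{topic}/{stage}.mdx"
```

For example, `draft_path("human-psych-variation", "topology")` yields `stage_outputs/human-psych-variation/topology.md`, and an unknown stage name raises `ValueError` instead of silently producing a stray path.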
The interactive D3 graph for this topology lives at src/components/research/PsychVariationGraph.tsx and is mounted in src/content/ai_research/human-psych-variation/topology.mdx via client:load.
Formalization of the variance decomposition for psychological traits — direct genetic, indirect (genetic nurture), AM-induced, shared and non-shared environment, stochastic, gene-environment covariance, interactions — parameterized by trait, age, and method.
Pipeline status: not yet generated. Run Step 3 of the LLM Iterate prompt with the topology as input.
What this stage will produce
Primary target (from the topology handoff):
- Variance decomposition equation:
  V(P) = V(A_direct) + V(A_indirect) + V(A_AM-LD) + V(C_residual) + V(E_measured) + V(E_stochastic) + 2·Cov(G,E) + V(GxE)
- Method gradient: twin h² ≥ SNP h² ≥ within-family h², with the gaps decomposable into AM, rGE, and rare-variant components
- Wilson-effect curve: h²(age) saturation form driven by active rGE (G1)
- Multivariate sex-difference algebra:
  D² = (μ₁ - μ₂)ᵀ Σ⁻¹ (μ₁ - μ₂)
  with worked example showing how D = 2.71 emerges from moderate univariate ds + positive-correlation Σ
- PGS-portability decay function: prediction accuracy as a function of genetic distance from training population
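The variance decomposition target listed above can be sketched as plain bookkeeping before any parameterization exists. The following Python sketch is an illustration, not the Stage 3 model: the class and field names are hypothetical, and the numeric values are arbitrary placeholders chosen to sum to 1, not estimates for any trait.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VarianceComponents:
    """One field per term of V(P); names mirror the equation's subscripts."""
    a_direct: float
    a_indirect: float
    a_am_ld: float
    c_residual: float
    e_measured: float
    e_stochastic: float
    cov_ge: float  # Cov(G,E); enters V(P) with a factor of 2 and may be negative
    gxe: float

    def contributions(self):
        """Each term's contribution to V(P), with the covariance doubled."""
        out = {f.name: getattr(self, f.name) for f in fields(self)}
        out["cov_ge"] *= 2
        return out

    def total(self):
        """V(P) as the sum of all contributions."""
        return sum(self.contributions().values())

    def shares(self):
        """Each contribution as a fraction of V(P)."""
        total = self.total()
        if total <= 0:
            raise ValueError("total phenotypic variance must be positive")
        return {name: value / total for name, value in self.contributions().items()}


# Arbitrary placeholder values for illustration only; NOT empirical estimates.
PLACEHOLDER = VarianceComponents(
    a_direct=0.25, a_indirect=0.10, a_am_ld=0.05, c_residual=0.10,
    e_measured=0.10, e_stochastic=0.25, cov_ge=0.05, gxe=0.05,
)
```

Keeping the covariance as its own signed field, rather than folding it into a genetic or environmental term, is the design choice that lets a later stage ask which method absorbs it.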
Plus an interactive dashboard (slider/number/visual) letting the user dial trait, age, population, and method to see the decomposition shift.
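For the multivariate sex-difference algebra, a minimal sketch of D² = (μ₁ - μ₂)ᵀ Σ⁻¹ (μ₁ - μ₂) in pure Python follows. It assumes the inputs are already standardized, so the mean-difference vector is a list of univariate ds and Σ is the pooled within-group correlation matrix. The function names and toy inputs are hypothetical, and the example does not attempt to reproduce the D = 2.71 result.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis_d(ds, corr):
    """D = sqrt(d^T R^-1 d) for standardized differences ds and correlation matrix corr."""
    weighted = solve_linear(corr, ds)  # R^-1 d without forming the inverse
    return math.sqrt(sum(d * w for d, w in zip(ds, weighted)))


# Toy inputs: two traits, correlation 0.5, ds of equal size.
identity = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]
d_uncorrelated = mahalanobis_d([0.3, 0.4], identity)
d_opposite_sign = mahalanobis_d([0.5, -0.5], correlated)
d_same_sign = mahalanobis_d([0.5, 0.5], correlated)
```

With uncorrelated traits, D reduces to the Euclidean length of the d vector. In the correlated toy case, the same two ds of magnitude 0.5 give a larger D when their signs oppose the correlation than when they align with it, so D depends on how the pattern of differences sits relative to Σ and not only on the size of the individual ds.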
What is not ready for formalization yet
- O1 (PGS interpretation: Plomin vs Turkheimer) — verbal disagreement, not yet a formal one
- O3 (GEP mechanism) — algebra of “innate-expression release” not yet specified
- O6 (what non-shared environment is) — no candidate decomposition
- O7 (AM-correction magnitude across cross-disorder rg matrix) — empirical question pending
Premature math here would mask uncertainty. Stay observational.
Empirical pipeline — pulling published effect sizes, h² estimates, and PGS R² from the consortia (PGC, SSGAC, GIANT) into a single comparable dataset, with code, cleaning notes, and analytical-choice flags.
Pipeline status: not yet generated. Run Step 4 of the LLM Iterate prompt with the model formalization as input.
Useful artifact built from the formalization and data — a tool for someone who wants to understand how and why people differ without being captured by motivated reasoning from any direction.
Pipeline status: not yet generated. Run Step 5 of the LLM Iterate prompt once the data stage is complete.