Lit Review
What the science actually shows about psychological variation — heritability, environmental shaping, gene-environment interaction, sex differences, cognitive capacity. The post-2010 genomic era confirmed mid-20th-century behavior genetics while demolishing the candidate-gene paradigm; assortative mating and genetic nurture are actively rewriting older interpretations.
TLDR
Virtually every measured psychological trait is moderately to substantially heritable, hyper-polygenic, and shaped by environments that are themselves partly genetic in origin. The post-2010 genomic era confirmed the core findings of mid-20th-century behavior genetics while simultaneously demolishing the candidate-gene paradigm that dominated psychiatry from 1996–2010. Twin heritability for psychological traits averages ~49% (Polderman et al. 2015); molecular GWAS increasingly accounts for this through thousands of tiny-effect common variants plus rarer large-effect variants in neurodevelopmental conditions.
A crucial methodological development since 2018 is the recognition that assortative mating and gene-environment correlation systematically inflate GWAS-derived estimates. Border et al. (2022, Science) showed that cross-trait assortative mating alone can account for substantial fractions of reported genetic correlations — including some psychiatric cross-disorder correlations previously attributed to shared biology. Kong et al.’s (2018) “genetic nurture” finding demonstrated that roughly half of population-level polygenic score prediction for educational attainment reflects environmentally-mediated parental effects, not direct genetic causation. These corrections don’t eliminate genetic influence — they reframe it.
The field’s most contested findings are not the ones most disputed in public discourse: heritability is settled science, the “parenting wars” are largely resolved, and the candidate-gene-by-environment literature has collapsed. What remains genuinely open is mechanistic — how genes build minds, why the gender equality paradox exists, what drives the Flynn Effect’s reversal, and whether between-population mean differences have any genetic component (a question currently unanswerable with available methods, not “settled” in either direction). The generating function for psychological variation is not “genes vs. environment” but a tightly coupled developmental system in which genetic predispositions, environments created by genetically-similar parents, assortative mating patterns, stochastic noise, and cultural context are deeply entangled.
This document is structured for someone building a formal model of psychological variation. Each section flags effect sizes, replication status, consensus, live debate, and ideological distortion from any direction.
1. Heritability: The Foundation Finding
The Polderman meta-analysis
Polderman et al. (2015, Nature Genetics) meta-analyzed 50 years of twin research — 17,804 traits, 14.5 million twin pairs, 2,748 publications — and reported a mean heritability across all human traits of 49%. Polderman et al., 2015. For ~69% of traits, simple additive ACE models fit cleanly.
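For readers building a formal model, the ACE decomposition mentioned above can be approximated from twin correlations using Falconer's formulas. The sketch below is illustrative only: the function name and the example correlations are assumptions, not values from Polderman et al., and published analyses fit structural equation models rather than these point estimates.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Rough ACE estimates from MZ and DZ twin correlations (Falconer's formulas)."""
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (narrow-sense h^2)
    c2 = 2 * r_dz - r_mz    # shared-environment share
    e2 = 1 - r_mz           # non-shared environment plus measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical twin correlations, chosen for illustration only.
estimates = falconer_ace(r_mz=0.60, r_dz=0.35)
print(estimates)
```

The three shares sum to 1 by construction. A negative C estimate (when r_MZ exceeds twice r_DZ) is usually read as a sign of non-additive genetic effects, one reason the simple additive model does not fit every trait.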
Turkheimer’s Laws and the Fourth Law
Turkheimer’s Three Laws (2000) state that all human behavioral traits are heritable, that the effect of being raised in the same family is smaller than the effect of genes, and that a substantial share of variance is explained by neither. Chabris et al. (2015) extended them with a Fourth Law: a typical behavioral trait is associated with very many genetic variants of tiny effect. This emerged from the failure of candidate-gene studies and the polygenic architecture revealed by GWAS.
What heritability actually means (and doesn’t)
Heritability is a population statistic, not an individual one. Saying IQ is 70% heritable does not mean 70% of any person’s IQ comes from genes. It is not deterministic (height is ~80% heritable yet rose ~10cm in 20th-century Europe through nutrition) and not immutable (h² changes with environment — if all environments became identical, h² would approach 1.0). The most common misinterpretation collapses statistical variance partitioning into causal mechanism.
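A small numerical sketch may make the variance-ratio point concrete. It assumes a toy population with uncorrelated genetic and environmental deviations; all numbers and names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import mean, pvariance


def heritability(var_genetic: float, var_environment: float) -> float:
    """Share of phenotypic variance due to genetic variance (assumes no G-E covariance)."""
    return var_genetic / (var_genetic + var_environment)


# Hypothetical genetic and environmental deviations, uncorrelated by construction.
genetic = [-2.0, -1.0, 0.0, 1.0, 2.0]
environment = [1.0, -1.0, 0.0, -1.0, 1.0]

h2_before = heritability(pvariance(genetic), pvariance(environment))

# Give everyone the same environmental boost: the mean rises, the variances do not.
boosted = [e + 10.0 for e in environment]
h2_after = heritability(pvariance(genetic), pvariance(boosted))
mean_shift = mean(g + e for g, e in zip(genetic, boosted)) - mean(
    g + e for g, e in zip(genetic, environment)
)

# Remove all environmental differences: h2 reaches 1.0.
h2_equalized = heritability(pvariance(genetic), 0.0)

print(h2_before, h2_after, mean_shift, h2_equalized)
```

In this toy case a uniform environmental improvement shifts the population mean while leaving h² unchanged, and equalizing environments pushes h² to 1.0. Neither result says anything about how much of one person's score "comes from" genes.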
Twin and molecular estimates by domain
| Domain | Twin h² | SNP-h² | Largest GWAS N (study) | Genome-wide significant loci | Best PGS R² |
|---|---|---|---|---|---|
| Adult IQ / g | 0.70–0.80 | ~0.20 | 269,867 (Savage 2018) | 205 | ~0.05 |
| Educational attainment | ~0.40 | ~0.13 | 3M (Okbay 2022) | 3,952 | 0.12–0.16 |
| Big Five (avg) | 0.40–0.60 | 0.05–0.18 | 449k (Nagel 2018) | 136 (neuroticism) | <0.05 |
| Political orientation | ~0.40 | — | — | — | — |
| Religiosity | 0.30–0.45 | — | — | — | — |
| Risk tolerance | ~0.30 | 0.05 | 1M (Karlsson Linnér 2019) | 99 | <0.02 |
| Schizophrenia | 0.60–0.80 | 0.24 | 320k (Trubetskoy 2022) | 287 | 0.07–0.10 |
| Bipolar | 0.70–0.85 | 0.18–0.20 | 414k (Mullins 2021) | 64 | 0.04 |
| MDD | 0.35–0.40 | 0.09 | 807k (Howard 2019) | 102 | 0.02–0.03 |
| ADHD | 0.74 | 0.14 | 225k (Demontis 2023) | 27 | 0.04–0.06 |
| Autism | 0.80 | 0.12 | 46k (Grove 2019) | 5 | <0.03 |
Note: Political orientation and religiosity are included because they are among the few adult traits where shared family environment (C) remains substantial (~20–30%), unlike personality and cognition where C ≈ 0 by adulthood. See Alford, Funk & Hibbing (2005); Hatemi et al. (2014).
The Wilson Effect
Bouchard (2013) documented that IQ heritability rises with age — from ~20% at age 5 to ~80% by adulthood, with shared-environment effects dropping from ~55% in early childhood to roughly zero by adolescence. Bouchard 2013. Briley & Tucker-Drob (2013) explained the mechanism: early genetic effects are amplified across development through gene-environment correlation (niche-picking). Briley & Tucker-Drob 2013. This finding is robust, replicated, and counterintuitive — genetic differences become more expressed as people age into self-selected environments.
The “missing heritability” problem
The gap between twin h² (~0.70 for IQ) and SNP-h² (~0.20) launched a decade of debate. Wainschtein et al. (2022, Nature Genetics) essentially closed it for height: using whole-genome sequencing in 25,465 unrelated individuals, h² recovered to 0.68 when rare and low-LD variants were included. Wainschtein et al. 2022. For psychological traits the same pattern is emerging. The current synthesis: missing heritability is partly real (rare variants, dominance, GxE) and partly artifactual (bias in twin estimates from rGE and unmodeled assortative mating, plus measurement noise).
Assortative mating: a pervasive inflation source
Assortative mating (AM) — the tendency for partners to resemble each other on traits — has emerged as a major methodological concern. People mate assortatively on education (spousal r ≈ 0.40–0.60), IQ (~0.40), personality (~0.10–0.20), height (~0.20), and psychiatric conditions. AM has three consequences for genetic estimates:
- Biased heritability estimates: AM increases additive genetic variance across generations by creating linkage disequilibrium among causal variants. Counterintuitively, most twin studies underestimate heritability when they ignore AM, whereas GWAS-based SNP-h² may be inflated by AM-induced LD. Border et al. 2022, Nat Commun.
- Inflated genetic correlations: Border et al. (2022, Science) introduced cross-trait assortative mating (xAM) and showed that phenotypic cross-mate correlations explain about 74% of the variance (R² = 0.74) in reported genetic correlation estimates. Some psychiatric cross-disorder genetic correlations, previously interpreted as evidence of shared biology, may be largely or entirely attributable to xAM. Border et al. 2022.
- Inflated PGS prediction: Within-family PGS effects are roughly half of population-level effects for educational attainment (Okbay 2022), partly because AM and population stratification inflate between-family comparisons.
Plomin (2022, Behav Genet) argues this is a prediction-vs-explanation distinction: AM inflates causal genetic estimates but doesn’t invalidate PGS as predictors, since AM-induced variance is real population variance. Plomin 2022. This is technically correct but sidesteps the question of why PGS predict — whether through direct genetic causation or through correlated environments created by assortatively-mating parents.
The genetic nurture revolution
Kong et al. (2018, Science) used 21,637 Icelandic probands with parental genotypes to compute polygenic scores from non-transmitted parental alleles. The non-transmitted PGS predicted offspring educational attainment at ~30% the magnitude of transmitted PGS — meaning parental genotypes shape children via environments they create, even for alleles never inherited. Kong et al. 2018. Okbay et al. (2022, EA4, Nature Genetics) confirmed this in 3 million people: within-family direct effects are roughly half the population-level PGS magnitude. Okbay et al. 2022. The implication: GWAS effect sizes for socially-valued traits are inflated by indirect/dynastic effects, and roughly half of what we used to call “genetic transmission” is actually environmentally mediated by genetically-similar parents.
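The link between the "~30%" non-transmitted effect and the "roughly half" figure can be sketched with simple arithmetic. The decomposition below is a simplification made for illustration: it assumes the transmitted-PGS effect equals direct plus indirect effects and the non-transmitted-PGS effect equals the indirect effect alone, and it ignores assortative mating, population stratification, and sibling effects. The function and key names are hypothetical.

```python
def decompose_pgs_effect(theta_transmitted: float, theta_nontransmitted: float) -> dict:
    """Split a population-level PGS effect into direct and indirect (nurture) parts.

    Simplifying assumptions (not the full published model):
      theta_transmitted    = direct + indirect
      theta_nontransmitted = indirect
    """
    indirect = theta_nontransmitted
    direct = theta_transmitted - theta_nontransmitted
    direct_share_of_effect = direct / theta_transmitted
    # For a standardized predictor, variance explained scales with the squared effect.
    direct_share_of_r2 = direct_share_of_effect ** 2
    return {
        "direct": direct,
        "indirect": indirect,
        "direct_share_of_effect": direct_share_of_effect,
        "direct_share_of_r2": direct_share_of_r2,
    }


# Effects scaled so the transmitted-PGS effect is 1.0; 0.30 is the approximate ratio cited above.
result = decompose_pgs_effect(theta_transmitted=1.0, theta_nontransmitted=0.30)
print(result)
```

Under these assumptions the direct effect is about 70% of the population-level effect, which corresponds to roughly half of the variance the PGS explains, in line with the within-family attenuation described above.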
2. Environmental Shaping: Real, But Smaller and Weirder Than Common Sense Suggests
The shared-vs-non-shared distinction is the single most disorienting finding for laypeople. Across hundreds of twin and adoption studies, the shared family environment (C) accounts for ~0% of variance in adult personality and most adult cognition (Plomin & Daniels 1987; Bouchard & McGue 2003). Parental warmth, parenting style, dinner conversations, books in the home — once genetic transmission is controlled, almost none of this leaves a measurable trace on adult personality. Important exceptions where C remains substantial: educational attainment (~20%), antisocial behavior, religiosity (~25%), political orientation (~20–30%), and childhood (but not adult) externalizing.
What “non-shared environment” actually is
Turkheimer & Waldron’s (2000) meta-analysis of measured non-shared environmental predictors found these accounted for only ~2% of variance in outcomes. Plomin’s recent verdict: non-shared environment is “real but largely random,” more akin to stochastic developmental noise — differential peer experiences, illness, idiosyncratic events, measurement error — than systematic experience.
The Equal Environments Assumption
Critics (Joseph, Charney; Fosse et al. 2015) note that MZ twins are treated more similarly than DZ twins. The central empirical defense: Kendler et al. (1993) showed misperceived-zygosity twins had phenotypic similarity tracking true zygosity. Kendler et al. 1993. MZ-reared-apart correlations (Bouchard’s Minnesota study) closely match MZ-reared-together correlations. Felson (2014) reanalysis: EEA is “not strictly valid, but bias is modest.” Modern SNP-based heritability estimates entirely bypass EEA and give somewhat lower but still substantial h². Verdict: EEA is approximately valid; bias is modest (~10–20% inflation) for most traits.
Environmental factors with robust causal effects on cognition
A small number of environmental insults have large, replicated, causal effects — typically asymmetric (removing severe deficits matters more than enrichment above normal):
- Lead exposure: Lanphear et al. (2005) pooled 7 prospective cohorts and found that an increase in blood lead from 1 to 10 µg/dL was associated with a loss of about 6.2 IQ points, with a steeper slope at low concentrations. Causal status: strong.
- Severe iodine deficiency: 8–12 IQ point cost; supplementation recovers ~8.7 points (Bougma 2013; Qian 2005). RCT-supported, globally replicated.
- Heavy prenatal alcohol: FAS produces mean IQ ~70; Mendelian-randomization confirms causation.
- Schooling: Ritchie & Tucker-Drob (2018) meta-analyzed 142 effect sizes / 600,000 participants across three quasi-experimental designs. Each year of education raises IQ by 1–5 points (mean ~3.4), persisting into old age. The most consistent durable IQ-raising intervention identified. Ritchie & Tucker-Drob 2018.
- Air pollution (PM2.5): ~−0.27 IQ points per 1 µg/m³ (Aghaei 2024); smaller per-unit than lead but exposure is widespread.
The Scarr-Rowe interaction (SES × heritability)
Turkheimer et al. (2003) reported that in impoverished families, IQ heritability was ~10% with shared environment ~60%; in affluent families this reversed. Turkheimer et al. 2003. Tucker-Drob & Bates (2016) meta-analysis: replicated in U.S. samples but absent in Western European/Australian samples — likely because more universal healthcare/education reduces environmental variance at the bottom. Status: real but context-dependent and U.S.-specific.
Parenting effects: the Harris correction, partially reversed
Judith Rich Harris (1995; The Nurture Assumption 1998) argued that within-normal-range parenting has minimal long-term effects on adult personality. The empirical core was correct: C ≈ 0 for adult personality. But Harris overstated her case. Korean-American adoption studies (Sacerdote 2007; Beauchamp et al. 2023) show real but modest causal effects of family environment on educational attainment, BMI, drinking, smoking — transmission coefficients ~25% of biological-family magnitude. Severe deprivation/abuse causes clear damage. The accurate position: within the normal Western range, parenting style has small effects on adult personality; family environment has measurable but modest effects on attainment outcomes; severe parenting variation matters substantially.
Neighborhood and peer effects
Chetty & Hendren (2018, QJE) used 5+ million U.S. cross-county movers with sibling fixed effects: each year of childhood exposure to a 1-SD better county raises adult income by ~0.5–0.7%. Chetty & Hendren 2018. Moving to Opportunity reanalysis (Chetty, Hendren & Katz 2016): children moving before age 13 had adult earnings 31% higher than controls. Place matters, but accumulates slowly across many years of exposure.
The Flynn Effect and its reversal
Flynn (1984, 1987) documented ~3 IQ points/decade gains across the 20th century. Causes contested: nutrition, schooling, infectious-disease reduction, test sophistication, smaller families — no single mechanism established. Bratsberg & Rogeberg (2018, PNAS) used Norwegian within-family conscript data to demonstrate that both the Flynn Effect and its post-1990s reversal are environmentally driven (visible within sibships, ruling out dysgenic/compositional explanations). Bratsberg & Rogeberg 2018. Similar declines now reported in Denmark, Finland, the Netherlands, France, the UK, and Germany. The reversal’s cause is unknown — this is one of the field’s most important open questions.
3. Gene-Environment Interplay: rGE Wins, Candidate-GxE Collapsed
Three types of gene-environment correlation
The Plomin/DeFries/Loehlin (1977) framework distinguishes passive rGE (parents transmit both genes and correlated rearing environment), evocative rGE (heritable child traits elicit specific responses), and active rGE / niche-picking (individuals select environments matching genetic propensities). Kendler & Baker’s (2007) systematic review shows essentially every measured environment is itself heritable (15–35%) — meaning observational claims like “parental warmth causes child outcomes” are confounded by passive rGE. The genetic nurture and within-family PGS results (Section 1) quantify this: population-level “genetic” prediction is roughly half indirect environmental effects of genetically-similar parents.
The candidate gene × environment collapse
Caspi et al. (2003, Science) reported that 5-HTTLPR short-allele carriers showed elevated depression risk under stress. The paper became one of the most cited in psychiatry (>9000 citations). It collapsed:
- Risch et al. (2009, JAMA): meta-analysis of 14 studies, N=14,250 — no evidence. Risch et al. 2009.
- Culverhouse et al. (2018): pre-registered collaborative meta-analysis, 31 datasets, N=38,802 — definitively no evidence.
- Border et al. (2019, Am J Psychiatry): examined 18 most-studied depression candidate genes in N up to 443,264. No clear evidence for any candidate gene polymorphism on depression. As a set, candidate genes were no more associated with depression than non-candidate genes. Border et al. 2019.
- Duncan & Keller (2011): 96% of novel candidate-GxE studies were significant; only 27% of replication attempts were. Duncan & Keller 2011.
MAOA × maltreatment (Caspi et al. 2002) is the partial exception that survived meta-analysis (Byrd & Manuck 2014) — modest male-specific interaction, but smaller than originally reported.
Differential susceptibility / orchid-dandelion
Belsky & Pluess (2009) reframed “risk alleles” as “plasticity alleles” — some individuals are more reactive to environments “for better and for worse.” Belsky & Pluess 2009. The theory is generative; the empirical record is mixed. Recent systematic reviews find that interactions between child characteristics and parenting rarely replicate across cohorts and developmental domains. Distinguishing differential susceptibility from diathesis-stress requires very large, preregistered samples. de Villiers et al. 2018.
Epigenetics: real biology, oversold psychology
DNA methylation, histone modifications, and non-coding RNA regulation are real, well-characterized mechanisms important in development. The controversy concerns whether environmentally-induced epigenetic marks are faithfully transmitted across generations in humans. They generally are not.
- Heard & Martienssen (2014, Cell): in mammals, two waves of near-complete epigenetic reprogramming erase most acquired methylation marks. Robust transgenerational epigenetic inheritance occurs in plants and C. elegans; in humans it remains largely speculative. Heard & Martienssen 2014.
- Dutch Hunger Winter (Heijmans et al. 2008): real within-individual epigenetic effect persisting decades, not evidence of transmission to grandchildren.
- Yehuda’s Holocaust FKBP5 study (2016): tiny sample (n=8 control parents), opposite-direction effects in parents vs. offspring, no germline measurement. Yehuda’s own group failed to replicate. The “trauma is inherited epigenetically” narrative is not supported by current evidence.
Critical periods: solid developmental neuroscience
Hensch (2005, Nat Rev Neurosci) provides a mechanistically rigorous account of cortical critical-period plasticity. Hensch 2005. GABAergic maturation (parvalbumin-positive interneurons) gates onset; perineuronal nets and myelin-associated inhibitors close periods. This represents the high end of how environmental experience shapes brain structure — genuine, replicated, and mechanistically understood.
4. Sex and Gender Differences: Large Where You’re Not Told They Are
Sex differences are one of psychology’s most ideologically distorted areas — distorted by both minimization and overstatement. The actual picture: small differences in average cognitive ability, large differences in interests and physical aggression, moderate-to-large multivariate personality differences, and a robust but mechanistically contested gender equality paradox.
Cognitive abilities
Mental rotation shows d ≈ 0.56–0.73 male advantage (Voyer et al. 1995), among the largest cognitive sex differences documented. Mean math performance: d ≈ 0.05–0.10 (Lindberg et al. 2010) — essentially no average difference. Writing: substantial female advantage. School grades favor girls overall (Voyer & Voyer 2014). At extreme tails (95th–99th percentile) males outnumber females ~2:1 in many countries — driven by slightly greater male variance (~3–15% higher) compounding at extremes.
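How a small variance difference compounds in the tails can be checked numerically. The sketch below assumes normally distributed scores, with the female distribution as the reference; real score distributions need not be normal, and the function name, cutoffs, and parameter values are illustrative assumptions.

```python
from statistics import NormalDist


def tail_ratio(cutoff_z: float, mean_diff_d: float = 0.0, variance_ratio: float = 1.0) -> float:
    """Male:female ratio above a cutoff, assuming normal score distributions.

    Female scores ~ N(0, 1); male scores have mean `mean_diff_d` and variance
    `variance_ratio`. `cutoff_z` is expressed in female standard deviations.
    """
    females = NormalDist(mu=0.0, sigma=1.0)
    males = NormalDist(mu=mean_diff_d, sigma=variance_ratio ** 0.5)
    return (1.0 - males.cdf(cutoff_z)) / (1.0 - females.cdf(cutoff_z))


# Cutoffs near the 95th and 99th percentiles of the reference distribution.
for label, z in (("~95th percentile", 1.645), ("~99th percentile", 2.326)):
    for vr in (1.03, 1.15):
        for d in (0.0, 0.10):
            ratio = tail_ratio(z, mean_diff_d=d, variance_ratio=vr)
            print(f"{label}, variance ratio {vr:.2f}, d = {d:.2f}: ratio = {ratio:.2f}")
```

Under these assumptions the ratio grows as the cutoff moves further into the tail, and both the variance ratio and any small mean difference contribute to it.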
Personality: univariate vs. multivariate framing
Univariate Big Five differences are moderate: women higher on Neuroticism (d ≈ 0.40) and Agreeableness (d ≈ 0.40). Del Giudice, Booth & Irwing (2012) computed multivariate Mahalanobis D = 2.71 on 16PF data from 10,261 Americans, implying ~10% overlap between male and female personality profiles. Del Giudice et al. 2012. Hyde’s (2005) “Gender Similarities Hypothesis” — most differences trivial or small — is mathematically compatible but tells a very different qualitative story. Both univariate and multivariate framings should be reported jointly; selective use is ideological.
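The gap between the two framings is easier to see when effect sizes are converted to distribution overlap. The sketch below assumes (multivariate) normal distributions with equal covariance, so the same formulas apply to a univariate d and a Mahalanobis D. Which overlap definition underlies the ~10% figure is an assumption here; the second function, overlap relative to the joint distribution, is the one that reproduces it.

```python
from statistics import NormalDist


def overlap_coefficient(d: float) -> float:
    """Overlapping area of two equal-variance normal densities: 2 * Phi(-|d| / 2)."""
    return 2.0 * NormalDist().cdf(-abs(d) / 2.0)


def overlap_of_joint_distribution(d: float) -> float:
    """Shared area as a fraction of the total area covered by the two distributions."""
    ovl = overlap_coefficient(d)
    return ovl / (2.0 - ovl)


# d = 0.40 is a typical univariate difference above; D = 2.71 is the multivariate estimate.
for label, effect in (("univariate d = 0.40", 0.40), ("multivariate D = 2.71", 2.71)):
    print(
        f"{label}: overlap coefficient = {overlap_coefficient(effect):.2f}, "
        f"joint-distribution overlap = {overlap_of_joint_distribution(effect):.2f}"
    )
```

Reporting which overlap measure is used matters: the two definitions give noticeably different percentages for the same effect size.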
Interests: the largest sex difference in psychology
Su, Rounds & Armstrong (2009, Psych Bulletin) meta-analyzed 503,188 people: the People-Things dimension d = 0.93, with engineering interest d = 1.11. Su et al. 2009. These are very large by psychological standards and the largest in the entire literature on psychological sex differences.
Aggression
Archer (2004): physical aggression d ≈ 0.40–0.60 male; trait anger near zero. Males commit ~95% of homicides globally. Archer 2004. Indirect/relational aggression: Card et al. (2008) found sex differences trivial (d < 0.10), challenging the narrative that girls make up for lower physical aggression with markedly higher indirect aggression.
The Gender Equality Paradox (replicated; mechanism contested)
A robust empirical pattern across at least four domains: personality, preference, interest, and depression-rate differences are larger in more gender-equal and wealthier countries.
- Schmitt et al. (2008): 55-nation Big Five study — differences largest in egalitarian Western cultures. Schmitt et al. 2008.
- Falk & Hermle (2018, Science): 80,000 adults, 76 countries — sex differences in 6 economic preferences positively related to GDP and gender equality.
- Stoet & Geary (2018): the STEM Gender-Equality Paradox, in which more gender-equal countries had a smaller female share of STEM graduates. A corrigendum addressed methods; the core correlation remained robust.
The correlation is robust. The causal mechanism — innate-expression release in wealthy environments vs. measurement artifacts vs. ecological confounds — is genuinely contested.
Mental health asymmetries
Depression female:male ≈ 2:1; anorexia ~10:1 female; ADHD diagnosis ~2–3:1 male; antisocial personality, substance use, completed suicide all male-skewed; autism ~3–4:1 male; schizophrenia roughly equal but more severe early-onset in males.
Biological mechanisms
Girls with congenital adrenal hyperplasia (CAH; prenatally elevated androgens) show masculinized toy preferences and play patterns (Kung et al. 2024 meta-analysis). Sex-typed toy preferences in vervet and rhesus monkeys parallel human findings, supporting partial biological mediation. Wood & Eagly’s social role theory faces empirical challenge from the gender equality paradox.
5. Cognitive Ability and Intelligence
The g-factor
Spearman’s 1904 finding of a positive manifold, in which every cognitive test correlates positively with every other, is arguably the most replicated finding in psychology. A first unrotated principal factor captures 40–50% of variance in any sufficiently broad battery. The mutualism model of van der Maas et al. (2006) offers an alternative: g may be an emergent network property of reciprocally beneficial cognitive processes during development, not a unitary biological cause. van der Maas et al. 2006. Most working researchers treat g as a robust statistical regularity whose causal architecture is unsettled.
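The 40–50% figure follows fairly directly from a positive manifold of moderate correlations. The sketch below uses the first principal component as a stand-in for the first unrotated factor and an idealized battery in which every pair of tests correlates equally; the battery size and the correlation of 0.40 are hypothetical values chosen for illustration.

```python
def uniform_manifold(n_tests: int, r: float) -> list[list[float]]:
    """Correlation matrix in which every pair of tests correlates at r."""
    return [[1.0 if i == j else r for j in range(n_tests)] for i in range(n_tests)]


def first_component_share(corr: list[list[float]], iterations: int = 100) -> float:
    """Share of total variance captured by the first principal component.

    Uses power iteration to find the largest eigenvalue of the correlation matrix.
    """
    k = len(corr)
    vec = [1.0] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in product) ** 0.5
        vec = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        eigenvalue = sum(
            vec[i] * sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)
        )
    return eigenvalue / k  # the trace of a k x k correlation matrix is k


share = first_component_share(uniform_manifold(n_tests=10, r=0.40))
print(f"first component share: {share:.2f}")
```

For this idealized matrix the largest eigenvalue is 1 + (k − 1)r, so ten tests correlating at 0.40 give a first component carrying 46% of the variance. The calculation shows only that a strong first factor is a statistical consequence of the positive manifold; it does not distinguish a unitary cause from mutualism.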
Structure: CHC theory
Carroll’s (1993) three-stratum theory — g at top, ~8–10 broad abilities (Gf, Gc, Gv, Ga, Gs, Gsm, Glr, Gq, Grw), ~70+ narrow abilities — was integrated with Cattell-Horn into the Cattell-Horn-Carroll (CHC) framework, which underlies modern IQ tests.
Predictive validity
Schmidt & Hunter (1998): corrected GMA validity for job performance r ≈ 0.51. Sackett et al. (2022) argued corrections were too aggressive; re-estimate: r ≈ 0.31 uncorrected / ~0.42 corrected. GMA remains among the most predictive selection tools. Childhood IQ predicts educational attainment at r ≈ 0.50–0.70. Calvin et al. (2011) meta-analysis (1.1M, 22,453 deaths): each 1-SD higher childhood IQ → ~24% lower all-cause mortality. Calvin et al. 2011.
Lifespan stability
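Lothian Birth Cohort: the correlation between ability at age 11 and at age 90 is r ≈ 0.54 raw, or r ≈ 0.67 after correction for range restriction. Deary et al. 2013. On the raw correlation, about 30% of the variance in mental ability at 90 is accounted for by ability at 11 (r² ≈ 0.29); on the corrected correlation, about 45% (r² ≈ 0.45).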
Group differences in test scores: the most distorted area
Roth et al. (2001) meta-analysis (N=6.2M): U.S. Black-White cognitive ability gap d ≈ 1.0 (~15 IQ points). Dickens & Flynn (2006): Black IQ rose 4–7 points relative to whites between 1972–2002 (about one-third of the gap). Dickens & Flynn 2006. The gap exists, has narrowed somewhat, and has not closed.
The mainstream contemporary position (Nisbett et al. 2012; Turkheimer, Harden & Nisbett 2017): within-group heritability does not license between-group inferences (Lewontin’s point). Martin et al. (2019, Nature Genetics) demonstrated that PGS are roughly 4.5-fold less accurate in African-ancestry individuals than in European-ancestry individuals because of differences in LD and allele frequencies, meaning current PGS cannot validly compare mean genetic predisposition across continental ancestry groups. Mostafavi et al. (2020) showed PGS portability also breaks down within Europeans across SES strata.
The honest scientific position: gaps in test scores are real, partly narrowing, and their causes are not currently identifiable as genetic, environmental, or both — direct evidence is absent and mainstream geneticists treat the question as not currently answerable.
Distortion from the hereditarian direction: treating g-loadedness as evidence of genetic etiology (environmental causes can also be g-loaded); citing fringe admixture studies published in weak-peer-review venues; conflating absence of evidence with agnosticism. Distortion from the environmentalist direction: claiming gaps have closed when they only partly narrowed; dismissing IQ as “culturally biased” despite measurement-invariance evidence; overstating stereotype threat (Flore & Wicherts 2015 meta-analysis showed publication-biased modest effects).
Brain correlates
Brain volume × IQ: r ≈ 0.24 (Pietschnig et al. 2015, 2022). P-FIT theory (Jung & Haier 2007): intelligence supported by parieto-frontal network. Jung & Haier 2007.
Creativity and intelligence
The “IQ ≈ 120 threshold” hypothesis is largely disconfirmed (Weiss et al. 2020). Intelligence and creativity correlate ~r = 0.20–0.30 across the range. Openness to Experience is the personality trait most reliably correlated with creative achievement (~0.30–0.40).
6. Personality and Temperament
The Big Five (OCEAN) and HEXACO
The Big Five emerged from the lexical hypothesis. Heritability is ~40–60% per twin studies; SNP-h² is 8–18%. Nagel et al. (2018) identified 136 loci for neuroticism in 449,484 people. Nagel et al. 2018. The ReGPC consortium (2025) reports 703 loci for neuroticism in 1M+ participants. ReGPC 2025.
Roberts & DelVecchio (2000): rank-order stability rises from ~0.31 in childhood to ~0.74 by midlife (cumulative continuity). Roberts & DelVecchio 2000. Roberts et al. (2006) documented the maturity principle: mean-level increases in Conscientiousness, Agreeableness, and Emotional Stability with age, especially in young adulthood. Bleidorn et al. 2022 update.
HEXACO (Ashton & Lee): lexical studies in 12+ languages consistently yield six factors, the sixth being Honesty-Humility. H predicts integrity-related criteria incrementally over Big Five. Ashton & Lee 2008.
Temperament: the developmental foundation
Temperament research constitutes a parallel tradition to adult personality, focused on biologically-grounded individual differences emerging in infancy.
Rothbart’s model identifies three overarching dimensions: Surgency/Extraversion (activity, positive affect, approach), Negative Affectivity (fear, anger, sadness, discomfort), and Effortful Control (attentional regulation, inhibitory control, low-intensity pleasure). Effortful Control is particularly important — it is the self-regulatory component of temperament, developing primarily during ages 2–7 as the anterior attention network matures, and is a strong predictor of later externalizing problems, academic success, and conscience development.
Kagan’s Behavioral Inhibition (BI) framework focuses on extreme phenotypes: ~15–20% of infants show high-reactive patterns (vigorous motor activity and distress to novel stimuli at 4 months), and many of them become behaviorally inhibited toddlers who are cautious and avoidant with unfamiliar people and situations. BI maps approximately onto low Surgency + high Negative Affectivity (especially fear). Kagan’s longitudinal studies showed BI is moderately heritable (~50%), associated with higher resting heart rate and amygdala excitability, and predicts elevated risk for social anxiety disorder in adolescence (OR ~2–4). However, ~60% of high-reactive infants do not become clinically anxious adults: biology is a foundation, not a constraint.
Thomas & Chess’s (1977) “goodness of fit” model — later empirically supported — emphasized that temperamental difficulty per se doesn’t predict poor outcomes; the match between child temperament and environmental demands does.
The temperament → personality continuity is increasingly well-documented: infant Surgency maps onto adult Extraversion; infant Negative Affectivity onto Neuroticism; infant Effortful Control onto Conscientiousness. The mapping is imperfect — adult personality includes social-cognitive layers (identity, values, narrative) absent in temperament.
Cross-cultural universality
McCrae & Terracciano (2005): clean Big Five replication in 50 cultures. McCrae & Terracciano 2005. Gurven et al. (2013) challenged this with the Tsimane forager-horticulturalists, where the full Big Five did not robustly emerge. Gurven et al. 2013. Consensus: 3 factors (E, A, C) replicate cross-linguistically; the full Big Five replicates well in Indo-European languages; non-WEIRD samples sometimes show structural deviations.
Dark traits and the D factor
Paulhus & Williams (2002): the Dark Triad (Machiavellianism, narcissism, psychopathy). Buckels, Jones & Paulhus (2013) added everyday sadism. Moshagen, Hilbig & Zettler (2018) proposed the D factor — a general tendency to maximize individual utility while disregarding others — as the common core, mapping strongly onto low Honesty-Humility. Moshagen et al. 2018.
Person-situation debate: resolved
The Mischel (1968) critique — cross-situational consistency rarely exceeds r ≈ 0.30 — was resolved through aggregation, interactionism (Mischel & Shoda’s CAPS model), and Fleeson’s within-person variability framework. The modern consensus: persons, situations, and their interactions all matter.
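The aggregation argument can be illustrated with the Spearman-Brown formula, which gives the reliability of an average of k observations from the average correlation between single observations. Treating 0.30 as that average correlation is an assumption made purely for illustration (the critique's figure refers to consistency coefficients more generally), and the function name is hypothetical.

```python
def aggregated_reliability(mean_r: float, k: int) -> float:
    """Spearman-Brown: reliability of the mean of k observations with average intercorrelation mean_r."""
    return k * mean_r / (1.0 + (k - 1) * mean_r)


# Assumed average correlation between single behavioral observations.
single_occasion_r = 0.30
for k in (1, 5, 10, 20):
    print(f"{k:2d} observations aggregated: {aggregated_reliability(single_occasion_r, k):.2f}")
```

On these assumptions, a trait that looks weak in any single situation becomes highly consistent once behavior is averaged over many occasions, which is the core of the aggregation response to the critique.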
Personality predicts outcomes as strongly as IQ and SES
Roberts et al. (2007) “The Power of Personality”: meta-analytic comparison shows personality effects on mortality, divorce, and occupational attainment are indistinguishable in magnitude from SES and cognitive ability effects. Roberts et al. 2007. Conscientiousness predicts mortality through health behaviors with large effect size. Bogg & Roberts 2004.
Recent theoretical developments
DeYoung’s Cybernetic Big Five Theory (2015): traits as parameters of a cybernetic goal-pursuit system. DeYoung 2015. Mõttus et al. (2017): “personality nuances” research argues item-level traits capture incremental valid variance below the facet level. Mõttus et al. 2017.
7. Neurodiversity and Psychopathology: Dimensional, Polygenic, Transdiagnostic
The genomic era has produced three conclusions that fundamentally reshape psychiatric nosology: all major psychiatric conditions are highly heritable, hyper-polygenic, and substantially genetically overlapping across diagnostic categories.
Headline findings by disorder
- Schizophrenia: twin h² ~80%; Trubetskoy et al. (2022, Nature): 287 loci; SNP-h² ~24%. Trubetskoy et al. 2022. Environmental risk factors: urban birth (~2× risk), high-potency cannabis (OR ~3.9), migration, obstetric complications.
- Bipolar: twin h² ~70–85%; Mullins et al. (2021): 64 loci; rg(SCZ,BD) ~0.7. Mullins et al. 2021.
- Major Depression: h² ~37%; Howard et al. (2019): 102 loci; SNP-h² ~9%. Howard et al. 2019. Strong rg with neuroticism (~0.7).
- ADHD: twin h² ~74%; Demontis et al. (2023): 27 loci. Demontis et al. 2023. Negative rg with educational attainment and IQ.
- Autism: twin h² ~80%; Grove et al. (2019): 5 common-variant loci plus substantial rare/de novo variants of large effect (CHD8, SCN2A, SYNGAP1). Grove et al. 2019. Common-variant PGS positively correlated with IQ and education; ID-comorbid autism (rare-variant-driven) negatively correlated.
Cross-disorder pleiotropy (with assortative mating caveat)
Brainstorm Consortium (2018, Science): substantial genetic correlations among psychiatric disorders. Cross-Disorder PGC (Lee et al. 2019, Cell): across 8 disorders, 109 pleiotropic loci, three clusters — compulsive, mood/psychotic, early-onset neurodevelopmental. Lee et al. 2019.
Critical caveat: Border et al. (2022, Science) showed that cross-trait assortative mating can generate spurious genetic correlations between phenotypes with entirely distinct genetic bases. Some fraction of reported psychiatric cross-disorder genetic correlations may reflect xAM rather than shared biology. The magnitude of this artifact is actively being quantified and represents a major revision in progress.
The p-factor
Caspi et al. (2014): a single p (general psychopathology) factor fit Dunedin cohort data better than three-factor models — analogous to g for cognitive ability. Caspi et al. 2014. Higher p associated with greater impairment, familiality, worse developmental histories. Replicated in dozens of samples. Interpretations contested: genuine common liability, statistical artifact of bifactor over-extraction, or a reflection of impairment/distress per se.
Dimensional alternatives: HiTOP and RDoC
HiTOP (Kotov et al. 2017): a quantitatively-derived dimensional alternative to DSM organized hierarchically. RDoC (NIMH 2009–): six dimensional neurobiologically-grounded research domains. Both converge with taxometric evidence (most psychopathology is dimensional, not taxonic) on the dimensional turn in psychiatric science.
Polygenic scores in clinics: not yet
Best PGS R² ~7–10% for schizophrenia. PGS alone does not outperform family history. PGS performance drops 50–70% in non-European-ancestry populations — a major equity and portability problem.
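The portability figures above can be combined into a rough range. The sketch below assumes the 50–70% drop is a relative reduction in R² (the text does not say whether it is relative or absolute), and `ported_r2` is a hypothetical helper name used only for illustration.

```python
def ported_r2(r2_source: float, relative_drop: float) -> float:
    """R² remaining after a relative drop in predictive accuracy.

    Assumption: the drop is relative (a fraction of the source R²),
    not a number of percentage points.
    """
    return r2_source * (1.0 - relative_drop)

# Worst case: lowest source R² (7%) with the largest drop (70%).
low = ported_r2(0.07, 0.70)
# Best case: highest source R² (10%) with the smallest drop (50%).
high = ported_r2(0.10, 0.50)

print(f"Implied R² outside European ancestry: {low:.3f} to {high:.3f}")
```

Under that reading, the schizophrenia PGS would explain roughly 2–5% of variance in non-European-ancestry samples.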
The neurodiversity framework: scientific–identity tensions
Coined by Singer (1998), the neurodiversity paradigm reframes autism, ADHD, dyslexia as natural variation rather than pathology. The framework has legitimate ethical force but operates in tension with deficit-oriented findings for severe presentations (profound autism with ID, epilepsy, self-injury). A defensible position recognizes both the reality of impairment at the severe end and the population-level continuous variation that grades into normality.
8. Key Researchers and Labs
| Researcher | Affiliation | Central contribution |
|---|---|---|
| Robert Plomin | King’s College London | Behavioral genetics synthesis; Blueprint; GPS |
| Eric Turkheimer | University of Virginia | Three Laws; Scarr-Rowe; philosophical foundations |
| K. Paige Harden | UT Austin | Genetic Lottery; causal inference with PGS |
| Avshalom Caspi / Terrie Moffitt | Duke / King’s | p-factor; Dunedin cohort; original candidate-GxE studies |
| Ian Deary | Edinburgh | Lothian Birth Cohorts; cognitive epidemiology |
| Elliot Tucker-Drob | UT Austin | Education-IQ meta-analysis; Wilson Effect mechanisms |
| Daniel Benjamin / SSGAC | UCLA | EA GWAS consortium; social-science genomics |
| Colin DeYoung | Minnesota | Cybernetic Big Five Theory; personality neuroscience |
| Jay Belsky | UC Davis | Differential susceptibility |
| Marco Del Giudice | UNM | Multivariate sex differences |
| Janet Hyde | Wisconsin | Gender similarities hypothesis |
| David Geary | Missouri | Sex differences in math/STEM |
| Brent Roberts | UIUC | Personality development; maturity principle |
| Alexander Young | UCLA | Genetic nurture; within-family methods |
| Richard Border | Harvard/UCLA | Candidate gene demolition; xAM |
| Peter Hatemi / John Hibbing | Penn State / Nebraska | Genopolitics; heritability of political attitudes |
9. The Integrated Picture: What Generates Psychological Variation
The model
A formal model of individual psychological variation should treat the person as the joint product of:
(a) A hyper-polygenic genome encoding thousands of small-effect predispositions (plus some rare large-effect variants in neurodevelopmental conditions). Twin h² for most traits falls in 0.40–0.80.
(b) Substantial gene-environment correlation through passive (parents transmit genes + correlated environments), evocative (child traits elicit responses), and active (niche-picking) channels. Roughly half of population-level PGS prediction reflects indirect/environmental mediation by genetically-similar parents, not direct genetic causation.
(c) Assortative mating inflating additive genetic variance, genetic correlations between traits, and PGS prediction accuracy. This is a recently-quantified source of systematic bias in nearly all genetic estimates.
(d) A small set of large-effect environmental insults — lead, severe iodine deficiency, heavy prenatal alcohol, severe deprivation — plus schooling (~3.4 IQ points/year). Effects are typically asymmetric: removing severe deficits matters more than enriching above-normal environments.
(e) Substantial stochastic developmental noise — the dominant source of the non-shared environment, which accounts for ~50% of personality variance and is not yet well-characterized mechanistically.
(f) Cultural/institutional contexts that modulate which genetic predispositions are expressed and rewarded (WEIRD effects, gender equality paradox, Scarr-Rowe interaction, Flynn Effect).
(g) Developmental unfolding across time — temperament in infancy (biologically grounded reactivity and regulation) becomes personality in adulthood (adding social-cognitive layers), with heritability increasing across the lifespan (Wilson Effect) and rank-order stability rising to ~0.74 by midlife.
Where political distortion is strongest, by direction
From the environmentalist/blank-slate direction: dismissing twin study validity wholesale; overstating Scarr-Rowe; promoting transgenerational epigenetic narratives that exceed evidence; dismissing IQ as culturally biased despite measurement-invariance findings; overstating stereotype-threat magnitudes; minimizing the gender equality paradox.
From the hereditarian direction: citing within-population heritability to license between-population genetic inferences; citing fringe admixture studies as if mainstream; treating g-loadedness of gaps as evidence of genetic etiology when environmental causes can also be g-loaded; ignoring the assortative mating and genetic nurture corrections to PGS.
From the “gender similarities” direction: selective citation of d ≈ 0.05 for math to imply no differences anywhere; obscuring multivariate D ≈ 2.71 with univariate framing; minimizing d ≈ 0.93 people-things interest differences.
From popular evolutionary psychology: treating dimensional differences as taxonic; extrapolating from small ds to categorical claims; overgeneralizing from specific tasks to broad domain claims.
Open questions worth modeling
- Mechanistic interpretation of PGS: Plomin’s “causal genetic” view vs. Turkheimer’s “weak genetic explanation” — genuinely open.
- Flynn Effect reversal: cause unknown; one of the most important open questions in differential psychology.
- Gender equality paradox mechanism: innate-expression release vs. measurement artifacts vs. wealth confounds — unsettled.
- Between-population cognitive differences: currently scientifically unanswerable (PGS portability too poor; cross-ancestry GWAS at scale don’t exist). Honest position: unresolved, not settled in either direction.
- The causal architecture of g: latent common cause vs. emergent network property (mutualism) — the positive manifold is not in dispute; what generates it is.
- What non-shared environment actually is: stochastic noise, epigenetic variation, immune/microbial variation, differential peer networks — largely uncharacterized despite accounting for ~50% of personality variance.
- Assortative mating correction magnitudes: how much do AM and xAM corrections change the substantive picture of genetic architecture and cross-trait pleiotropy? Active area of revision.
10. Load-Bearing Assumptions and Falsification Conditions
This section makes explicit which conclusions in this review depend on which assumptions, and what evidence would substantially revise or flip them. Ordered roughly by how much of the document’s picture collapses if the assumption fails.
Assumption 1: The twin method provides approximately valid variance decomposition
What depends on it: Nearly all h² estimates in Section 1’s table, the C ≈ 0 finding for adult personality, the Wilson Effect, the Scarr-Rowe interaction.
Status: Approximately valid. SNP-h² estimates (which bypass EEA entirely) give lower but still substantial heritability for every trait measured. MZ-reared-apart designs converge with MZ-reared-together. Felson (2014) estimates ~10–20% EEA-induced inflation, not enough to eliminate the core finding.
What would flip it: SNP-h² for psychological traits systematically converging on <0.05 (would suggest twin h² is mostly EEA artifact). Or: a large, well-powered MZ-reared-apart study finding IQ correlations <0.40 (current estimates ~0.70). Neither has occurred.
Robustness verdict: HIGH. The convergence of twin, adoption, and molecular methods on moderate-to-substantial heritability is the most replicated finding in the field.
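For readers who want the arithmetic behind a twin-based variance decomposition, here is a minimal sketch using Falconer's formulas. It assumes the classical ACE model (additive genetic effects only, equal environments for MZ and DZ pairs, no assortative mating), and the twin correlations are hypothetical values chosen for illustration, not estimates from any study cited here.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Falconer-style ACE estimates from MZ and DZ twin correlations."""
    a2 = 2.0 * (r_mz - r_dz)  # A: additive genetic share (h²)
    c2 = 2.0 * r_dz - r_mz    # C: shared-environment share
    e2 = 1.0 - r_mz           # E: non-shared environment plus measurement error
    return {"A": a2, "C": c2, "E": e2}

# Hypothetical correlations (illustrative only).
est = falconer_ace(r_mz=0.50, r_dz=0.25)
print(est)  # {'A': 0.5, 'C': 0.0, 'E': 0.5}
```

With an MZ correlation exactly twice the DZ correlation, the shared-environment term is zero, which is the pattern behind the C ≈ 0 finding. Any violation of the equal-environments assumption that raises the MZ correlation would inflate A in this formula.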
Assumption 2: GWAS identifies real genetic signal (not just population structure and AM artifacts)
What depends on it: The entire PGS enterprise, genetic nurture estimates, cross-disorder pleiotropy findings, the “missing heritability” narrative.
Status: Substantially valid but with known inflation. Within-family PGS effects are non-zero for educational attainment (~half of population effects), meaning direct genetic signal exists. But the magnitude of AM and stratification inflation is still being quantified.
What would flip it: Within-family PGS effects for most traits converging on ~zero (would mean population-level PGS prediction is entirely indirect/environmental). Current evidence: within-family effects are reduced but clearly non-zero for EA, BMI, height; less well-characterized for personality and psychiatric traits.
Robustness verdict: MODERATE-HIGH for the existence of direct genetic effects; MODERATE for their precise magnitude, which is actively being revised downward.
Assumption 3: g is a real dimension of individual variation (not a measurement artifact)
What depends on it: The entire intelligence section (Section 5), predictive validity claims, group-difference discussions, the CHC structure.
Status: The positive manifold is among the most replicated findings in psychology. Whether g is a latent common cause or an emergent network property (mutualism) is unsettled, but both interpretations preserve g’s predictive validity and the meaningfulness of individual differences in general cognitive ability.
What would flip it: A sufficiently broad, well-constructed cognitive battery where the first principal component explains <15% of variance (would undermine the positive manifold). Or: successful interventions that consistently raise one cognitive ability while lowering others (would violate the manifold’s structure). Neither has been demonstrated.
Robustness verdict: HIGH for g as a statistical regularity with predictive validity. MODERATE for g as a unitary biological mechanism (mutualism remains a viable alternative).
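The first falsification condition can be made concrete with a small calculation. The sketch below computes the share of variance captured by the first principal component of a test correlation matrix. The uniform correlation of 0.3 among ten tests is a hypothetical toy battery, not real data, and `first_pc_share` and `uniform_corr` are illustrative names.

```python
import numpy as np

def first_pc_share(corr: np.ndarray) -> float:
    """Share of total variance captured by the first principal component."""
    eigvals = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order
    return float(eigvals[-1] / eigvals.sum())

def uniform_corr(n_tests: int, r: float) -> np.ndarray:
    """Toy correlation matrix: every pair of tests correlates at r."""
    return np.full((n_tests, n_tests), r) + (1.0 - r) * np.eye(n_tests)

# Hypothetical battery with a positive manifold (all correlations 0.3).
share_manifold = first_pc_share(uniform_corr(10, 0.3))
# Hypothetical battery with no manifold (tests mutually uncorrelated).
share_null = first_pc_share(np.eye(10))

print(round(share_manifold, 2), round(share_null, 2))  # 0.37 0.1
```

In this toy case, ten uncorrelated tests give a first-component share of 10%, so a share below 15% in a broad battery would sit close to what the absence of a positive manifold looks like.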
Assumption 4: Sex-difference effect sizes from meta-analyses are not primarily measurement artifacts
What depends on it: The gender equality paradox, the claim that interest differences (d = 0.93) are among psychology’s largest, the multivariate personality finding (D = 2.71).
Status: Interest measures (Su et al. 2009) use well-validated instruments; the d = 0.93 holds across inventories and cultures. The Del Giudice multivariate D is sensitive to the number of variables included and the specific battery, though the qualitative finding (large multivariate difference despite moderate univariate ds) is robust across datasets. CAH and non-human primate evidence provides independent convergent support for biological mediation of interest differences.
What would flip it: A large cross-cultural study using behavioral (not self-report) interest measures finding d < 0.30 for people-things. Or: evidence that the gender equality paradox disappears when using non-self-report personality measures (reference-group effects could inflate self-report differences in egalitarian countries). Current evidence: Falk & Hermle (2018) used incentivized behavioral measures for some preferences and found the paradox held, but full behavioral replication across all domains is incomplete.
Robustness verdict: HIGH for the existence of substantial sex differences in interests and aggression. MODERATE for the precise magnitude of multivariate personality differences. MODERATE for the gender equality paradox’s causal interpretation.
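The relationship between univariate d values and a multivariate D can be illustrated with the Mahalanobis distance, D = sqrt(dᵀ R⁻¹ d), where R is the pooled within-group correlation matrix of the traits. This is a minimal sketch: the d values and correlations below are hypothetical and are not the inputs behind D = 2.71.

```python
import numpy as np

def mahalanobis_d(d, corr) -> float:
    """Multivariate effect size from standardized mean differences d
    and a pooled within-group correlation matrix corr."""
    d = np.asarray(d, dtype=float)
    corr = np.asarray(corr, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(corr, d)))

# Two hypothetical traits with moderate, opposite-signed differences
# and a positive correlation between the traits.
D_two = mahalanobis_d([0.5, -0.5], [[1.0, 0.5], [0.5, 1.0]])
print(round(D_two, 3))  # 1.0

# Sensitivity to battery size: k uncorrelated traits, each with d = 0.3.
D_by_k = {k: mahalanobis_d([0.3] * k, np.eye(k)) for k in (5, 15)}
print({k: round(v, 2) for k, v in D_by_k.items()})
```

Two points follow from the toy numbers. Moderate univariate differences can combine into a large D, and D grows as variables are added, which is why the estimate depends on the size and composition of the battery.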
Assumption 5: The candidate-GxE collapse generalizes — specific gene × environment interactions are mostly small or nonexistent
What depends on it: Section 3’s dismissal of 5-HTTLPR and similar findings, the shift toward polygenic approaches.
Status: For candidate genes, the collapse is definitive (Border et al. 2019). But this does not necessarily mean polygenic-score × environment interactions are also null. PGS × environment work is younger, uses better methods, and could in principle yield robust results.
What would flip it: Multiple large, pre-registered PGS × measured-environment studies showing robust, replicable interactions explaining >5% of variance. Current evidence: a few suggestive findings (PGS-for-education × compulsory schooling reforms) but nothing approaching the scale or replication needed for confidence.
Robustness verdict: HIGH for the candidate-gene collapse. LOW-MODERATE for the broader claim that specific GxE interactions are generally small: this is an extrapolation from the candidate-gene failure, and the polygenic GxE literature is too young to support strong conclusions.
Assumption 6: Cross-disorder genetic correlations reflect shared biology (pleiotropy)
What depends on it: The p-factor interpretation, HiTOP structure, the “dimensional turn” in psychiatry, transdiagnostic treatment rationales.
Status: Substantially challenged by Border et al. (2022). Cross-trait assortative mating can generate spurious genetic correlations between traits with entirely distinct genetic bases. The R² = 74% finding means most of the variance in genetic correlation estimates tracks spousal phenotypic correlations — though this does not prove all genetic correlations are spurious (some genuine pleiotropy surely exists).
What would flip it: Within-family designs showing that cross-disorder genetic correlations survive AM correction at >50% of current estimates. Or: identification of specific shared biological pathways (e.g., synaptic pruning variants affecting both SCZ and BD) that don’t depend on LD induced by AM.
Robustness verdict: MODERATE. The dimensional/transdiagnostic pattern is likely real but inflated. The magnitude of genuine pleiotropy vs. AM artifact is one of the field’s most active methodological debates.
11. Toward Topology: Structure for the Next Phase
This section identifies the natural graph/network structure embedded in this literature, to facilitate the transition from landscape analysis to formal topology mapping.
Natural node types
- Trait nodes: Cognitive abilities (g, Gf, Gc, Gv, Gs…), personality dimensions (Big Five/HEXACO factors and facets), temperament dimensions (Surgency, Negative Affectivity, Effortful Control), psychopathology spectra (internalizing, externalizing, thought disorder), interests (people-things, RIASEC), political/moral attitudes
- Mechanism nodes: Genetic architecture (common polygenic, rare large-effect, de novo), environmental factors (lead, iodine, schooling, deprivation, neighborhoods), developmental processes (critical periods, niche-picking, genetic nurture, AM), stochastic noise
- Method nodes: Twin studies, adoption studies, GWAS, PGS, within-family designs, Mendelian randomization, meta-analysis
- Population-level modifier nodes: SES (Scarr-Rowe), culture (WEIRD), gender equality index, historical period (Flynn Effect)
Natural edge types
- Genetic correlations (with AM caveat): e.g., rg(SCZ, BD) ≈ 0.7; rg(EA, IQ) ≈ 0.7; rg(neuroticism, MDD) ≈ 0.7
- Developmental continuity: temperament → personality (Surgency → Extraversion; Effortful Control → Conscientiousness)
- Causal environmental effects: lead → IQ (−6.2 pts per 10 µg/dL); schooling → IQ (+3.4 pts/year)
- Predictive validity edges: g → job performance (r ≈ 0.42); Conscientiousness → mortality; EA PGS → income
- Methodological dependency: twin h² → SNP-h² → PGS R² (each constraining the next)
- Taxonomic hierarchy: g → broad abilities → narrow abilities (CHC); p → spectra → subfactors → syndromes (HiTOP)
- Moderation edges: SES × heritability (Scarr-Rowe); gender equality × sex differences (GEP); age × heritability (Wilson Effect)
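One possible encoding of these edge types is a typed edge list. The sketch below uses only values stated in the list above; the class name, field names, and edge-kind labels are hypothetical choices, not a fixed schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str                        # edge type, e.g. "genetic_correlation"
    weight: Optional[float] = None   # None where the text gives no number
    note: str = ""                   # units or caveats

AM_CAVEAT = "may be inflated by assortative mating"

EDGES: List[Edge] = [
    Edge("SCZ", "BD", "genetic_correlation", 0.7, AM_CAVEAT),
    Edge("EA", "IQ", "genetic_correlation", 0.7, AM_CAVEAT),
    Edge("neuroticism", "MDD", "genetic_correlation", 0.7, AM_CAVEAT),
    Edge("Surgency", "Extraversion", "developmental_continuity"),
    Edge("Effortful Control", "Conscientiousness", "developmental_continuity"),
    Edge("lead", "IQ", "causal_environmental", -6.2, "IQ points per 10 µg/dL"),
    Edge("schooling", "IQ", "causal_environmental", 3.4, "IQ points per year"),
    Edge("g", "job performance", "predictive_validity", 0.42, "correlation r"),
]

def group_by_kind(edges: List[Edge]) -> Dict[str, List[Edge]]:
    """Index edges by type so each layer of the graph can be queried alone."""
    grouped: Dict[str, List[Edge]] = {}
    for edge in edges:
        grouped.setdefault(edge.kind, []).append(edge)
    return grouped

by_kind = group_by_kind(EDGES)
for kind, items in by_kind.items():
    print(kind, len(items))
```

Keeping the caveat on the edge itself means the assortative-mating warning travels with every genetic correlation wherever that edge is used.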
Key structural features for the graph
- Two parallel hierarchies (CHC for cognition, HiTOP for psychopathology) that share genetic correlations at the top level (g correlates with p inversely)
- A developmental cascade from temperament (infancy) through personality (adulthood) through outcomes (mortality, income, relationships), with heritability increasing and shared-environment decreasing across the lifespan
- A methodological funnel from twin estimates (broadest, highest h²) through molecular estimates (narrower, lower h²) through within-family estimates (narrowest, lowest but most causally clean)
- Cross-domain genetic correlations that form a web connecting cognition, personality, and psychopathology — but with the critical caveat that an unknown fraction may be AM artifact rather than biological pleiotropy
Highest-leverage next steps for topology phase
- Build the trait correlation matrix: Assemble published genetic correlations (from LD Score regression / GWAS) among the ~20–30 most well-characterized traits spanning cognition, personality, and psychopathology. Annotate each with AM-corrected estimates where available. This matrix is the empirical backbone of the topology.
- Map the developmental cascade: Create a directed graph from temperament → personality → outcomes with age-indexed heritability and stability coefficients as edge weights. This captures the time dimension that a static correlation matrix misses.
- Formalize the variance decomposition: For each major trait, create a standardized decomposition: [direct genetic] + [genetic nurture/indirect] + [AM-induced] + [shared environment] + [measured non-shared environment] + [stochastic residual]. Where values are unknown, flag them explicitly. This is the generating function skeleton that the formalization phase will flesh out.
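The standardized decomposition could be represented as a record in which unknown components are explicitly `None` rather than silently zero. This is a sketch under stated assumptions: the field names mirror the bracketed terms above, the class and method names are hypothetical, and the only value filled in is the C ≈ 0 finding for adult personality.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass
class VarianceDecomposition:
    trait: str
    direct_genetic: Optional[float] = None
    genetic_nurture: Optional[float] = None
    am_induced: Optional[float] = None
    shared_env: Optional[float] = None
    measured_nonshared_env: Optional[float] = None
    stochastic_residual: Optional[float] = None

    def unknown_components(self) -> List[str]:
        """Names of components that still need an estimate."""
        return [f.name for f in fields(self)
                if f.name != "trait" and getattr(self, f.name) is None]

    def known_total(self) -> float:
        """Sum of the variance shares that have been filled in."""
        return sum(getattr(self, f.name) for f in fields(self)
                   if f.name != "trait" and getattr(self, f.name) is not None)

# Illustrative record: only the shared-environment share is set (C ≈ 0).
adult_personality = VarianceDecomposition(trait="adult personality", shared_env=0.0)
print(adult_personality.unknown_components())
print(adult_personality.known_total())
```

Separating "unknown" from "zero" keeps a missing estimate from being read as a null finding once the records are aggregated across traits.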