## Key Ideas
> [!abstract] Core concepts
>
> - **Irreproducibility across fields**: A large proportion of influential studies fail replication attempts in multiple scientific disciplines
> - **Begley's landmark study**: Only 6 of 53 landmark preclinical cancer studies could be reproduced, revealing systemic research problems
> - **Educational psychology affected**: Social sciences experience high rates of replication failure due to methodological issues and publication pressures
## Definition
The **replication crisis** refers to the failure to reproduce published research findings across multiple scientific fields. This pattern of non-replication affects the reliability of scientific literature, particularly in social and educational research.
## Connected to
[[What Research can you Trust]] | [[Correlation does not equal causation]] | [[Logical Fallacies]]
---
## Scale of the problem
> [!info] Begley & Ellis (2012)
> C. Glenn Begley, then head of global cancer research at Amgen, attempted to reproduce 53 landmark preclinical cancer studies from top journals before using the findings for drug development. Only 6 of the 53 studies could be replicated. When confronted, one scientist admitted conducting the experiment six times and reporting only the favourable result.
This 11% replication rate points to a systemic credibility problem. Replication verifies that findings hold across different settings, times, and researchers. Educational research faces additional challenges because classroom environments are complex and confounding variables are difficult to control.
## Causes of the crisis
The replication crisis results from incentive structures that reward novel findings over verification, combined with methodological weaknesses and insufficient transparency.
Academic career structures create pressure to publish novel, positive results. Grants reward discovering new findings rather than verifying existing research. Journals preferentially publish positive results, which incentivises researchers to manipulate data until achieving favourable outcomes. Career advancement requires publications in high-impact journals, which favour statistically significant positive findings.
These pressures interact with methodological problems to produce unreliable findings. Common issues include flawed experimental designs, insufficient statistical power, and p-hacking (manipulating data or analyses to achieve statistical significance). Researchers often conduct multiple analyses and report only favourable results.
In educational psychology, p-values often fall close to the 0.05 significance threshold. Small changes in analytical approach or sample composition can shift results from "significant" to "non-significant."
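The inflation described above can be quantified with a short simulation. Under a true null hypothesis, p-values are uniformly distributed on (0, 1), so a researcher who runs several analyses and keeps only the best one has a much higher than 5% chance of a "significant" result. This is an illustrative sketch, not a model of any specific study:

```python
import random

random.seed(42)

def false_positive_rate(n_tests, n_studies=100_000, alpha=0.05):
    """Fraction of null 'studies' that report p < alpha when the
    researcher runs n_tests analyses and keeps only the smallest p.
    Under a true null, each p-value is uniform on (0, 1)."""
    hits = 0
    for _ in range(n_studies):
        best_p = min(random.random() for _ in range(n_tests))
        if best_p < alpha:
            hits += 1
    return hits / n_studies

# One pre-specified test stays near the nominal 5% error rate;
# cherry-picking the best of 10 analyses inflates it to roughly 40%,
# matching the analytic value 1 - 0.95**10.
print(f" 1 test : {false_positive_rate(1):.1%}")
print(f"10 tests: {false_positive_rate(10):.1%}")
```

This is why small changes in analytical approach can flip a result across the 0.05 threshold: with enough analytical flexibility, a "significant" finding is almost guaranteed even when no real effect exists.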
## Implications for education
The replication crisis requires scepticism towards individual studies and preference for converging evidence from multiple independent sources. Educators need skills in research evaluation (see [[What Research can you Trust]]) to distinguish reliable findings from unreliable ones. Educational interventions should have independent verification through multiple replications before implementation. Studies with multiple successful replications provide stronger evidence than single groundbreaking findings.
## The connectivity principle
Even when direct empirical evidence for a particular educational practice is lacking, the connectivity principle provides a way to evaluate its plausibility (Stanovich & Stanovich, 2003). The question to ask is: "How is the theory behind this method connected to the research consensus in the literature surrounding this curriculum area?"
For example, consider two hypothetical treatments for children with extreme reading difficulties, neither of which has been directly tested. Treatment A involves a training programme designed to facilitate awareness of the segmental nature of language at the phonological level. Treatment B involves giving children training in vestibular sensitivity by having them walk on balance beams whilst blindfolded.
Both treatments are equal in that neither has had a direct empirical test of efficacy. However, Treatment A has the edge when it comes to connectivity. Treatment A makes contact with a broad consensus in the research literature that children with extraordinary reading difficulties are hampered because of insufficiently developed awareness of the segmental structure of language. Treatment B is not connected to any corresponding research literature consensus. Reason dictates that Treatment A is a better choice, even though neither has been directly tested.
### Meta-analysis as solution
Meta-analysis has emerged as a powerful tool for addressing the replication crisis (Stanovich & Stanovich, 2003). By combining data from multiple studies, meta-analysis creates pools large enough to eliminate much of the statistical uncertainty that plagues individual trials. Clear findings can emerge from a group of studies whose findings are scattered all over the map.
The emphasis on meta-analysis has often revealed that we have more stable and useful findings than a perusal of conflicting journal articles would suggest. It is particularly useful for settling what would otherwise remain "he-said, she-said" debates, dampening the contentious arguments over conflicting studies that plague education and other behavioural sciences.
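The core mechanism of meta-analysis can be sketched with standard inverse-variance (fixed-effect) pooling: each study's effect size is weighted by the precision of its estimate, so larger, tighter studies count for more. The effect sizes below are hypothetical, invented purely for illustration:

```python
import math

def fixed_effect_meta(effects, std_errors):
    """Inverse-variance (fixed-effect) pooling of effect sizes.
    Studies with smaller standard errors receive more weight."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect sizes (Cohen's d) from five small trials of the
# same intervention -- individually noisy, some "significant", some not.
effects = [0.42, -0.05, 0.31, 0.55, 0.12]
std_errors = [0.20, 0.25, 0.18, 0.30, 0.22]

d, se = fixed_effect_meta(effects, std_errors)
low, high = d - 1.96 * se, d + 1.96 * se
print(f"pooled d = {d:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Although the five individual results appear scattered, the pooled estimate has a much narrower confidence interval than any single study, which is how a clear finding can emerge from studies "scattered all over the map". Real syntheses would also test for heterogeneity and consider random-effects models.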
## References
Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. *Nature*, 533(7604), 452-454. https://doi.org/10.1038/533452a
Begley, C. G., & Ellis, L. M. (2012). Raise standards for preclinical cancer research. *Nature*, 483(7391), 531-533. https://doi.org/10.1038/483531a
Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. *PLOS Biology*, 13(3), e1002106. https://doi.org/10.1371/journal.pbio.1002106
Ioannidis, J. P. A. (2005). Why most published research findings are false. *PLOS Medicine*, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
Makel, M. C., Plucker, J. A., & Hegarty, B. (2012). Replications in psychology research: How often do they really occur? *Perspectives on Psychological Science*, 7(6), 537-542. https://doi.org/10.1177/1745691612460688
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. *Science*, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. *Psychological Science*, 22(11), 1359-1366. https://doi.org/10.1177/0956797611417632
Stanovich, P. J., & Stanovich, K. E. (2003). *Using research and reason in education: How teachers can use scientifically based research to make curricular and instructional decisions*. National Institute for Literacy.