Understanding how selection bias impacts our ability to learn from health data
Article Title: A framework for understanding selection bias in real-world healthcare data Authors & Year: R. Kundu, X. Shi, J. Morrison, J. Barrett, and B. Mukherjee (2024)Journal: Journal of the Royal Statistical Society Series A: Statistics in Society [DOI:10.1093/jrsssa/qnae039]Review Prepared by Peter A. Gao Electronic health record (EHR) databases compile hundreds of thousands, or even millions, of patients’ medical histories, enabling researchers to study large populations and observe how their health evolves over time. The databases present an opportunity to identify risk factors for certain diseases, evaluate the efficacy of treatments for people of different backgrounds, and map health disparities. However, individuals are rarely included in such datasets at random, meaning the observed sample may not be representative of the target population. If certain groups are underrepresented in EHR data, using it to measure the prevalence of a condition or to assess the association between a risk factor and a…