For centuries, hypothesis testing has been one of the fundamental inferential concepts in statistics, used to guide the scientific community and to confirm or refute one’s beliefs. The p-value has been a famous and universal metric for deciding whether to reject a null hypothesis H0, which essentially denotes a common belief held even in the absence of experimental data.
As the fields of statistics and data science have grown, the importance of reproducibility in research and easing the “replication crisis” has become increasingly apparent. The inability to reproduce scientific results when using the same data and code may lead to a lack of confidence in the validity of research and can make it difficult to build on and advance scientific knowledge.
Pinpointing Causality across Time and Geography: Uncovering the Relationship between Airstrikes and Insurgent Violence in Iraq
“Correlation is not causation”, as the saying goes, yet sometimes it can be, if certain assumptions are met. Describing those assumptions and developing methods to estimate causal effects, not just correlations, is the central concern of the causal inference field. Broadly speaking, causal inference seeks to measure the effect of a treatment on an outcome. This treatment can be an actual medicine or something more abstract like a policy. Much of the literature in this space focuses on relatively simple treatments and outcomes and uses data that doesn’t exhibit much dependency. As an example, clinicians often want to measure the effect of a binary treatment (received the drug or not) on a binary outcome (developed the disease or not). The data used to answer such questions is typically patient-level data where the patients are assumed to be independent of each other. To be clear, these simple setups are enormously useful and describe commonplace causal questions.
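The simple setup described above (binary treatment, binary outcome, independent patients) can be sketched in a few lines. This is only an illustration: the function name `risk_difference` and the patient records are hypothetical, and the difference in disease rates reads as a causal effect only under the assumptions the field spells out, for example that treatment was randomized and patients do not influence one another.

```python
def risk_difference(records):
    """Difference in outcome rates between treated and control patients.

    records: iterable of (treated, outcome) pairs, each value 0 or 1.
    """
    treated = [y for t, y in records if t == 1]
    control = [y for t, y in records if t == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control patient")
    risk_treated = sum(treated) / len(treated)
    risk_control = sum(control) / len(control)
    return risk_treated - risk_control


# Hypothetical data: 3 of 10 treated patients develop the disease,
# compared with 6 of 10 untreated patients.
patients = [(1, 1)] * 3 + [(1, 0)] * 7 + [(0, 1)] * 6 + [(0, 0)] * 4
effect = risk_difference(patients)
print(f"estimated risk difference: {effect:.2f}")
```

A negative value means the disease was less common among treated patients in this made-up sample.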
Large volumes of data are pouring in every day from scientific projects such as the experiments at CERN and the Sloan Digital Sky Survey. Data is coming in so fast that researchers struggle to keep pace with the analysis and are increasingly developing automated analysis methods to aid in this herculean task. As a first step, it is now commonplace to perform dimension reduction in order to reduce a large number of measurements to a set of key values that are easier to visualize and interpret.
Have you ever wondered what it’s like to run an insurance company? What role does statistics play in insurance company operations and how can its use be profitable? In this article, we’re going to explore property insurance and a very recent improvement in statistical modeling in this area.
In responding to a pandemic, time is of the essence. As the COVID-19 pandemic has raged on, it has become evident that complex decisions must be made as quickly as possible, and quality data and statistics are necessary to drive the solutions that can prevent mass illness and death. Therefore, it is essential to outline a robust and generalizable statistical process that can not only help to diminish the current COVID-19 pandemic but also assist in the prevention of potential future pandemics.
Companies often want to test the impact of one design decision over another. For example, Google might want to compare the current ranking of search results (version A) with an alternative ranking (version B) and evaluate how the modification would affect users’ decisions and click behavior. An experiment to determine this impact on users is known as an A/B test, and many methods have been designed to measure the ‘treatment’ effect of the proposed change. However, these classical methods typically assume that changing one person’s treatment will not affect others (known as the Stable Unit Treatment Value Assumption, or SUTVA). In the Google example, this is typically a valid assumption: showing one user different search results shouldn’t impact another user’s click behavior. But in some situations, SUTVA is violated, and new methods must be introduced to properly measure the effect of design changes.
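As a point of reference, one classical analysis of an A/B test is a two-proportion z-test on click-through rates. The sketch below is a minimal version under the assumption that SUTVA holds, so each user's click is treated as independent of everyone else's assignment. The function name and the click counts are made up for illustration, and this is the textbook method, not the newer methods needed when SUTVA fails.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Compare click rates of versions A and B (assumes independent users)."""
    rate_a = clicks_a / n_a
    rate_b = clicks_b / n_b
    # Pooled click rate under the null hypothesis of no difference.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical experiment: 1,000 users per version.
effect, z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"estimated lift: {effect:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

If users can influence one another, for instance through a social network, the independence built into the standard error above no longer holds, which is exactly the situation where this calculation can mislead.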
It was never meant to last, you know. Statistical measures have their heydays; permanent relevance is no guarantee. The p-value was – and still is – a tool like no other. Through the years it has been caressed and condemned, worshipped and feared, praised and slandered – all the while standing at the crossroads of almost every hypothesis-testing, modeling, and prediction problem. Operationally, a p-value is convenient: we reject, almost mechanically, our null assumption if this value falls below a discipline-specific threshold such as 0.01 or 0.05. Still, its cumbersome construction, which leads to tricky interpretation and stunning misuses, frequently lands it on the wrong side of both practitioners and stats purists. Bodies such as the American Statistical Association have issued cautions around its use (https://doi.org/10.1080/00031305.2016.1154108). Experts have been hearing its death rattle for quite a while. The article “E-values: calibration, combination, and applications” by V. Vovk and R. Wang could be the final twist of the knife. Here, the authors offer a promising alternative – the e-value – which can coexist with – and, at times, replace – its troubled ancestor.
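To make the contrast concrete, here is a minimal sketch of one standard way to build an e-value: a likelihood ratio. An e-value is a nonnegative statistic whose expected value under the null is at most 1, so by Markov's inequality a value of at least 1/α occurs with probability at most α under the null. The coin-flip setting, the alternative of 0.7, and the function names are assumptions chosen for illustration; they are not taken from the article.

```python
import math


def likelihood_ratio_e_value(flips, p_null=0.5, p_alt=0.7):
    """E-value for H0: P(heads) = p_null, as a likelihood ratio against p_alt.

    flips: sequence of 0/1 outcomes (1 = heads). p_alt is an illustrative choice.
    """
    log_e = 0.0
    for x in flips:
        if x == 1:
            log_e += math.log(p_alt / p_null)
        else:
            log_e += math.log((1 - p_alt) / (1 - p_null))
    return math.exp(log_e)


def reject(e_value, alpha=0.05):
    """Reject H0 when the e-value reaches 1/alpha (justified by Markov's inequality)."""
    return e_value >= 1.0 / alpha


# Hypothetical data: 16 heads in 20 flips.
flips = [1] * 16 + [0] * 4
e = likelihood_ratio_e_value(flips)
print(f"e-value: {e:.1f}, reject at alpha = 0.05: {reject(e)}")
```

Large e-values count as evidence against the null, the reverse of p-values, and e-values from independent experiments can be combined simply by multiplying them.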
I recently got back from the Joint Statistical Meetings in Washington, D.C. where I talked about making audiences concrete and motivating authentic arguments for statistics students (and spread the word about MathStatBites of course). This is a big conference where statisticians from all over the world get together to talk shop, and it was back in person after a few years of going virtual.