For centuries, hypothesis testing has been one of the fundamental inferential concepts in statistics, guiding the scientific community and confirming (or challenging) prior beliefs. The p-value has been a famous and near-universal metric for rejecting (or failing to reject) a null hypothesis H0, which essentially encodes a commonly held belief formed before any experimental data are collected.
It was never meant to last, you know. Statistical measures have their heydays; permanent relevance is no guarantee. The p-value was, and still is, a tool like no other. Through the years it has been caressed and condemned, worshipped and feared, praised and slandered, all the while standing at the crossroads of almost every hypothesis test, model, and prediction. Operationally, a p-value is convenient: we reject, almost mechanically, our null assumption if this value falls below a discipline-specific threshold such as 0.05 or 0.01. Still, its cumbersome construction, which invites tricky interpretations and stunning misuses, frequently lands it on the wrong side of both practitioners and statistical purists. Bodies such as the American Statistical Association have repeatedly urged caution around its use (https://doi.org/10.1080/00031305.2016.1154108). Experts have been hearing its death rattle for quite a while. The article “E-values: calibration, combination, and applications” by V. Vovk and R. Wang could be the final twist of the knife. Here, the authors offer a promising alternative, the e-value, which can coexist with, and at times replace, its troubled predecessor.
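To make the mechanical decision rule above concrete, here is a minimal sketch of the reject-if-p-below-threshold routine. It uses a hypothetical two-sided z-test with a known population standard deviation (a textbook example, not a construction from the Vovk–Wang article); the sample numbers are invented for illustration.

```python
import math

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a z-test of H0: population mean == mu0,
    assuming a known population standard deviation sigma and sample size n."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z, written via the
    # complementary error function: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05  # a common discipline-specific threshold
p = z_test_p_value(sample_mean=10.4, mu0=10.0, sigma=1.0, n=25)
if p < alpha:
    print(f"p = {p:.4f} < {alpha}: reject H0")
else:
    print(f"p = {p:.4f} >= {alpha}: fail to reject H0")
```

With these numbers the z-statistic is 2.0 and the p-value is about 0.0455, so the rule rejects at the 0.05 level but would not at the stricter 0.01 level, illustrating how mechanically the verdict depends on the chosen threshold.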