Moinak Bhaduri – MathStat Bites

March 18, 2025 by Moinak Bhaduri

Directional weirdness: when statistical depth runs out of depth

Article: Directional Outlyingness for Multivariate Functional Data
Authors and Year: Wenlin Dai & Marc G. Genton 2019
Journal: Computational Statistics & Data Analysis
Review Prepared by: Moinak Bhaduri, Department of Mathematical Sciences, Bentley University, Massachusetts

Outliers are individuals and entities to whom we have forever turned with awe and skepticism, with curiosity and suspicion, with expectation and anxiety. They do not fit the norm and are, oftentimes, for better or for worse, risky to ignore. Malcolm Gladwell, in his book Outliers: The Story of Success, samples our society, brings out such remarkable individuals, and examines their commonalities: what thread binds them, what makes them deviate from the crowd. And just as a common tendency is difficult to pinpoint while investigating these people – some were forced to the extremes by social pressure or hardships, while others were propelled by sheer curiosity – statistical data that are more complex than simple numbers may…

October 28, 2024 by Moinak Bhaduri

A promising way to disentangle time from space kicks off

Review Prepared by: Moinak Bhaduri, Mathematical Sciences, Bentley University, Massachusetts

Fine! I admit it! The title’s a bit click-baity. “Time” here need not be some immense galactic time. “Space” refers here not to the endless physical or literal space around you, but more to the types of certain events. But once you realize why the untangling was vital, how it is achieved in games such as soccer, and what forecasting benefits it can lead to, you’ll forgive me. You see, for far too long, whenever scientists had to model (meaning describe and, potentially, forecast) phenomena that had both a time and a value component, such as the timing of earthquakes and the magnitude of those shocks, or the times of gang violence and the casualties from those attacks, their default go-tos were typical spatio-temporal processes such as the marked Hawkes (described below). While with that reliance no fault may be found in…

football forecasting analysis modeling soccer space space-time spatio-temporal process statistics time

February 27, 2024 by Moinak Bhaduri

“Changes” in statistics, “changes” in computer science, changes in outlook

No matter how free interactions become, tribalism remains a basic trait. The impulse to form groups based on similarities of habits and ways of thinking, the tendency to congregate along disciplinary divides, never goes away fully, regardless of how progressive our outlook gets. While that tendency to form cults is not problematic in itself (there is even something called community detection in network science that exploits – and exploits to great effect – this tendency), when it morphs into animosity, into tensions, things get especially tragic. The issue that needs to be solved gets bypassed; instead, the noise around these silly fights comes to the fore. For example, the main task at hand could be designing a drug that is effective against a disease, but the trouble may lie in the choice of the benchmark against which this fresh drug must be pitted. In popular media, that benchmark may be the placebo – an inconsequential sugar pill – while in more objective science it could be the drug that is currently in use. There are instances everywhere of how scientists and journalists get in each other’s way (Ben Goldacre’s book Bad Science imparts crucial insights) or how, even among scientists, factionalism persists: how statisticians – even to this day – prefer to be classed as frequentists or Bayesians, or how, even among Bayesians, it matters whether someone is an empirical Bayesian or not. The sad chain never stops. You may have thought of this tendency and its result: how it is a promise betrayed, a collaboration throttled in the moment of blossoming. While the core cause behind that scant tolerance, behind that clinging on, may be a deep passion for what one does, the problem at hand pays little regard to that dedication. The problem’s outlook stays ultimately pragmatic: it just needs solving. By whatever tools. From whatever fields.
Alarmingly, the segregations or subdivisions we sampled above and the differences they lead to – convenient though they may be – do not always remain academic: distant to the point of staying irrelevant. At times, they deliver chills much closer to the bone: whether a pure or an applied mathematician will get hired or promoted, how getting published in computer science journals should be – according to many – more frequent than in mainstream statistics journals, etc.

change point Cumulative Sum cusum neural network statistics

September 5, 2023 by Moinak Bhaduri

Conform – or else! Conformal scores as tools to lay out a set of likely classification labels

The setting repeats depressingly often. A hurricane inching towards the Florida coast. Weather scientists glued to tracking monitors, hunched over simulation printouts, trying to move people out of harm’s way. Urgent to them is the need to mark a patch of the shore where the hurricane is likely to hit. Those living in this patch need to be relocated. These scientists, and many before them – it’s hard to say since when – realized that what’s at issue here is not so much the precise location the storm is going to hit – precise to the exact grain of sand – but a stretch of land (whose length may shrink gradually depending on how late we leave the forecasting) where it is going to affect people with a high chance. A forecast interval of sorts.

classification conformal predictions coverage prediction sets

April 18, 2023 by Moinak Bhaduri

On swallowing shrewd marketing baits: A silent salute to demand evolution

Enticements can never exert a pull on John. Probably the product of a disciplined upbringing. When John wants to buy something, he knows exactly what he’s looking for. He gets in and he gets out. No dilly-dallying, no pointless scrolling. Few of us are like John; the rest secretly aspire to be. Go on. Admit it! The science of enticing customers is sustained by this weakness.

complementarity confidence data science demand economics lifts market basket analysis recommendation systems statistics utility

January 24, 2023 by Moinak Bhaduri

Explainable groupings in the face of noisy, high-dimensional madness: Wild ambitions tamed through features’ salience

Whatever your exact interests in data, other related responsibilities frequently stand inseparable from model-building. Sample two crucial ones:

a. the checking of how well your model did: the less frequently you make big, bad decisions – like predicting someone’s salary to be $95,000, an estimate far adrift from the real, say, $70,000 in case it’s a regression problem, or saying a customer will buy a product when, in fact, she won’t, under a classification environment – the happier you are. These accuracies are, unsurprisingly, often used to guide the model-building process.

b. the explaining of how you arrived at a prediction: this involves unpacking or interpreting the $95,000. The person, due to his experience, makes $10,000 more than the average; due to his education, makes $20,000 more; but due to his state of residence, makes $5,000 less than the average, etc. These ups and downs contribute to a net final value.
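
The breakdown in item b can be checked with a few lines of arithmetic. This is only a sketch: the baseline of $70,000 is an assumption (the excerpt never states the average salary), chosen so that the three quoted contributions add up to the $95,000 prediction, and the "etc." in the text suggests a real explanation would carry more terms.

```python
# Hypothetical additive breakdown of one salary prediction.
# ASSUMPTION: the baseline (average) salary is not given in the text;
# 70_000 is picked so that the quoted ups and downs sum to 95_000.
baseline = 70_000

# Contributions quoted in the text, relative to the average.
contributions = {
    "experience": +10_000,
    "education": +20_000,
    "state of residence": -5_000,
}

# The net final value is the baseline plus all the ups and downs.
prediction = baseline + sum(contributions.values())

for feature, amount in contributions.items():
    print(f"{feature:>20}: {amount:+,}")
print(f"{'prediction':>20}: {prediction:,}")
```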

contrasts feature salience high-dimensional clustering interpretable machine learning predominance recall unsupervised clustering

October 4, 2022 by Moinak Bhaduri

E-values in statistics: apt additions or instruments of generational revolt?

It was never meant to last, you know. Statistical measures have their heydays; permanent relevance is no guarantee. The p-value was – and still is – a tool like no other. Through the years it has been caressed and condemned, worshipped and feared, praised and slandered – all the while standing at the crossroads of almost every hypothesis testing, modeling, and prediction exercise. Operationally, a p-value is convenient: we reject, almost mechanically, our null assumption if this value falls below certain discipline-specific thresholds like 0.01, 0.05, etc. Still, its cumbersome construction, triggering its tricky interpretation and stunning misuses, frequently lands it on the wrong side of both practitioners and stats purists. Bodies such as the American Statistical Association routinely issue caution around its use (https://doi.org/10.1080/00031305.2016.1154108). Experts have been hearing its death rattle for quite a while. The article “E-values: calibration, combination, and applications” by V. Vovk and R. Wang could be the final twist of the knife. Here, the authors offer a promising alternative – the e-value – which can coexist with – and, at times, replace – its troubled ancestor.
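
To make the mechanical rejection rule concrete, here is a minimal sketch that sets it beside its e-value counterpart. None of the code comes from the excerpt: the function names are hypothetical, the e-value threshold of 1/alpha is the standard one that follows from Markov's inequality, and the calibrator kappa * p^(kappa - 1) is one commonly used family for turning a p-value into an e-value, with kappa = 0.5 picked arbitrarily.

```python
def reject_with_p_value(p, alpha=0.05):
    """Reject the null when the p-value is at or below the threshold alpha."""
    return p <= alpha


def reject_with_e_value(e, alpha=0.05):
    """Reject the null when the e-value is at least 1/alpha.

    An e-value has expectation at most 1 under the null, so Markov's
    inequality bounds the chance of seeing e >= 1/alpha by alpha.
    """
    return e >= 1.0 / alpha


def calibrate_p_to_e(p, kappa=0.5):
    """Turn a p-value into an e-value with the calibrator kappa * p**(kappa - 1).

    ASSUMPTION: kappa = 0.5 is an arbitrary illustrative choice in (0, 1).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie in (0, 1)")
    return kappa * p ** (kappa - 1.0)


p = 0.01
e = calibrate_p_to_e(p)
print("p-value rule rejects:", reject_with_p_value(p))
print("calibrated e-value:", round(e, 6))
print("e-value rule rejects:", reject_with_e_value(e))
```

With these illustrative numbers, a p-value of 0.01 clears the 0.05 threshold, while its calibrated e-value of roughly 5 falls well short of 20: going through a calibrator is a conservative route, which is one reason e-values are usually built directly rather than converted from p-values.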

Calibration combination e-value hypothesis testing p-value statistics
