Blog

Large volumes of data are pouring in every day from scientific experiments such as those at CERN and from surveys such as the Sloan Digital Sky Survey. Data is coming in so fast that researchers struggle to keep pace with the analysis and are increasingly developing automated methods to aid in this herculean task. As a first step, it is now commonplace to perform dimension reduction, which reduces a large number of measurements to a set of key values that are easier to visualize and interpret.


In statistical modeling and analysis, constructing a confidence interval for a parameter of interest is an important inferential task because it quantifies the uncertainty around the parameter estimate. For instance, the true average lifetime of a cell phone can be a parameter of interest, and it is unknown to both manufacturers and consumers. A confidence interval for it can help manufacturers determine an appropriate warranty period and communicate the device's reliability and quality to consumers. Unfortunately, exact methods for building confidence intervals are often unavailable in practice, and approximate procedures are employed instead.
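As a rough illustration of one such approximate procedure, the sketch below builds a normal-approximation interval for a mean. It is not the method discussed in the post itself; the function name `approx_mean_ci` and the lifetime values are hypothetical, made up only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def approx_mean_ci(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of `sample`.

    This is an approximate procedure: it relies on the central limit
    theorem and can be unreliable for small or heavily skewed samples.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    center = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return center - half_width, center + half_width


# Hypothetical phone lifetimes in years (illustrative values, not real data).
lifetimes = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 3.7, 2.9]
low, high = approx_mean_ci(lifetimes)
print(f"approximate 95% interval: ({low:.2f}, {high:.2f})")
```

Raising `level` widens the interval, which reflects the usual trade-off between confidence and precision.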


Have you ever wondered what it’s like to run an insurance company? What role does statistics play in insurance company operations and how can its use be profitable? In this article, we’re going to explore property insurance and a very recent improvement in statistical modeling in this area.


Can you imagine lying really, really still for at least 15 minutes? That is the reality for patients who need to complete a magnetic resonance imaging (MRI) scan. Even if you could keep still for that long, a scan can take anywhere from 15 to 90 minutes! Patients need to lie as still as possible so that the MRI machine can capture the images used to detect and diagnose diseases. Even the tiniest patient movement can distort the final image.


How many species in our ecosystem have not been discovered? How many words did William Shakespeare know but not include in his written works? The unseen species problem has applications in both the sciences and the humanities, and it has been studied since the 1940s. This classical problem was recently generalized to the unseen features problem. In genomic applications, a feature is a genetic variant relative to a reference genome, and the scientific goal is to estimate the number of new genetic variants that would be observed if we were to collect more samples.


If a solid object floats in water in every position, is it necessarily a sphere? In a paper published this year in the Annals of Mathematics, Dmitry Ryabogin proves the answer is “no”. 


Some of the hardest questions to answer in math are the simplest to state. For example, “when does a sequence of numbers $a_1, a_2, a_3, \ldots$ have the property that $a_{i}^2 \geq a_{i-1}a_{i+1}$?” A sequence having this property is called “log-concave”. To get familiar with log-concavity, let’s consider the most famous log-concave sequence: the sequence found by specifying a row of Pascal’s triangle.
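The definition can be checked directly on a row of Pascal’s triangle. The following is a minimal sketch; the helper names `pascal_row` and `is_log_concave` are illustrative and not taken from the post.

```python
from math import comb


def pascal_row(n):
    """Row n of Pascal's triangle: C(n, 0), C(n, 1), ..., C(n, n)."""
    return [comb(n, k) for k in range(n + 1)]


def is_log_concave(seq):
    """True if a_i^2 >= a_{i-1} * a_{i+1} holds at every interior index."""
    return all(
        seq[i] ** 2 >= seq[i - 1] * seq[i + 1]
        for i in range(1, len(seq) - 1)
    )


row = pascal_row(6)
print(row)                           # [1, 6, 15, 20, 15, 6, 1]
print(is_log_concave(row))           # True
print(is_log_concave([1, 2, 5, 3]))  # False, since 2^2 = 4 < 1 * 5
```

For row 6 the interior checks are $6^2 \geq 1 \cdot 15$, $15^2 \geq 6 \cdot 20$, and $20^2 \geq 15 \cdot 15$, and the remaining checks follow by symmetry.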


In responding to a pandemic, time is of the essence. As the COVID-19 pandemic has raged on, it has become evident that complex decisions must be made as quickly as possible, and quality data and statistics are necessary to drive the solutions that can prevent mass illness and death. Therefore, it is essential to outline a robust and generalizable statistical process that can not only help to diminish the current COVID-19 pandemic but also assist in the prevention of potential future pandemics. 


Companies often want to test the impact of one design decision over another. For example, Google might want to compare the current ranking of search results (version A) with an alternative ranking (version B) and evaluate how the modification would affect users’ decisions and click behavior. An experiment to determine this impact on users is known as an A/B test, and many methods have been designed to measure the ‘treatment’ effect of the proposed change. However, these classical methods typically assume that changing one person’s treatment will not affect others (known as the Stable Unit Treatment Value Assumption, or SUTVA). In the Google example, this is typically a valid assumption: showing one user different search results shouldn’t impact another user’s click behavior. But in some situations, SUTVA is violated, and new methods must be introduced to properly measure the effect of design changes.
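One of the simplest classical estimators is the difference in mean outcomes between the two groups, which is only meaningful when SUTVA holds. The sketch below simulates such a test; the click rates of 0.10 and 0.12, the sample size, and the function name `difference_in_means` are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def difference_in_means(outcomes, treated):
    """Mean outcome under version B minus mean outcome under version A.

    Assumes SUTVA: each user's outcome depends only on that user's
    own assignment, not on anyone else's.
    """
    b_group = [y for y, z in zip(outcomes, treated) if z]
    a_group = [y for y, z in zip(outcomes, treated) if not z]
    return mean(b_group) - mean(a_group)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n_users = 10_000

# Randomly assign each user to version B (True) or version A (False).
treated = [rng.random() < 0.5 for _ in range(n_users)]

# Hypothetical click probabilities: 0.10 under A, 0.12 under B.
outcomes = [int(rng.random() < (0.12 if z else 0.10)) for z in treated]

print(f"estimated effect on click rate: {difference_in_means(outcomes, treated):+.4f}")
```

If users influenced one another, for instance through shared recommendations, the simulated outcomes would need to depend on other users' assignments, and this estimator would no longer target the effect of interest.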

