Meteorologists today no longer ask, “Will it rain tomorrow?”, but rather, “What is the probability that it will rain tomorrow?”. In other words, weather forecasting has moved beyond simple point projections toward probabilistic predictions, where forecast uncertainty is quantified through quantiles or entire probability distributions. Probabilistic forecasting was also the subject of my previous blog post, which explored proper scoring rules, metrics that allow us to compare and rank these more complex distributional forecasts. In this blog post, we explore an even more basic consideration: how can one be sure their probabilistic forecasts make sense and actually align with the data that ended up being observed? This ‘alignment’ between forecasted probabilities and observations is referred to as probabilistic calibration. Put concretely, when a precipitation forecasting model gives an 80% chance of rain, one would expect to see rain in approximately 80% of those cases, if the model is calibrated.
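To make the rain example tangible, here is a minimal sketch of a calibration check on simulated forecasts. All data here are synthetic and generated so the forecasts are calibrated by construction; the idea is simply to bin the forecast probabilities and compare, within each bin, the average forecast probability to the observed relative frequency of rain:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 10,000 days, each with a forecast probability of
# rain and a binary outcome (True = it rained). The outcomes are
# simulated so that the forecasts are perfectly calibrated.
forecast_probs = rng.uniform(0, 1, size=10_000)
rained = rng.uniform(0, 1, size=10_000) < forecast_probs

# Bin the forecasts into ten probability bins and compare the mean
# forecast in each bin to the observed frequency of rain. For a
# calibrated model the two columns should roughly agree.
bins = np.linspace(0, 1, 11)
bin_ids = np.digitize(forecast_probs, bins[1:-1])
for b in range(10):
    mask = bin_ids == b
    print(f"mean forecast {forecast_probs[mask].mean():.2f}  "
          f"observed frequency {rained[mask].mean():.2f}")
```

Plotting observed frequency against mean forecast per bin gives the familiar reliability diagram: a calibrated model hugs the diagonal.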
Machine learning models are excellent at discovering patterns in data to make predictions. However, their insights are limited to the input data itself. What if we could supply additional knowledge about the model features to improve learning? For example, suppose we know in advance that certain features are more important than others for predicting the target variable. Researchers have developed a method called the feature-weighted elastic net (“fwelnet”) that incorporates this extra feature knowledge during training, yielding more accurate predictions than standard techniques.
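The core idea can be illustrated with a deliberately simplified sketch. This is not the fwelnet algorithm itself (which learns per-feature penalty factors from feature-level scores); as a crude stand-in, we encode hypothetical prior knowledge by rescaling the trusted features before fitting an ordinary elastic net, which effectively lowers their penalty. The data, the `prior_weight` vector, and the chosen hyperparameters are all illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(42)

# Synthetic data: only the first 3 of 20 features actually matter.
n, p = 200, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + rng.normal(scale=0.5, size=n)

# Hypothetical prior knowledge: features 0-2 are believed important.
# Scaling a feature up before fitting shrinks its effective penalty,
# mimicking (very roughly) fwelnet's feature-dependent penalties.
prior_weight = np.ones(p)
prior_weight[:3] = 3.0

model = ElasticNet(alpha=0.5, l1_ratio=0.5)
model.fit(X * prior_weight, y)
coef = model.coef_ * prior_weight  # map back to the original scale
```

With this weighting, the trusted features keep large coefficients while the irrelevant ones are shrunk toward zero; the actual fwelnet method achieves a similar effect in a principled way by optimizing the penalty factors jointly with the model.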