2025 - The Stata Blog

A new update to StataNow has just been released

13 August 2025 Kristin MacDonald, Executive Director, Statistical Services

A new update to StataNow has just been released. With new statistical features and interface improvements, there is something for everyone. We are excited to share the new features with you. Read more…

Categories: New Products, Stata Products Tags: frames, interface, LATE, robust standard errors

Looking ahead to the 2025 Stata Conference: A celebration of data, discovery, and 40 years of Stata

28 July 2025 Ashley Schnell, Director, Product Marketing

Excitement is building for the 2025 Stata Conference, where researchers, analysts, and data scientists from around the world will come together to share ideas, showcase their work, and explore the frontiers of statistical analysis using Stata.

Set to take place in Nashville, TN, on 31 July–1 August, this year’s conference will once again highlight the creativity and rigor that define the Stata user community. From clever programming tips to pioneering research, the event promises a wide range of presentations that reflect the diversity of disciplines and applications where Stata is making an impact. Read more…

Categories: Meetings Tags: conference, meetings, Nashville, users, users group

Prediction intervals with gradient boosting machine

20 May 2025 Aramayis Dallakyan, Senior Statistician and Software Developer

Introduction
Machine learning methods, such as ensemble decision trees, are widely used to predict outcomes based on data. However, these methods often provide only point predictions, which limits their ability to quantify prediction uncertainty. In many applications, such as healthcare and finance, the goal is not only to predict accurately but also to assess the reliability of those predictions. Prediction intervals, which provide lower and upper bounds such that the true response lies within them with high probability, are a reliable tool for quantifying that uncertainty. An ideal prediction interval should meet several criteria: it should offer valid coverage (defined below) without relying on strong distributional assumptions, be informative by being as narrow as possible for each observation, and be adaptive, providing wider intervals for observations that are “difficult” to predict and narrower intervals for “easy” ones. Read more…
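As a rough illustration of the coverage criterion, here is a minimal Python sketch of split conformal prediction, one distribution-free way to build intervals around any point predictor. It is not taken from the post: the simulated data, the `predict` stand-in for a fitted model, and the function names are all assumptions made for illustration. The sketch produces constant-width intervals, so it demonstrates coverage only, not the adaptivity described above.

```python
import math
import random


def conformal_half_width(residuals, alpha):
    """Half-width q so that [prediction - q, prediction + q] covers a new
    response with probability >= 1 - alpha, assuming exchangeable data."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else scores[k - 1]


def predict(x):
    # Hypothetical stand-in for a fitted model such as a GBM.
    return 2.0 * x


def simulate(n, rng):
    # Simulated data (an assumption for illustration): y = 2x + noise.
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


rng = random.Random(0)
calibration = simulate(500, rng)  # held out from model fitting
test = simulate(2000, rng)

q = conformal_half_width([y - predict(x) for x, y in calibration], alpha=0.1)
covered = sum(abs(y - predict(x)) <= q for x, y in test)
coverage = covered / len(test)
print(f"half-width: {q:.3f}, empirical coverage: {coverage:.3f}")
```

With a 90% target, the empirical coverage on the test sample should land close to 0.9. Making the intervals adaptive typically means replacing the absolute residual with a score that varies with the predictors, for example one based on estimated conditional quantiles.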

Categories: Statistics Tags: conformal intervals, GBM, machine learning, prediction intervals, quantile regression

Approximate statistical tests for comparing binary classifier error rates using H2OML

22 April 2025 Houssein Assaad, Associate Director, Statistics and Aramayis Dallakyan, Senior Statistician and Software Developer

Motivation

You have just trained a gradient boosting machine (GBM) and a random forest (RF) classifier on your data using Stata’s new h2oml command suite. Your GBM model achieves 87% accuracy on the testing data, and your RF model, 85%. It looks as if GBM is the preferred classifier, right? Not so fast.

Why accuracy alone isn’t enough

Accuracy, area under the curve, and root mean squared error are popular metrics, but they provide only point estimates. These numbers reflect how well a model performed on one specific testing sample, but they don’t account for the variability that can arise from sample to sample. In other words, they don’t answer this key question: Will the difference in performance between these methods hold at the population level, or could it have occurred by chance only in this particular testing dataset? Read more…
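To make the question concrete, here is a small Python sketch using McNemar's test, a commonly used approximate test for comparing two classifiers evaluated on the same testing data. Whether the post uses this particular test is an assumption, and the testing-sample size of 1,000 and the error counts below are hypothetical numbers chosen only to match the 87% and 85% accuracies in the example.

```python
import math


def accuracy_std_error(accuracy, n):
    """Binomial standard error of an accuracy estimated on n test cases."""
    return math.sqrt(accuracy * (1 - accuracy) / n)


def mcnemar_test(only_a_wrong, only_b_wrong):
    """Continuity-corrected McNemar statistic and its chi-square(1) p-value."""
    discordant = only_a_wrong + only_b_wrong
    if discordant == 0:
        return 0.0, 1.0  # the classifiers never disagree
    stat = (abs(only_a_wrong - only_b_wrong) - 1) ** 2 / discordant
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


n = 1000  # hypothetical testing-sample size
print(f"SE of GBM accuracy: {accuracy_std_error(0.87, n):.4f}")
print(f"SE of RF accuracy:  {accuracy_std_error(0.85, n):.4f}")

# GBM makes 130 errors and RF makes 150 in both hypothetical scenarios;
# only the number of observations that both models get wrong differs.
# Scenario 1: 100 shared errors, so 30 GBM-only and 50 RF-only errors.
stat1, p1 = mcnemar_test(30, 50)
# Scenario 2: 60 shared errors, so 70 GBM-only and 90 RF-only errors.
stat2, p2 = mcnemar_test(70, 90)
print(f"scenario 1: statistic = {stat1:.3f}, p-value = {p1:.3f}")
print(f"scenario 2: statistic = {stat2:.3f}, p-value = {p2:.3f}")
```

Both scenarios show the same two-point accuracy gap, yet the test rejects equal error rates at the 5% level only in the first. The conclusion depends on how often the two models disagree, which the accuracy figures alone do not reveal.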

Categories: Statistics Tags: ensemble trees classification, H2O, machine learning, statistical tests

Stata 19 is released!

8 April 2025 Alan Riley, President

I am excited to let you be the second to know that Stata 19 is now available. Statalist is always the first to know!

Highlights include

And more. Visit stata.com/new-in-stata for all the details. You can also visit stata.com/help.cgi?whatsnew18to19 for the nitty-gritty on every single change from Stata 18 to Stata 19.

Those of you with StataNow already received some of these features along the way in updates to StataNow. You are also eligible for an automatic upgrade to StataNow 19. Watch your inbox for an email from us with instructions on how to request your upgrade.

Categories: New Products Tags: Bayesian, biostatistics, CATE, cox, CRE, data science, econometrics, H2O, HDFE, machine learning, meta-analysis, mundlak, new release

