Archive for 2020

Stata support for Apple Silicon

Apple recently announced that it will be transitioning from Intel processors to its own ARM architecture processors currently being called Apple Silicon. Stata has a long history of supporting Macs, which includes the transitions from Motorola to PowerPC processors, from MacOS Classic to MacOS X, and from PowerPC to Intel processors. We will be working to support the new Macs as they transition from Intel processors to Apple Silicon and will continue our support of Macs with Intel processors as well.

Read more…

Just released from Stata Press: Data Management Using Stata: A Practical Handbook, Second Edition

Stata Press is pleased to announce the release of Data Management Using Stata: A Practical Handbook, Second Edition by Michael N. Mitchell.

Whether you are a new user needing to import, clean, and prepare data for your first analysis in Stata or you are an experienced user hoping to learn new tricks for the most challenging tasks, this book is for you. You can jump straight to the section of the book that discusses the particular challenge you are facing. There you will find a clear explanation of how to approach the problem and illustrative examples to guide you. Read more…

Revealed preference: Stata for reproducible research

I care about reproducible research. Anyone who has ever been a research assistant or tried to follow the path set by other researchers also cares. Sometimes, reproducing others’ results is a frustrating task; sometimes, it is outright impossible. Yet sometimes, it is satisfyingly simple. In my experience, reproducing results is easy when it involves a Stata do-file. I believe this is true even beyond my personal bias (I work for Stata and used the software regularly before that). A recent article published by the American Economic Association (AEA), Vilhuber, Turitto, and Welch (2020), shows that Stata is the preferred package among economists, and I believe reproducibility is a big reason why. Read more…

How to create animated choropleth maps using the COVID-19 data from Johns Hopkins University

In my previous posts, I showed how to download the COVID-19 data from the Johns Hopkins GitHub repository, graph the data over time, and create choropleth maps. Now, I’m going to show you how to create animated choropleth maps to explore the distribution of COVID-19 over time and place.

The video below shows the cumulative number of COVID-19 cases per 100,000 population for each county in the United States from January 22, 2020, through April 5, 2020. The map doesn’t change much until mid-March, when the virus starts to spread faster. Then, we can see when and where people are being infected. You can click on the “Play” icon on the video to play it and click on the icon on the bottom right to view the video in full-screen mode.

Read more…

How to create choropleth maps using the COVID-19 data from Johns Hopkins University

In my last post, we learned how to import the raw COVID-19 data from the Johns Hopkins GitHub repository and convert the raw data to time-series data. This post will demonstrate how to download raw data and create choropleth maps like figure 1.

Figure 1: Confirmed COVID-19 cases in the United States, adjusted for population size

Read more…

COVID-19 time-series data from Johns Hopkins University

In my last post, we learned how to import the raw COVID-19 data from the Johns Hopkins GitHub repository. This post will demonstrate how to convert the raw data to time-series data. We’ll also create some tables and graphs along the way. Read more…

Update to Import COVID-19 post

In my last post, I mentioned that I did not want to distribute my covid19.ado file because “it could be rendered useless if or when Johns Hopkins changes its data”. I wrote that on March 19, 2020, and the data changed on March 23, 2020. This will likely happen again (and again, and again …). I may post updates in the future as the data change, but you may need to adapt sooner than I can post. So let’s see how we can update our code to adapt to the changing data. Read more…

Import COVID-19 data from Johns Hopkins University

Like many of you, I am working from home and checking the latest news on COVID-19 frequently. I see a lot of numbers and graphs, so I looked around for the “official data”. One of the best data sources I have found is at the GitHub website for Johns Hopkins Whiting School of Engineering Center for Systems Science and Engineering. The data for each day are stored in a separate file, so I wrote a little Stata command called covid19 to download, combine, save, and graph these data. Read more…

Just released from Stata Press: Introduction to Time Series Using Stata, Revised Edition

Stata Press is pleased to announce the release of Introduction to Time Series Using Stata, Revised Edition, by Sean Becketti. This edition has been updated for Stata 16 and is available in paperback, eBook, and Kindle format. In this book, Becketti introduces time-series techniques—from simple to complex—and explains how to implement them using Stata. The many worked examples, concise explanations that focus on intuition, and useful tips based on the author’s experience make the book insightful for students, academic researchers, and practitioners in industry and government. Read more…

Bayesian inference using multiple Markov chains

Overview

Markov chain Monte Carlo (MCMC) is the principal tool for performing Bayesian inference. MCMC is a stochastic procedure that uses Markov chains simulated from the posterior distribution of model parameters to compute posterior summaries and make predictions. Because MCMC is stochastic and depends on initial values, verifying Markov chain convergence can be difficult; visual inspection of trace and autocorrelation plots is often used. A more formal method for checking convergence relies on simulating and comparing results from multiple Markov chains; see, for example, Gelman and Rubin (1992) and Gelman et al. (2013). Using multiple chains, rather than a single chain, makes diagnosing convergence easier.
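To make the idea of comparing chains concrete, here is a minimal sketch, written in Python rather than Stata, of the textbook Gelman–Rubin potential scale reduction factor for a single parameter. The function name gelman_rubin and the toy draws are assumptions made for illustration only. This is not Stata's code, and Stata's own diagnostic may apply corrections that this sketch omits.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Textbook potential scale reduction factor for one parameter.

    `chains` is a list of equal-length lists of posterior draws,
    one inner list per Markov chain. (Hypothetical helper, not a Stata API.)
    """
    m = len(chains)       # number of chains
    n = len(chains[0])    # draws per chain
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length (>= 2 draws)")

    # W: average of the within-chain sample variances
    w = mean([variance(c) for c in chains])
    if w == 0:
        raise ValueError("within-chain variance is zero; statistic is undefined")

    # B: between-chain variance of the chain means, scaled by n
    b = n * variance([mean(c) for c in chains])

    # Pooled estimate of the posterior variance
    var_plus = (n - 1) / n * w + b / n

    return (var_plus / w) ** 0.5


if __name__ == "__main__":
    # Toy draws (made up): two chains that overlap, and two that do not.
    overlapping = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
    separated = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
    print("overlapping chains:", gelman_rubin(overlapping))
    print("separated chains:  ", gelman_rubin(separated))
```

Values near 1 suggest the chains are sampling the same distribution; values well above 1 mean the chains disagree by more than their internal variability explains, which is the signal to run longer or revisit the model.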

As of Stata 16, bayesmh and its bayes prefix commands support a new option, nchains(), for simulating multiple Markov chains. There is also a new convergence diagnostic command, bayesstats grubin. All Bayesian postestimation commands now support multiple chains. In this blog post, I show you how to check MCMC convergence and improve your Bayesian inference using multiple chains through a series of examples. I also show you how to speed up your sampling by running multiple Markov chains in parallel. Read more…