April 2011 - The Stata Blog

Merging data, part 1: Merges gone bad

18 April 2011 William Gould, President Emeritus 6 comments

Merging concerns combining datasets on the same observations to produce a result with more variables. We will call the datasets one.dta and two.dta.
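To fix ideas, here is a minimal sketch of a one-to-one merge, written in Python rather than Stata. The lists `one` and `two` are hypothetical stand-ins for one.dta and two.dta, and the key variable `id` and the helper `merge_1to1` are illustrative assumptions. The sketch assumes both datasets contain exactly the same observations.

```python
# Hypothetical stand-ins for one.dta and two.dta: the same observations
# (identified by id), but different variables in each dataset.
one = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
two = [{"id": 1, "income": 42000}, {"id": 2, "income": 38500}]


def merge_1to1(master, using, key):
    """Combine two datasets observation by observation on `key`.

    Assumes `key` uniquely identifies rows in the using data and that every
    master observation has a match; anything else raises an error.
    """
    using_by_key = {row[key]: row for row in using}
    if len(using_by_key) != len(using):
        raise ValueError("key does not uniquely identify rows in the using data")
    merged = []
    for row in master:
        match = using_by_key.get(row[key])
        if match is None:
            raise KeyError(f"no match in the using data for {key}={row[key]!r}")
        # Same observation, now carrying the variables from both datasets.
        merged.append({**row, **match})
    return merged


merged = merge_1to1(one, two, "id")
```

The result has the same two observations as before, each now carrying `age` and `income`. In Stata, the corresponding operation would be along the lines of `merge 1:1 id using two`.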

When it comes to combining datasets, the alternative to merging is appending, which is combining datasets on the same variables to produce a result with more observations. Appending datasets is not the subject for today. But just to fix ideas, appending looks like this: Read more…

Categories: Data Management Tags: append, merge

Multiprocessor (core) software (think Stata/MP) and percent parallelization

7 April 2011 William Gould, President Emeritus 7 comments

When most people first think about software designed to run on multiple cores such as Stata/MP, they think to themselves, two cores, twice as fast; four cores, four times as fast. They appreciate that reality will somehow intrude so that two cores won’t really be twice as fast as one, but they imagine the intrusion is something like friction and nothing that an intelligently placed drop of oil can’t improve.

In fact, something inherent intrudes. In any process to accomplish something, even a physical process, some parts may be able to be performed in parallel, but there are invariably parts that just have to be performed one after the other. Anyone who cooks knows that you sometimes add some ingredients, cook a bit, and then add others, and cook some more. So it is, too, with calculating xₜ = f(xₜ₋₁) for t = 1 to 100 with x₀ = 1. Depending on the form of f(), sometimes there’s no alternative to calculating x₁ = f(x₀), then calculating x₂ = f(x₁), and so on. Read more…
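This limit is usually quantified with Amdahl's Law. As a rough sketch, assume a fraction p of the work parallelizes perfectly across n cores while the rest must run serially. The run time relative to one core is then (1 - p) + p/n, and the speedup is its reciprocal. The function name `amdahl_speedup` and the 50% figure below are illustrative assumptions, not measurements of Stata/MP.

```python
def amdahl_speedup(p, n):
    """Speedup on n cores when a fraction p of the work is parallelizable.

    Assumes the parallel part scales perfectly and the serial part, 1 - p,
    does not speed up at all.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


# Illustrative case: half of the work can be done in parallel.
for cores in (1, 2, 4, 8):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
```

With half the work serial, adding cores helps less and less, and the speedup stays below 2 no matter how many cores are used.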

Categories: Multiprocessing Tags: Amdahl's Law, multiprocessing, parallel, performance, Stata/MP
