Automating web downloads and file unzipping

Andrew J. Dyck wrote a nice post on his blog on how to Download and unzip data files from Stata. He writes:

Recently, I’ve been using Stata’s -shp2dta- command to convert some shapefiles to stata format, grabbing Lat/Lon data and merging into another dataset. There were several compressed shapefiles I wanted to download contained in a directory from the web. I could manually download each file and uncompress each one but that would be time consuming. Also, when the maps are updated, I’d have to do the download/uncompress all over again. I’ve found that the process can be automated from within Stata by using a combination of -shell- and some handy terminal commands. …

You should read the rest of his post. He goes on to show how you can script with Stata to automate shelling out to download and unzip a series of files from a website, and he introduces you to some cool Unix-like utilities for Windows.

We here at StataCorp use Stata for tasks like this all the time. In fact, we have built some tools into Stata to allow you to do much of what Andrew described without ever having to leave or shell out of Stata.

For example, Stata can access files over the Internet. Stata has a copy command. And, as of Stata 11, Stata can directly zip and unzip files and directories.

Putting all of those capabilities to use, you can accomplish Andrew’s goal by writing code directly in Stata such as

copy http://example.com/download1.zip download1.zip
copy http://example.com/download2.zip download2.zip
unzipfile download1.zip
unzipfile download2.zip

If there were a large number of files you wished to download and unzip, and they were all named in a regular manner (say, “download1.zip” through “download100.zip”), you could bring them all down and unzip them directly in Stata with a four-line loop:

forvalues i = 1/100 {
    copy http://example.com/download`i'.zip download`i'.zip
    unzipfile download`i'.zip
}
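
If some of the files might be missing on the server, or you expect to rerun the script when the files are updated, a slightly more defensive version of the loop may be useful. The following is only a sketch under assumptions: the example.com URL and file names are the same placeholders used above, and the choice to skip a failed download and overwrite existing files is ours, not something the original task requires.

```stata
* Sketch: download and unzip download1.zip through download100.zip,
* skipping any file that cannot be downloaded.
* The URL and file names are placeholders.
forvalues i = 1/100 {
    local zip "download`i'.zip"

    * capture suppresses the error so one bad file does not stop the loop;
    * replace overwrites a copy left over from an earlier run
    capture copy "http://example.com/`zip'" "`zip'", replace
    if _rc {
        display as error "could not download `zip' (return code " _rc ")"
        continue
    }

    * replace overwrites previously extracted files of the same name
    unzipfile "`zip'", replace
}
```

Here capture stores the return code of copy in _rc, so the loop can report the problem and move on to the next file instead of stopping at the first failure.
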

Categories: Programming

Big computers

We here at Stata are often asked to make recommendations on the “best” computer on which to run Stata, and such discussions sometimes pop up on Statalist. Of course, there is no simple answer, as it depends on the analyses a given user wishes to run, the size of their datasets, and their budget. And, we do not recommend particular computer or operating system vendors. Many manufacturers use similar components in their computers, and the choice of operating system comes down to personal preference of the user. We take pride in making sure Stata works well regardless of operating system and hardware configuration.

For some users, the analyses they wish to run are demanding, the datasets they have are huge, and their budgets are large. For these users, it is useful to know what kind of off-the-shelf hardware they can easily get their hands on. To give you an idea of what is available, HP makes a server with up to 1 TB of memory. Yes, 1 terabyte! This computer can be configured and ordered online at hp.com.

It can have up to 4 processors, each with 8 cores, for a total of 32 cores of processing power. A sample rack-mount configuration with the fastest 8-core Intel Xeon processors available for this computer and a full 1 TB of memory totals roughly $100,000. We mention HP because they were one of the first to allow such large memory configurations without going to a much more expensive, completely custom-built solution. Wouldn’t you love to have one of these running Stata/MP (or Halo)?

You can run Windows or Linux on a computer like the above. If you prefer Mac OS X, the largest current configuration from Apple allows a total of 12 cores and 32 GB of memory. This is a tower case unit and costs around $10,000. Visit store.apple.com to configure such a computer.

The largest, fastest laptops easily purchased these days allow up to 4 cores and 16 GB of RAM. That much power in a small package is expensive, though: such a configuration costs over $7,000. Here is one such example you can configure from Dell: dell.com.

We’ll keep you updated periodically with the state of the high end of the computer market as memory capacities and number of cores increase.

Categories: Hardware