Archive for the ‘Resources’ Category

Update on the Stata YouTube Channel

What is it about round numbers that compels us to pause and reflect? We celebrate 20-year school reunions, 25-year wedding anniversaries, 50th birthdays and other similar milestones. I don’t know the answer, but the Stata YouTube Channel recently passed several milestones: more than 1,500 subscribers, more than 50,000 video views, and six months since its launch. We felt the need for a small celebration to mark the occasion, and I thought that I would give you a brief update.

I could tell you about re-recording the original 24 videos with a larger font to make them easier to read. I could tell you about the hardware and software that we use to record them including our experiments with various condenser and dynamic microphones. I could share quotes from some of the nice messages we’ve received. But I think it would be more fun to talk about….you!

YouTube collects data about the number of views each video receives as well as summary data about who, what, when, where, and how you are watching them. There is no need to be concerned about your privacy; there are no personal identifiers of any kind associated with these data. But the summary data are interesting, and I thought it might be fun to share some of the data with you.

Who’s watching?

Figure1

Figure 1 shows the age distribution of Stata YouTube Channel viewers. If you have ever attended a Stata Conference, you will not be surprised by this graph…until you notice the age group at the bottom. I would not have guessed that 13- to 17-year-olds are watching our videos. Perhaps they saw Stata in the movie “Moneyball” with Brad Pitt and wanted to learn more. Or maybe they were influenced by the latest fashion craze sweeping the youth of the world.

What are you watching?

Figure2

We have posted more than 50 videos over a wide range of topics. Figure 2 shows the total number of views for the ten most popular videos. The more popular of the ten are about broad topics. These broader videos are mostly older and have thus had time to accumulate more views.

Even so, these videos receive more views per day currently than do the special topic videos that have been posted more recently. This supports my belief that Stata YouTube Channel viewers tend to be relatively new Stata users who want to learn about general topics, and that means more generic videos in the future. So you and your two post-docs will just have to read the manual if you want to learn how to fit asymmetric power ARCH models with outer-product gradient standard errors.

When are you watching?

Figure3

We usually post new videos on Tuesday mornings, which might lead you to believe that the peak viewing day would also be Tuesday. Figure 3, however, shows us that the average number of views per day (vpd) is higher on Wednesdays, at 420 vpd, and in fact peaks on Thursdays, at 430 vpd, before declining Friday through Sunday.

Figure4

Figure 4 also shows us that late September may not have been the best time to launch the Stata YouTube Channel. Our early momentum in September and October slowed during the November and December holiday seasons. We were, however, pleased to see that 49 of you spent New Year’s Eve watching our videos. Perhaps next year we’ll prepare something more festive just for you!

Where are you watching?

What do the Czech Republic, Pakistan, Uganda, Madagascar, the United Kingdom, the Bahamas, the United States, Montenegro, and Italy have in common? Correct! They are all countries in which you are watching our videos. They are also locations depicted in one of my favorite action films but I’ll leave that to the trivia buffs. I think the most exciting information that we found in our data is that the Stata YouTube Channel is being viewed in 164 countries!

Figure5

You might not be surprised to learn that roughly half of the people watching the videos live in the United States, the United Kingdom, or Canada. The results may be unexpected, however, when we consider the “view rate”, defined as the number of views per 100,000 residents. Figure 5 shows the top 20 countries ranked by view rate among countries with at least four million residents. Denmark had the highest view rate, nearly twice that of Norway, which had the second highest. The view rate in Denmark was more than three times the rate in the US and the UK.
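For readers who want to see the arithmetic, here is a minimal sketch of the view-rate calculation and the ranking behind a chart like Figure 5. It is written in Python purely for illustration. The country names and counts are hypothetical, invented for this example and not taken from the YouTube data, and the function names `view_rate` and `rank_by_view_rate` are my own.

```python
MIN_RESIDENTS = 4_000_000  # cutoff described in the text: at least four million residents


def view_rate(views: int, residents: int) -> float:
    """Return the number of views per 100,000 residents."""
    if residents <= 0:
        raise ValueError("residents must be positive")
    return views * 100_000 / residents


def rank_by_view_rate(data, min_residents=MIN_RESIDENTS):
    """Rank countries by view rate, highest first, skipping small countries."""
    eligible = {
        name: view_rate(views, residents)
        for name, (views, residents) in data.items()
        if residents >= min_residents
    }
    return sorted(eligible.items(), key=lambda item: item[1], reverse=True)


# Hypothetical (views, residents) pairs, made up for illustration only.
countries = {
    "Country A": (1_200, 5_000_000),
    "Country B": (9_000, 60_000_000),
    "Country C": (300, 3_000_000),  # excluded: fewer than four million residents
}

ranking = rank_by_view_rate(countries)
for name, rate in ranking:
    print(f"{name}: {rate:.1f} views per 100,000 residents")
```

Country B has far more views in total, yet Country A ranks first once the counts are scaled by population, which is the same effect that puts smaller countries at the top of the view-rate ranking.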

How are you watching?

You might not think that I would have anything to report about “how” you are watching the videos, but it turns out that 5.2% of you are watching on mobile devices. Perhaps this explains the 13- to 17-year-old demographic or the 49 people watching on New Year’s Eve. Or maybe we are helping you pass the time in the dentist’s office waiting room.

Final thoughts

Six months isn’t much of a milestone. We Stata folk will use any excuse to break out the cake and ice cream. Even so, the Stata YouTube Channel began as an experiment and often experiments do not work out as we would like. This experiment has exceeded our expectations and, as a result, we have started taking requests for videos on our Facebook page and we’ll be adding more videos every week. So thanks for watching and stay tuned!

Now if you will excuse me, I’m going to get some cake and ice cream.

Categories: Resources

2011 Stata Conference recap

The 2011 Stata Conference in Chicago ended last Friday, and a good time was had by all.

The two days had the usual wide array of talks, given by researchers in Econometrics, Sociology, Medicine, and Statistics, together with three of us from StataCorp—Bill Gould, David Drukker, and me.

The conference was held in the Gleacher Center, a fine facility on the banks of the Chicago River in Chicago (of course). I know it sounds mundane, but the acoustics in the lecture hall were excellent, making it very easy for speakers and questions to be heard clearly.

It was really fun talking to old friends and making new ones, both during the breaks and at the conference dinner on Thursday night.

The Wishes and Grumbles session was one of the liveliest in recent memory. These are always fun for us, because they give us a window on design questions in Stata. The extra buzz from Stata 12 being recently announced was an added bonus.

Chris and Gretchen Farrar, who were running the logistics for the meeting, said this was one of the happiest groups they can remember.

Here are the sentiments of Gabi Huiber, who tweeted:

Back from @Stata Conference, telling my wife about it. Her: “You’re glowing. That must have been like a spa retreat for you.”

I couldn’t have said it better.

A gallery of photos from the conference is available on Facebook.

The 2012 Stata Conference will be in San Diego on July 26 and 27. See you there!

Categories: Meetings

Stata at JSM 2011 in Miami Beach, FL

StataCorp invites you to stop by our booth, 404, at JSM 2011, July 31 – August 3, in Miami Beach, FL. StataCorp staff and developers will be on hand to answer any questions you have about Stata, from statistics to programming to licensing. You can also register to win a copy of quad-core Stata/MP.

StataCorp is also presenting three continuing education technology workshops at JSM 2011:

Survey Data Analysis with Stata
Jeffrey Pitblado, Associate Director, Statistical Software
Wednesday, August 3, 8:00 AM – 9:45 AM
Register for Activity Number CE_24T

Multiple Imputation Analysis in Stata
Yulia Marchenko, Associate Director, Biostatistics
Wednesday, August 3, 10:00 AM – 11:45 AM
Register for Activity Number CE_28T

Multilevel and Mixed Models in Stata
Bill Rising, Director of Educational Services
Wednesday, August 3, 1:00 PM – 2:45 PM
Register for Activity Number CE_32T

To register for the workshops, sign up when you register to attend JSM or go to http://www.amstat.org/meetings/jsm/2011/onlineprogram/.

We look forward to seeing you in Miami Beach. Be sure to stop by booth 404 to learn more about Stata or just to visit with the people who make it.

Categories: Meetings

Stata Conferences and Meetings Update

Between now and the end of the year, the annual Stata Conference in the United States will take place along with five other Stata meetings in countries around the world.

Stata conferences and meetings feature talks by both Stata users and Stata developers and provide an opportunity to help shape the future of Stata development by interacting with and providing feedback directly to StataCorp personnel.

The talks range from longer presentations by invited speakers to shorter talks demonstrating the use of Stata in a variety of fields. Some talks are statistical in nature while others focus on data management, graphics, or programming in Stata. New enhancements to Stata created both by users and by StataCorp are often featured in talks.

The full schedule of upcoming meetings is

2011 Mexican Stata Users Group meeting
May 12, 2011

2011 German Stata Users Group meeting
July 1, 2011

Stata Conference Chicago 2011
July 14–15, 2011

2011 UK Stata Users Group meeting
September 15–16, 2011

2011 Spanish Stata Users Group meeting
September 22, 2011

2011 Nordic and Baltic Stata Users Group meeting
November 11, 2011

Click on any meeting title for more information, including programs and registration information.

Categories: Meetings

Stata Conference Chicago 2011 Call for Presentations

The 2011 Stata Conference will be held on July 14 and 15 at the University of Chicago’s Gleacher Center. I’ve enjoyed meeting many enthusiastic Stata users at previous Stata Conferences, and I’m looking forward to seeing both familiar and new faces this year in Chicago.

The organizing committee recently posted a call for presentations on Statalist. That posting is included below.

To submit an abstract for a presentation, or to register for the conference, visit the conference webpage.

From: Phil Schumm <pschumm@uchicago.edu>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: st: Stata Conference 2011 in Chicago
Date: Wed, 15 Dec 2010 08:21:32 -0500

On behalf of the organizing committee, I would like to invite everyone to participate in the Stata Conference 2011, to be held July 14-15th in Chicago. The meeting will be held in the University of Chicago’s Gleacher Center, right on the Chicago river in the heart of downtown. Chicago is a great place to visit in the summer, and the location of the conference will make it easy to take advantage of all the city has to offer.

Below is the call for presentations. This year’s organizing committee consists of Lisa Barrow, Scott Long, Richard Williams, and myself. Please contact one of us if you would like to discuss an idea for a presentation or have questions about the program format. For those of you who have not attended a Stata users group meeting before, giving a presentation is a great opportunity to share what you are doing in Stata with others, and to get feedback from interested (and knowledgeable) users and from StataCorp developers. And, as an added bonus, if your abstract is accepted for presentation, the conference registration fee will be waived (presenting author only).

We look forward to seeing everyone in Chicago!

– Phil

    Announcement and call for presentations

The Stata Conference 2011 will be held at the University of Chicago Graduate School of Business’ Gleacher Center. The Gleacher Center is located downtown on the bank of the Chicago River, just steps from Michigan Avenue and within walking distance of most downtown attractions.

Stata users’ meetings are enjoyable and rewarding for Stata users at all levels and from all disciplines. This year’s program will consist of a mixture of user presentations, longer talks by invited presenters, and talks by StataCorp developers. In addition, the program will include the ever-popular “Wishes and Grumbles” session in which users have an opportunity to share their comments and suggestions directly with developers from StataCorp.

All users are encouraged to submit abstracts for possible presentations. Presentations on any Stata-related topic will be considered, including (but not limited to) the following:

  • new user-written commands, including commands for modeling and estimation, graphical analysis, data management or reporting
  • use or evaluation of existing Stata commands
  • methods for teaching statistics with Stata or Stata use itself
  • case studies of Stata use in novel areas or applications
  • surveys or critiques of Stata facilities in specific fields
  • comparisons of Stata to other software, or use of Stata together with other software

User presentations should be either 15 or 25 minutes long, each followed by 5 minutes for questions. Longer talks will be at the discretion of the scientific committee.

    Submission guidelines

Please submit an abstract of no more than 200 words (ASCII text, no math symbols) by using the web submission form at http://repec.org/chi11/chi11.php. All abstracts must be received by March 14, 2011. Please make sure to include a short, informative title, and to indicate whether you wish to be considered for a short (15-minute) or long (25-minute) presentation. In addition, if your presentation has multiple authors, please identify the presenter. The conference registration fee will be waived for the presenter.

If you would like to discuss an idea for a presentation or have questions about the program format, please contact a member of the scientific organizing committee. This year’s committee consists of

Lisa Barrow (Federal Reserve Bank of Chicago) <lbarrow@frbchi.org>
Scott Long (Indiana University) <jslong@indiana.edu>
Phil Schumm (University of Chicago) <pschumm@uchicago.edu>
Rich Williams (Notre Dame) <richard.a.williams.5@nd.edu>

Presenters will be asked to provide electronic materials related to their talk (a copy of the presentation and any programs/datasets, where applicable) to the organizers so that the materials can be posted on the StataCorp website and in the Stata Users Group RePEc archive.

How to successfully ask a question on Statalist

As everyone knows, I am a big proponent of Statalist, and not just for selfish reasons, although those reasons play a role. Nearly every member of the technical staff at StataCorp, me included, is a member of Statalist. Even when we don’t participate in a particular thread, we do pay attention. The discussions on Statalist play an important role in Stata’s development.

Statalist is a discussion group, not just a question-and-answer forum. Nonetheless, new members often use it to obtain answers to questions, and that works because those questions sometimes become grist for subsequent discussions. In those cases, the questioners not only get answers, they get much more.

One of the best features of Statalist is that, no matter how poorly you ask a question, you are unlikely to be flamed. Not only are the members of Statalist nice — just as are the members of most lists — they act just as nice on the list as they really are. You are unlikely to be flamed if you ask a question poorly, but you are also unlikely to get an answer.

Here is my recipe to increase your chances of getting a helpful response. You should also read the Statalist FAQ before writing your question.

Subject line

Make the subject line of your email meaningful. Some good subject lines are:

Survival analysis

Confusion about -stcox-

Unexpected error from -stcox-

-stcox- output

The first two sentences

The first two sentences are the most important, and they are the easiest to write.

In the first sentence, state your problem in Stata terms, but do not go into details. Here are some good first sentences:

I’m having a problem with -stcox-.

I’m getting an unexpected error message from -stcox-.

I’m using -stcox- and don’t know how to interpret the result.

I’m using -stcox- and getting a result I know is wrong, so I know I’m misunderstanding something.

I want to use -stcox- but don’t know how to start.

I think I want to use -stcox-, but I’m unsure.

I want to use -stcox- but my data is complicated and I’m unsure how to proceed.

I have a complicated dataset that I somehow need to transform into a form suitable for use with -stcox-.

Stata crashed!

I’m having a problem that may be more of a statistics issue than a Stata issue.

The purpose of the first sentence is to catch the attention of members who have an interest in your topic and let the others, who were never going to answer you anyway, move on.

The second sentence is even easier to write:

I am using Stata 11.1 for Windows.

I am using Stata 10 for Mac.

Even if you are certain that it’s unimportant which version of Stata you are using, state it anyway.

Write two sentences and you are done with the first paragraph.

The second paragraph

Now write more about your problem. Try not to be overly wordy, but it’s better to be wordy than curt to the point of unclearness. However you write this paragraph, be explicit. If you’re having a problem making Stata work, tell your readers exactly what you typed and exactly how Stata responded. For example,

I typed -stcox weight- and Stata responded “data not st”, r(119).

I typed -stcox weight sex- and Stata showed the usual output, except the standard error on weight was reported as dot.

The form of the second paragraph — which may extend into the third, fourth, … — depends on what you are asking. Describe the problem concisely but completely. Sacrifice conciseness for completeness if you must or you think it will help. To the extent possible, simplify your problem by getting rid of extraneous details. For instance,

I have 100,000 observations and 1,000 variables on firms, but 4
observations and 3 variables will be enough to show the problem.
My data looks like this

        firm_id     date      x
        -----------------------
          10043       17     12
          10043       18      5
          13944       17     10
          27394       16      1
        -----------------------

I need data that looks like this:

        date    no_of_firms   avg_x
        ---------------------------
          16              1       1
          17              2      11
          18              1       5
        ---------------------------

That is, for each date, I want the number of firms and the
average value of x.
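As an aside, the request in this sample question is a group-by-date summary. A minimal sketch follows, in Python only for illustration. It assumes that "number of firms" means the number of distinct firm_id values on each date, and the function name `summarize_by_date` is hypothetical. In Stata itself, the collapse command is the usual tool for this kind of summary.

```python
from collections import defaultdict

# The four sample observations from the example: (firm_id, date, x)
observations = [
    (10043, 17, 12),
    (10043, 18, 5),
    (13944, 17, 10),
    (27394, 16, 1),
]


def summarize_by_date(rows):
    """Return (date, number of distinct firms, average x) tuples, sorted by date."""
    firms_by_date = defaultdict(set)
    x_by_date = defaultdict(list)
    for firm_id, date, x in rows:
        firms_by_date[date].add(firm_id)  # a set counts each firm once per date
        x_by_date[date].append(x)
    return [
        (date, len(firms_by_date[date]), sum(x_by_date[date]) / len(x_by_date[date]))
        for date in sorted(x_by_date)
    ]


summary = summarize_by_date(observations)
for date, no_of_firms, avg_x in summary:
    print(f"{date:>4} {no_of_firms:>12} {avg_x:>8.1f}")
```

With the four observations shown, date 17 has two firms with an average x of 11, and date 18 has a single observation with x = 5, so its average is 5. Being able to check a tiny example by hand like this is exactly why a small, complete example helps the people answering.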

Here’s another example for the second and subsequent paragraphs:

The substantive problem is this:  Patients enter and leave the 
hospital, sometimes more than once over the period.  I think  
in this case it would be appropriate to combine the 
separate stays so that a patient who was in for 2 days and 
later for 4 days could be treated as being simply in for 6 days,  
except I also record how many separate stays there were, too.

I'm evaluating cost, so for my purposes, I think treating 
cost as proportional to days in hospital, whatever their
distribution, will be adequate.  I'm looking at total days as a
function of number of stays.  The idea is that letting patients out
too early results in an increase in total days, and I want to
measure this.

I realize that more stays and days might also arise simply because
the patient was sicker.  Some patients die, and that obviously 
truncates stay, so I've omitted them from data.  I have disease
codes, but nothing about health status within code.  

Is there a way to incorporate this added information to improve  
the estimates?  I've got lots of data, so I was thinking of 
using death rate within disease code to somehow rank the codes 
as to seriousness of illness, and then using "seriousness" 
as an explanatory variable.  I guess my question is whether  
anyone knows a way I might do this. 

Or is there some way I could estimate the model separately within
disease code, somehow constraining the coefficient on number of 
stays to be the same?  I saw something in the manual about 
stratified estimates, but I'm unsure if this is the same thing.

You’re asking someone to invest their time, so invest yours

Before you hit the send key, read what you have written, and improve it. You are asking for someone to invest their time helping you. Help them by making your problem easy to understand.

The easier your problem is to understand, the more likely you are to get a response. Said differently, if you write in a disorganized way so that potential responders must work just to understand you, much less provide you with an answer, you are unlikely to get a response.

Sparkling prose is not required. Proper grammar is not even required, so nonnative English speakers can relax. My advice is that, unless you are often praised for how clearly and entertainingly you write, write short sentences. Organization is more important than the style of the individual sentences.

Avoid or explain jargon. Do not assume that the person who responds to your question will be in the same field as you. When dealing with a substantive problem, avoid jargon except for statistical jargon that is common across fields, or explain it. Potential responders like it when you teach them something new, and that makes them more likely to respond.

Tone

Write as if you are writing to a colleague whom you know well. Assume interest in your problem. The same thing said negatively: Do not write to list members as you might write to your research assistant, employee, servant, slave, or family member. Nothing is more likely to get you ignored than to write, “I’m busy and really I don’t have time to filter through all the Statalist postings, so respond to me directly, and soon. I need an answer by tomorrow.”

The positive approach, however, works. Just as when writing to a colleague, in general you do not need to apologize, beg, or play on sympathies. Sometimes when I write to colleagues, I do feel the need to explain that I know what I’m asking is silly. “I should know this,” I’ll write, or, “I can’t remember, but …”, or, “I know I should understand, but I don’t”. You can do that on Statalist, but it’s not required. Usually when I write to colleagues I know well, I just jump right in. The same rule works with Statalist.

What’s appropriate

Questions appropriate for Stata’s Technical Services are not appropriate for Statalist, and vice versa. Some questions aren’t appropriate for either one, but those are rare. If you ask an inappropriate question, and ask it well, someone will usually direct you to a better source.

Who can ask, and how

You must join Statalist to send questions. Yes, you can join, ask a question, get your answer, and quit, but if you do, don’t mention this at the outset. List members know this happens, but if you mention it when you ask the question, you’ll sound superior and condescending. Also, stick around for a few days after you get your response, because sometimes your question will generate discussion. If it does, you should participate. You should want to stick around and participate because if there is subsequent discussion, the final result is usually better than the initial reply.

I’ve previously written on how to join (and quit) Statalist. See http://blog.stata.com/2010/11/08/statalist/.

Categories: Resources

2010 Italian Stata meeting recap

David Drukker and I just got back from the Italian Stata Users Group meeting in Bologna, arranged by TStat, the Stata distributor for Italy. It was wonderful, in part because of the beauty of Bologna, and the tasty food. The scientific committee and TStat did great jobs of selecting papers and organizing a smooth, interesting meeting.

The first day of the meeting had talks by users and StataCorp. There was good variety, with topics like investigating disease clustering, classification of prehistoric artifacts, small-area analysis, and the careful interpretation of marginal effects. This year, all the talks were in English — and it was once again amazing to see how well people can present in a second (or third) language. If you would like to see the slides which accompanied the talks, you can find them at http://www.stata.com/meeting/italy10/abstracts.html.

Recently, I have been thinking about how to interpret results from nonlinear models, so I found Maarten Buis’s talk on “Extracting effects from non-linear models” and David’s talk on “Estimating partial effects using margins in Stata 11” really useful. Both Maarten and David have thought carefully about this problem, and each of them presented great introductions and easy-to-apply solutions. What is interesting is that they favor different solutions. Maarten leaned more towards estimating and interpreting ratios that did not vary with the covariates. David recommended using the potential outcome framework, which can be implemented using the margins command. The similarities and differences in these two talks made them even more informative.

As is typical for the Italian meetings, the second day had two training sessions, one given by David on programming your own estimation command in Stata (starting from the basics of Stata programming), and one given by Laura Antolini from the Università di Milano Bicocca on competing risks in survival analysis. Both courses were booked full.

I was a Stata user for 15 years before I started working at Stata, and the most fun parts of the meeting are the same now as when I was a user: the wishes and grumbles followed by the conference dinner. The wishes and grumbles session is always interesting; it shows the wide variety of approaches to using Stata. The conference dinner is always fun, because of the conversation over excellent food. In Italy, of course, the food is beyond excellent; strolling through Bologna on marble sidewalks under colonnades while talking statistics, programming and Stata made the evening, if in an intellectual fashion.

Statalist

I just want to take a moment to plug Statalist. I’m a member and I hope to convince you to join Statalist, too, but even if I don’t succeed, you need to know about the web-based Statalist Archives because they’re a great resource for finding answers to questions about Stata, and you don’t have to join Statalist to access them.

Statalist’s Archives are found at http://www.stata.com/statalist/archive/, or you can click on “Statalist archives” on the right of this blog page, under Links.

Once at the Archives page, you can click on a year and month to get an idea of the flavor of Statalist. More importantly, you can search the archives. The search is powered by Google and works well for highly specific, directed inquiries. For generic searches such as random numbers or survival analysis, however, I prefer to go to Advanced Search and ask that the results be sorted by date instead of relevance. It’s usually the most recent postings that are the most interesting, and by-date results are listed in just that order.

Anyway, the next time you are puzzling over something in Stata, I suggest that …

Categories: Resources

2011 Mexican Stata Users Group meeting — call for presentations

The 2011 Mexican Stata Users Group meeting has been scheduled for May 12, 2011.

The Mexican Stata Users Group meeting is a one-day international conference about the use of Stata in a wide breadth of fields and environments, mixing theory and practice. The bulk of the conference is made up of selected submitted presentations. Together with the keynote address and a featured presentation by a member of StataCorp’s technical staff, these sessions provide fertile ground for learning about statistics and Stata. All users are encouraged to submit abstracts for possible presentations.

For the full meeting details, submission guidelines, and registration information, please see www.stata.com/meeting/mexico11/.

Date: May 12, 2011
Venue: Institute for Economic Research, National Autonomous University of
Mexico, Circuito Mario de la Cueva, Ciudad de la Investigación
Humanidades, Ciudad Universitaria, C.P.04510, México, D.F.
Submission deadline: March 19, 2011
More information: www.stata.com/meeting/mexico11/

Scientific committee:

Alfonso Miranda (chair)
Institute of Education, University of London
Email: A.Miranda@ioe.ac.uk

Armando Sánchez Vargas
Institute for Economic Research, National Autonomous University of Mexico
Email: sva@economia.unam.mx

Graciela Teruel Belismelis
Economics Department, Iberoamerican University
Email: graciela.teruel@s2.uia.mx


Logistics organizer:

MultiON Consulting SA de CV, distributor of Stata in Mexico and Central America
Victoria Leon
Email: vleon@multion.com.mx
Phone: +52 (55) 5559 4050 x 160

Visit Stata at APHA 2010

StataCorp will have a booth in the exhibit hall at the American Public Health Association’s Annual Meeting & Exposition.

APHA’s 2010 meeting will be in Denver, Colorado, from November 6 through 10. For more information, visit www.apha.org/meetings/highlights/.

Stata representatives, including Roberto G. Gutierrez, Director of Statistics, will be available at the Stata booth to answer your questions about all things Stata. Stop by booth #1603 to visit with the people who develop and support the software and to get 20% off your purchase of Stata Press books and Stata Journal subscriptions.

Also make plans to attend our brief seminar, “Stata 11: Statistical Software for the Health Sciences”, led by Roberto G. Gutierrez. The workshop will be held Monday, November 8, at 4:00 PM in the APHA Exhibitor Theatre, booth #2093.