As everyone knows, I am a big proponent of Statalist, and not just for selfish reasons, although those reasons play a role. Nearly every member of the technical staff at StataCorp, me included, is a member of Statalist. Even when we don’t participate in a particular thread, we do pay attention. The discussions on Statalist play an important role in Stata’s development.
Statalist is a discussion group, not just a question-and-answer forum. Nonetheless, new members often use it to obtain answers to questions, and that works because those questions sometimes become grist for subsequent discussions. In those cases, the questioners not only get answers, they get much more.
One of the best features of Statalist is that, no matter how poorly you ask a question, you are unlikely to be flamed. Not only are the members of Statalist nice — just as are the members of most lists — they act just as nice on the list as they really are. You are unlikely to be flamed if you ask a question poorly, but you are also unlikely to get an answer.
Here is my recipe to increase your chances of getting a helpful response. You should also read the Statalist FAQ before writing your question.
Make the subject line of your email meaningful. Some good subject lines are:
Confusion about -stcox-
Unexpected error from -stcox-
The first two sentences
The first two sentences are the most important, and they are the easiest to write.
In the first sentence, state your problem in Stata terms, but do not go into details. Here are some good first sentences:
I’m having a problem with -stcox-.
I’m getting an unexpected error message from -stcox-.
I’m using -stcox- and don’t know how to interpret the result.
I’m using -stcox- and getting a result I know is wrong, so I know I’m misunderstanding something.
I want to use -stcox- but don’t know how to start.
I think I want to use -stcox-, but I’m unsure.
I want to use -stcox- but my data is complicated and I’m unsure how to proceed.
I have a complicated dataset that I somehow need to transform into a form suitable for use with -stcox-.
I’m having a problem that may be more of a statistics issue than a Stata issue.
The purpose of the first sentence is to catch the attention of members who have an interest in your topic and let the others, who were never going to answer you anyway, move on.
The second sentence is even easier to write:
I am using Stata 11.1 for Windows.
I am using Stata 10 for Mac.
Even if you are certain that it’s unimportant which version of Stata you are using, state it anyway.
Write two sentences and you are done with the first paragraph.
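If you are unsure which version you are running, Stata can tell you. The lines below are a minimal sketch of one way to look it up; they assume a reasonably recent Stata in which the c() system values are available.

```stata
* Show the Stata version, flavor, and license details.
about

* The same information is also available piece by piece.
display "Stata version:    " c(stata_version)
display "Operating system: " c(os)
```

Copying the relevant line of that output into your second sentence is enough.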
The second paragraph
Now write more about your problem. Try not to be overly wordy, but it’s better to be wordy than curt to the point of unclearness. However you write this paragraph, be explicit. If you’re having a problem making Stata work, tell your readers exactly what you typed and exactly how Stata responded. For example,
I typed -stcox weight- and Stata responded “data not st”, r(119).
I typed -stcox weight sex- and Stata showed the usual output, except the standard error on weight was reported as dot.
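When the problem can be reproduced in a few lines, pasting those lines lets readers see exactly what you typed. What follows is only a sketch of what such a snippet might look like for the first example above. The data are made up, and the variable names time, died, and weight are assumptions for illustration.

```stata
* Made-up survival data for illustration only; the variable names
* (time, died, weight) are hypothetical, not from a real study.
clear
input time died weight
 5 1 70
 8 0 82
12 1 65
15 1 90
20 0 77
22 1 68
30 0 85
34 1 73
end

* The data have not been declared as survival-time data, so -stcox-
* is expected to stop with the error quoted above.
* -capture noisily- shows the message but lets the do-file continue.
capture noisily stcox weight
display "return code: " _rc

* Declare the survival-time structure, then fit the model.
stset time, failure(died)
stcox weight
```

A snippet like this, together with the output Stata actually produced, leaves little room for misunderstanding.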
The form of the second paragraph — which may extend into the third, fourth, … — depends on what you are asking. Describe the problem concisely but completely. Sacrifice conciseness for completeness if you must or you think it will help. To the extent possible, simplify your problem by getting rid of extraneous details. For instance,
I have 100,000 observations and 1,000 variables on firms, but 4
observations and 3 variables will be enough to show the problem.
My data looks like this:
firm_id   date    x
10043       17   12
10043       18    5
13944       17   10
27394       16    1
I need data that looks like this:
date   no_of_firms   avg_x
16               1       1
17               2      11
18               1       5
That is, for each date, I want the number of firms and the
average value of x.
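A question posed this clearly is easy to answer. As an illustration, here is one way a responder might reply, sketched with -collapse-. It assumes each firm appears at most once per date, so that counting rows is the same as counting firms.

```stata
* Enter the four example observations from the question.
clear
input firm_id date x
10043 17 12
10043 18  5
13944 17 10
27394 16  1
end

* One row per date: the number of nonmissing firm_id values and the
* mean of x. Assumes at most one observation per firm per date.
collapse (count) no_of_firms = firm_id (mean) avg_x = x, by(date)

list, noobs
```

If a firm could appear more than once on the same date, the count would need a different approach, which is the kind of detail worth stating in the question.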
Here’s another example for the second and subsequent paragraphs:
The substantive problem is this: Patients enter and leave the
hospital, sometimes more than once over the period. I think
in this case it would be appropriate to combine the
separate stays so that a patient who was in for 2 days and
later for 4 days could be treated as being simply in for 6 days,
except I also record how many separate stays there were, too.
I'm evaluating cost, so for my purposes, I think treating
cost as proportional to days in hospital, whatever their
distribution, will be adequate. I'm looking at total days as a
function of number of stays. The idea is that letting patients out
too early results in an increase in total days, and I want to
I realize that more stays and days might also arise simply because
the patient was sicker. Some patients die, and that obviously
truncates stay, so I've omitted them from data. I have disease
codes, but nothing about health status within code.
Is there a way to incorporate this added information to improve
the estimates? I've got lots of data, so I was thinking of
using death rate within disease code to somehow rank the codes
as to seriousness of illness, and then using "seriousness"
as an explanatory variable. I guess my question is whether
anyone knows a way I might do this.
Or is there some way I could estimate the model separately within
disease code, somehow constraining the coefficient on number of
stays to be the same? I saw something in the manual about
stratified estimates, but I'm unsure if this is the same thing.
You’re asking someone to invest their time, so invest yours
Before you hit the send key, read what you have written, and improve it. You are asking for someone to invest their time helping you. Help them by making your problem easy to understand.
The easier your problem is to understand, the more likely you are to get a response. Said differently, if you write in a disorganized way so that potential responders must work just to understand you, much less provide you with an answer, you are unlikely to get a response.
Sparkling prose is not required. Proper grammar is not even required, so nonnative English speakers can relax. My advice is that, unless you are often praised for how clearly and entertainingly you write, write short sentences. Organization is more important than the style of the individual sentences.
Avoid or explain jargon. Do not assume that the person who responds to your question will be in the same field as you. When dealing with a substantive problem, avoid jargon except for statistical jargon that is common across fields, or explain it. Potential responders like it when you teach them something new, and that makes them more likely to respond.
Write as if you are writing to a colleague whom you know well. Assume interest in your problem. The same thing said negatively: Do not write to list members as you might write to your research assistant, employee, servant, slave, or family member. Nothing is more likely to get you ignored than to write, “I’m busy and really I don’t have time to filter through all the Statalist postings, so respond to me directly, and soon. I need an answer by tomorrow.”
The positive approach, however, works. Just as when writing to a colleague, in general you do not need to apologize, beg, or play on sympathies. Sometimes when I write to colleagues, I do feel the need to explain that I know what I’m asking is silly. “I should know this,” I’ll write, or, “I can’t remember, but …”, or, “I know I should understand, but I don’t”. You can do that on Statalist, but it’s not required. Usually when I write to colleagues I know well, I just jump right in. The same rule works with Statalist.
Questions appropriate for Stata’s Technical Services are not appropriate for Statalist, and vice versa. Some questions aren’t appropriate for either one, but those are rare. If you ask an inappropriate question, and ask it well, someone will usually direct you to a better source.
Who can ask, and how
You must join Statalist to send questions. Yes, you can join, ask a question, get your answer, and quit, but if you do, don’t mention this at the outset. List members know this happens, but if you mention it when you ask the question, you’ll sound superior and condescending. Also, stick around for a few days after you get your response, because sometimes your question will generate discussion. If it does, you should participate. You should want to stick around and participate because if there is subsequent discussion, the final result is usually better than the initial reply.
I’ve previously written on how to join (and quit) Statalist. See http://blog.stata.com/2010/11/08/statalist/.