
## How to successfully ask a question on Statalist

As everyone knows, I am a big proponent of Statalist, and not just for selfish reasons, although those reasons play a role. Nearly every member of the technical staff at StataCorp, me included, is a member of Statalist. Even when we don’t participate in a particular thread, we do pay attention. The discussions on Statalist play an important role in Stata’s development.

Statalist is a discussion group, not just a question-and-answer forum. Nonetheless, new members often use it to obtain answers to questions, and that works because those questions sometimes become grist for subsequent discussions. In those cases, the questioners not only get answers, they get much more.

One of the best features of Statalist is that, no matter how poorly you ask a question, you are unlikely to be flamed. Not only are the members of Statalist nice — just as are the members of most lists — they act just as nice on the list as they really are. You are unlikely to be flamed if you ask a question poorly, but you are also unlikely to get an answer.

Here is my recipe to increase your chances of getting a helpful response. You should also read the Statalist FAQ before writing your question.

### Subject line

Make the subject line of your email meaningful. Some good subject lines are:

Survival analysis

Unexpected error from -stcox-

-stcox- output

### The first two sentences

The first two sentences are the most important, and they are the easiest to write.

In the first sentence, state your problem in Stata terms, but do not go into details. Here are some good first sentences:

I’m having a problem with -stcox-.

I’m getting an unexpected error message from -stcox-.

I’m using -stcox- and don’t know how to interpret the result.

I’m using -stcox- and getting a result I know is wrong, so I know I’m misunderstanding something.

I want to use -stcox- but don’t know how to start.

I think I want to use -stcox-, but I’m unsure.

I want to use -stcox- but my data is complicated and I’m unsure how to proceed.

I have a complicated dataset that I somehow need to transform into a form suitable for use with -stcox-.

Stata crashed!

I’m having a problem that may be more of a statistics issue than a Stata issue.

The purpose of the first sentence is to catch the attention of members who have an interest in your topic and let the others, who were never going to answer you anyway, move on.

The second sentence is even easier to write:

I am using Stata 11.1 for Windows.

I am using Stata 10 for Mac.

Even if you are certain that it’s unimportant which version of Stata you are using, state it anyway.

Write two sentences and you are done with the first paragraph.

### The second paragraph

Now write more about your problem. Try not to be overly wordy, but it’s better to be wordy than curt to the point of unclearness. However you write this paragraph, be explicit. If you’re having a problem making Stata work, tell your readers exactly what you typed and exactly how Stata responded. For example,

I typed -stcox weight- and Stata responded “data not st”, r(119).

I typed -stcox weight sex- and Stata showed the usual output, except the standard error on weight was reported as dot.

The form of the second paragraph — which may extend into the third, fourth, … — depends on what you are asking. Describe the problem concisely but completely. Sacrifice conciseness for completeness if you must or you think it will help. To the extent possible, simplify your problem by getting rid of extraneous details. For instance,

I have 100,000 observations and 1,000 variables on firms, but 4
observations and 3 variables will be enough to show the problem.
My data looks like this

firm_id     date      x
-----------------------
10043       17     12
10043       18      5
13944       17     10
27394       16      1
-----------------------

I need data that looks like this:

date    no_of_firms   avg_x
---------------------------
16              1       1
17              2      11
18              1       5

That is, for each date, I want the number of firms and the
average value of x.
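
If you want to make a small example like this one easy for responders to try, one option is to post the commands that create it. The sketch below is only an illustration: it types in the four observations shown above with -input- and then shows one way the requested result could be produced with -collapse-. The storage types are assumptions, and the count is only the number of firms if each firm appears at most once per date.

```stata
* Toy data from the example above (storage types are assumed)
clear
input long firm_id int date double x
10043 17 12
10043 18  5
13944 17 10
27394 16  1
end

* Count of nonmissing firm_id and mean of x within each date.
* Assumes at most one observation per firm per date.
collapse (count) no_of_firms=firm_id (mean) avg_x=x, by(date)

list, noobs
```

Posting commands like these lets a responder recreate the data by copying and pasting, rather than retyping a table by hand.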


Here’s another example for the second and subsequent paragraphs:

The substantive problem is this:  Patients enter and leave the
hospital, sometimes more than once over the period.  I think
in this case it would be appropriate to combine the
separate stays so that a patient who was in for 2 days and
later for 4 days could be treated as being simply in for 6 days,
except I also record how many separate stays there were, too.

I'm evaluating cost, so for my purposes, I think treating
cost as proportional to days in hospital, whatever their
distribution, will be adequate.  I'm looking at total days as a
function of number of stays.  The idea is that letting patients out
too early results in an increase in total days, and I want to
measure this.

I realize that more stays and days might also arise simply because
the patient was sicker.  Some patients die, and that obviously
truncates stay, so I've omitted them from data.  I have disease
codes, but nothing about health status within code.

Is there a way to incorporate this added information to improve
the estimates?  I've got lots of data, so I was thinking of
using death rate within disease code to somehow rank the codes
as to seriousness of illness, and then using "seriousness"
as an explanatory variable.  I guess my question is whether
anyone knows a way I might do this.

Or is there some way I could estimate the model separately within
disease code, somehow constraining the coefficient on number of
stays to be the same?  I saw something in the manual about
stratified estimates, but I'm unsure if this is the same thing.


### You’re asking someone to invest their time, so invest yours

Before you hit the send key, read what you have written, and improve it. You are asking for someone to invest their time helping you. Help them by making your problem easy to understand.

The easier your problem is to understand, the more likely you are to get a response. Said differently, if you write in a disorganized way so that potential responders must work just to understand you, much less provide you with an answer, you are unlikely to get a response.

Sparkling prose is not required. Proper grammar is not even required, so nonnative English speakers can relax. My advice is that, unless you are often praised for how clearly and entertainingly you write, write short sentences. Organization is more important than the style of the individual sentences.

Avoid or explain jargon. Do not assume that the person who responds to your question will be in the same field as you. When dealing with a substantive problem, avoid jargon except for statistical jargon that is common across fields, or explain it. Potential responders like it when you teach them something new, and that makes them more likely to respond.

### Tone

Write as if you are writing to a colleague whom you know well. Assume interest in your problem. The same thing said negatively: Do not write to list members as you might write to your research assistant, employee, servant, slave, or family member. Nothing is more likely to get you ignored than to write, “I’m busy and I really don’t have time to filter through all the Statalist postings, so respond to me directly, and soon. I need an answer by tomorrow.”

The positive approach, however, works. Just as when writing to a colleague, in general you do not need to apologize, beg, or play on sympathies. Sometimes when I write to colleagues, I do feel the need to explain that I know what I’m asking is silly. “I should know this,” I’ll write, or, “I can’t remember, but …”, or, “I know I should understand, but I don’t”. You can do that on Statalist, but it’s not required. Usually when I write to colleagues I know well, I just jump right in. The same rule works with Statalist.

### What’s appropriate

Questions appropriate for Stata’s Technical Services are not appropriate for Statalist, and vice versa. Some questions aren’t appropriate for either one, but those are rare. If you ask an inappropriate question, and ask it well, someone will usually direct you to a better source.

### Who can ask, and how

You must join Statalist to send questions. Yes, you can join, ask a question, get your answer, and quit, but if you do, don’t mention this at the outset. List members know this happens, but if you mention it when you ask the question, you’ll sound superior and condescending. Also, stick around for a few days after you get your response, because sometimes your question will generate discussion. If it does, you should participate. You should want to stick around and participate because if there is subsequent discussion, the final result is usually better than the initial reply.

I’ve previously written on how to join (and quit) Statalist. See http://blog.stata.com/2010/11/08/statalist/.

• Aramesh_sch

how could I enter a fix variable in the clogit. in output the variable is omited

• Anonymous

I am trying to update my STATA Intercool 9.0 and after typing ‘update query’ and ‘update all’, I receive a r.(603) error message after downloading all the files and during the installation stage. I have never updated it since 2005 because I have not used it extensively.

Below is the output. How can I ensure proper installation of the updates?

. help whatsnew

. update query
(contacting http://www.stata.com)

Stata executable
folder:               C:\Program Files\Stata9
name of file:         wstata.exe
currently installed:  05 Jul 2005
latest available:     20 Jul 2007

names of files:       (various)
currently installed:  05 Jul 2005
latest available:     20 Jul 2007

Comment
Stata 10, a new release, is available.
For details, point your browser at
http://www.stata.com/stata10/

Recommendation
Type -update all-

…..
….

4.  examining files

5.  installing files

r(603);

I want to estimate a meta-frontier stochastic production function, but I do not know how to read a Shazam program written for this purpose. I would like some help on how to read a Shazam program.

• Mario Marques

I am working with count data and I need to do my estimates using zero truncated negative binomial model, however results do not converge. How can I solve this.

• HayY

Nice one. I want to run Meta-frontier analysis. Can someone help me please?

• aloff

Hi everyone! I was wondering if you could help me I got this output and it seems strange….

What is wrong with this output? chi2 values???
Thank you a ton!

Structural |

t4_YARC_Acc_ASc_mean chi2 = .

. estat gof, stats (all)

----------------------------------------------------------------------------
Fit statistic        |      Value   Description
---------------------+------------------------------------------------------
Likelihood ratio     |
          chi2_ms(0) |      0.000   model vs. saturated
            p > chi2 |          .
          chi2_bs(3) |     57.452   baseline vs. saturated
            p > chi2 |      0.000
---------------------+------------------------------------------------------
Population error     |
               RMSEA |      0.000   Root mean squared error of approximation
 90% CI, lower bound |      0.000
         upper bound |      0.000
              pclose |      1.000   Probability RMSEA <= 0.05
---------------------+------------------------------------------------------
Information criteria |
                 AIC |   2082.463   Akaike's information criterion
                 BIC |   2093.705   Bayesian information criterion
---------------------+------------------------------------------------------
Baseline comparison  |
                 CFI |      1.000   Comparative fit index
                 TLI |      1.000   Tucker-Lewis index
---------------------+------------------------------------------------------
Size of residuals    |
                SRMR |      0.000   Standardized root mean squared residual
                  CD |      0.560   Coefficient of determination

• Alesandra

Hi. I’d like to run a Markov chain. I don’t even know how to organise data.

Thank you!

• Anand

Hi, I am doing censored quantile regression using the cqiv command. How can I retrieve the standard errors of estimates? Any kind of help is appreciated

• Luke

Hello everyone, I am having troubles with reshaping my data. The data currently looks as follows:

Firm Variable 1990 1991 1992 etc.

x 1

x 2

y 1

y 2

z 1

z 2

I would like the data to look as follows:

Firm Year Variable1 Variable2

x 1990

x 1991

x 1992

y 1990

y 1991

I started off with the command: reshape long y, i(Name Variable) j(Year)

Now I have all my data long.
Next I want to make my variables wide, I already encoded the variable names because these are string variables, with the command: reshape wide v, i(Name Year y) j(Variable).
But when I do this it screws up my database, it does not correctly assign my data across variables, firms and year.
If anybody could help, it would really be appreciated.

• Kat

Hey everyone, I have a bit of trouble with plotting my continuous interactions using the marginsplot command after xtlogit. The coefficient for the interaction comes up positive, but when I use margins and then marginsplot, the actual graph indicates a negative effect, which is quite confusing. Variables are coded correctly, I have checked that. I presume it must be a flaw in my margins command. I use:

margins, predict(pu0) at(var1=(x (x) x) var2=(x (x) x))

I have also tried:

margins c.var1#c.var2

But Stata gives me an error message ‘variable c1 not found r(111)’

Any suggestions are very much appreciated. Thanks very much in advance for your help with this.

• PMC

Q1. Is there a limit of countries to be used for cross-sectional analysis?
Q2. Can I use 7 countries across 20 years for panel regression?

• MCM

I want to know if it’s possible to insert both short-run and long-run restrictions in SVAR on stata (12)

• Michael Sikivie

As simple as it sounds, I’m having trouble merging. 0 observations are merging. I’m merging on the variable iso_o and as far as I can see every observation in the master dataset has a value of iso_o that’s spot on identical with the value for some observation in the using set. For example, after merging there are observations with iso_o==AFG in the master only category as well as an observation of iso_o==AFG in the using only category but none matched. I checked that they’re both ASCII in Excel by using the function code and they both give the same ASCII code. Although it’s a string, using the encode command in Stata and using the new variable generated doesn’t work, as well.

sort iso_o year
merge m:1 iso_o using numeric_letter_code

* The using data set really is called numeric_letter_code. Long story.

Result                           # of obs.
-----------------------------------------
not matched                        11,588
    from master                    11,400  (_merge==1)
    from using                        188  (_merge==2)

matched                                 0  (_merge==3)
-----------------------------------------

• Anne-Laure

Confusion between the options “robust” and “cluster”.

If there is heteroskedasticity, there is the option “robust”. And to have an estimator that is robust to the correlation of disturbances within groups, I can use the option “cluster”. So the option “cluster” allows the correlation of disturbances within cluster. In my case, I would say that my residuals are probably correlated between individuals in the same country. But I already use a country fixed effect, so my residuals should not be anymore correlated between individuals in the same country. Can I say that the option “cluster” is not useful anymore, so I should use only the “robust” option?

PS: My dependent variable is qualitative. I am using 3 models: a Linear Probability Model, a Logit and a Probit.

• Noam

I think I want to use “dirifit” to test differences between four dependent proportions (e.g., the proportion of presses on each one of four relevant keys during a task) but without another explanatory variable (e.g., gender), does anyone can advise me what is the correct Stata command?

In my data each participant has four dependent proportions, e.g., .30, .25, .10, .35 (sum=1). What significance test and Stata command should I use to test the difference between those proportions?

thank you!

Noam

• Hello everyone,

We see that quite a few people have been posting questions after this blog entry.

The comment fields after the blog entries here are for questions regarding each particular blog entry — not for general Stata questions. General Stata questions should typically be submitted to Stata Technical Services or to Statalist.

It would be better for the questions below to be directed either to Stata Technical Services (see http://www.stata.com/support/tech-support/ for details on how to contact Technical Services) or to Statalist (read this blog entry to determine whether your question is appropriate for Statalist, and if so, how to post it, or whether it should be directed to Stata Technical Services).

Thank you!

• Oonagh Jones

hello. i am trying to find stata results for my project. i have chosen to do it on this: http://fmwww.bc.edu/ec-p/data/wooldridge/phillips.des. i need to find the robust regression of the time series but everything i type is wrong. can someone help me? thank you

• Sunny Singh

Hello, I’m trying to do rolling regression for the nonlinear equation (exponential). My functional form (stata form) for nonlinear equation is:

nl(weeklygrowth=({alpha1}+{alpha2}*day+{alpha3}*day2+{alpha4}*day3+{alpha5}*day4{a1}*m1+ {a2}*m2+ {a3}*m3+{a4}*m4+{a5}*m5+{a6}*m6+{a7}*m7+{a8}*m8+{a9}*m9+{a10}*m10+{a11}*m11)exp(-1{beta1=0.0005}*t))

Please tell me how to do rolling regression for this equation (window=522).

Best Regards, Sunny

• vincent

am getting problem my stata11.1 version have no mdraws and egefunction mvp i need your help

TEST FOR EVOLVING EFFICIENCY

I would be grateful if you could assist me to undertake the ”test for evolving efficiency (TEE)” in stata, as used in the paper “THE CHANGING EFFICIENCY OF AFRICAN STOCK MARKETS” by Smith and Jefferis (2005) South African Journal of Economics Vol. 73:1 by kindly providing me with the syntax. I have tried using sspace estimation but could not draw the graphs showing evolving efficiency

• Hana

Latent Class Regression
What are the Stata commands for latent Regression ( LCM) if my dependent variable is discrete with value of ( 0 or 1) ?

• bill

i am trying two days now to post a thread . however when i send an email a get a message saying my message was bounced. what am i doing wrong? this is what i want to send

this is my first time using this service so apologies

if i make something wrong. i am trying to replicate

the results of the paper of mody and taylor 2003.

in this paper they have data on real industrial

production and want to see if a certain spread

can predict the growth in industrial production

when it is stripped of alternately its demand side

and supply side components. i get to understand

this is dome kind of decomposition. unfortunately i

have no idea how to do it. any help would be very

much appreciated. is there a command i should use?

thank you in advance

to tell you the truth so far this community was a big dissapointment. why should it be so difficult to send a message? a simple message of 8 sentences?? anyway if you could let me know what i am doing wrong sending an email please do.

• Omnia Mansour

Imputing missing values

Hi

I have a problem in imputing missing values in my panel dataset, all my variables except for one have missing values!

The problem occurs when I used usual impute, some of the imputed values are negative for variables that cant take negative values “unemployment, taxes” and another problem occurs with the imputed values for a DISCRETE variable of specified range “taking vaules… (0, 1 , 2, …6)
I am confused, how to impute missing values in Rational pattern ( +ve and within range also).

Thanks for your time

• yass

How can I calculate a rolling-window percentile over a range of data? That is, how can I get the equivalent of Excel's Percentile(array, k) in Stata to produce a list of percentiles? I have a list of 1000 observations and would like to calculate the 99th percentile over each rolling window of 250 observations. Thanks.

• KUMAORON -OM-

Hello,
I am working on national household survey data. I constructed the consumption aggregate and estimated the per capita consumption per annum. The entire dataset is household level data. The data is also weighted (household weights). The data has health insurance coverage per household (i.e. number of persons covered by health insurance per household). I have divided the population into quintiles (using the command: xtile quintile=cons_pc[aw=weights], n(5)). The problem is I want to use stata to calculate health insurance coverage in the entire population across quintiles applying the household weights. I have tried the command: tab quintile nhis[aw=weights], but it doesn’t give me exactly what I am looking for. I gives me values at the household level.

Thanks.

• Andrea

Hello! I urgently need your help pleeeeaseeee…i am trying to make a ttest of 2 variables and have coded it like this:
——————————–
use "I_final.dta"

append using "JP_final.dta"

drop if year!= 2004 & year!=2005

gen Group = (nation == "ITALY")
tabulate Group, gen(g)

gen Time = (year == 2005)
tabulate Time, gen(t)

////////////////////////ttest Difference1/////////////////////////////////
gen Variable1=.
replace Variable1=ZERORET[_n] if Group==1 & Time == 1
replace Variable1=0 if Variable1==.

mvdecode Variable1, mv(0=.a)

gen Variable2=.
replace Variable2=ZERORET[_n] if Group==1 & Time == 0
replace Variable2=0 if Variable2==.

mvdecode Variable2, mv(0=.b)

ttest Variable1=Variable2

——————————–
Variable1 just contains the values of ZERORET in the case of group=1 and time=1. Variable2 just contains the values of ZERORET in the case of group=1 and time=0. The other values are going to be zero. For this reason, i have decoded the numeric “zero” in a missing value “.a” and “.b”
the problem appears in the last code: it is appearing “no observations”….although Variable 1 and Variable 2 have numeric values (and missing values, which i have coded as a missing value)

i am not getting the problem.

pleas i need your help.

Thank you very much for now!

Andrea

• Maria

Hi, anyone has the ado for pscore2????

• 1245

owqkrjwqi

• Jair Araujo

i want too.

• Belal Fallah

what is the command to estimate a propensity score match when the outcome is binary?

• chanarcisse

Afternoon to all the followers.
Please, I am blocked in my econometric analysis.
How can I conduct an ARDL time series analysis using stata13?
Especially, how can I conduct cointegration test using pesaran et al (2001) approch; I mean Bound Tetsting approach.

• chanarcisse

I would like if possible to have required stata commands

how to interpret the results of wald test after ivprobit

Wald test of exogeneity: chi2(1) = 0.32 Prob > chi2 = 0.5716

what does this means ?

• Gaye del LO

Hello dear all,
I got these results of hausman and Hsiai IIA tests:

**** Hausman tests of IIA assumption (N=38555)

Ho: Odds(Outcome-J vs Outcome-K) are independent of other alternatives.

 Omitted |     chi2   df   P>chi2   evidence
---------+------------------------------------
    Bois |   -1.824   14      ---   ---
Electric |   72.157   14    0.000   against Ho
     Gaz | -117.428   14      ---   ---
----------------------------------------------
Note: If chi2<0, …

 Omitted |          …          …     chi2   df   P>chi2   evidence
---------+---------------------------------------------------------
    Bois |  -1.86e+04  -1.85e+04   10.477   16    0.841   for Ho
Electric |  -6825.505  -6816.923   17.165   16    0.375   for Ho
     Gaz |  -7606.779  -7599.185   15.189   16    0.511   for Ho
-------------------------------------------------------------------
Can you conclude that IIA hypothesis is hold?
thanks for helps,
Gaye,

• robert prince

Hi there,
I need to find the number of days between dates, calculating from the first date
ID Date
1 11nov2006
1 26may2007
1 26may2007
1 30may2007

gen newvar= date-date[_n-1] does NOT work since it will go back to 0 for 26may2007.

Does anyone know how to do this? Thanks