Situation: Yes/No or categorical outcomes compared across groups
0-1 coding: Ensure that categorical outcome and exposure variables are coded as:
0 = no, 1 = yes. While this is not required for chi-square tests, logistic regression, etc., it is a prerequisite for the epidemiological table
commands such as cs and cc, which report results as risk differences, risk ratios, odds ratios, etc.
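If a yes/no variable arrives with some other coding, it can be recoded before running cs or cc. A minimal sketch, assuming the outcome is currently coded 1 = yes, 2 = no; that coding, the new variable name outcome01, and the label name yesno are assumptions made for illustration:

```stata
* Assumption: outcomeVar is currently coded 1 = yes, 2 = no
recode outcomeVar (1 = 1) (2 = 0), generate(outcome01)

* Attach readable labels to the new 0/1 variable
label define yesno 0 "no" 1 "yes"
label values outcome01 yesno

* Cross-check old against new coding, including missing values
tab outcomeVar outcome01, missing
```

Generating a new variable rather than overwriting the original keeps the raw coding available for checking.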
Ensure that categorical groups are coded in increments of 1. That is,
0=illiterate, 1=primary school, 3=middle school is bad;
0=illiterate, 1=primary school, 2=middle school is good.
How do you check it? Use
tab1 var, nolabel
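If the check shows a gap in the codes, one way to close it is with recode. A sketch using a hypothetical variable educ coded 0, 1, 3 as in the bad example; the names educ, educ_fixed and educlbl are assumptions:

```stata
* Assumption: educ is coded 0 = illiterate, 1 = primary school, 3 = middle school
recode educ (3 = 2), generate(educ_fixed)

* Relabel the corrected variable
label define educlbl 0 "illiterate" 1 "primary school" 2 "middle school"
label values educ_fixed educlbl

* Confirm the codes now run 0, 1, 2
tab1 educ_fixed, nolabel
```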
Understand the data
A key step is to understand which participant groups have higher or lower levels of outcomes.
tab outcomeVar groupVar, col
Check whether the two groups have the same proportion of the outcome
prtest outcomeVar, by(groupVar)
Hypothesis testing using Chi-square
tab outcomeVar groupVar, col chi
Use Fisher's exact test if one or more cells have an expected count of < 5 (adding the expected option to tab displays the expected counts)
tab outcomeVar groupVar, col chi exact
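To see when the exact test matters, the immediate command tabi can be run on a small table typed in by hand. The counts below are made up for illustration only:

```stata
* Hypothetical 2x2 table: rows are groups, columns are outcome yes / no
* Expected counts are 5.5 and 4.5 in each row, so two cells fall below 5
tabi 3 7 \ 8 2, expected chi2 exact
```

With counts this small, the chi-square and Fisher's exact p-values can differ noticeably, and the exact result is the one to report.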
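Wondering why you get the same p value from Chi-square and prtest? That is expected: for two groups and a yes/no outcome, the square of the prtest z statistic equals the Pearson chi-square statistic. The advantage of the prtest command is that you also get the 95% CIs of the proportions and of their difference.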
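The agreement between the two tests can be checked with the immediate forms tabi and prtesti. A sketch with made-up counts (30 of 100 with the outcome in one group, 45 of 100 in the other):

```stata
* Hypothetical counts: group 1 has 30 yes / 70 no, group 2 has 45 yes / 55 no
tabi 30 70 \ 45 55, chi2

* Same data as a two-sample test of proportions; count means 30 and 45 are counts
prtesti 100 30 100 45, count
```

For these numbers the Pearson chi-square is 4.8 and the prtest z is about -2.19, whose square is also 4.8, so both commands report the same two-sided p value.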
Comparing yes/no outcome across two groups only
Odds ratio calculation:
cc outcomeVar groupVar
or
logit outcomeVar i.groupVar, or
Risk ratio calculation:
cs outcomeVar groupVar
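When only the four cell counts are available, the immediate forms csi and cci give the same measures. A sketch with made-up counts; the argument order is exposed cases, unexposed cases, exposed noncases, unexposed noncases:

```stata
* Hypothetical counts: 30 of 100 exposed and 45 of 100 unexposed have the outcome
* Risk ratio = 0.30 / 0.45 = 0.667, risk difference = -0.15
csi 30 45 70 55

* Odds ratio = (30 * 55) / (45 * 70) = 0.524
cci 30 45 70 55

* On a dataset, cs can also report the odds ratio alongside the risk measures
cs outcomeVar groupVar, or
```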
Try these out!
preserve
use https://www.stata-press.com/data/r17/pneumoniacrt, clear
describe pneumonia vaccine
codebook pneumonia vaccine
count
tab pneumonia vaccine, col
tab pneumonia vaccine, col chi
prtest pneumonia, by(vaccine)
cc pneumonia vaccine
cs pneumonia vaccine
logit pneumonia i.vaccine, or
restore
Comparing yes/no outcome across three or more groups
In this case, we can use Mantel-Haenszel techniques.
mhodds is your friend. Alternatively, you could just run a logistic regression.
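Both routes can be sketched as follows, assuming a 0/1 outcome and a group variable coded 0, 1, 2. The name groupVar3 is hypothetical, and the pair of levels compared is chosen only as an example:

```stata
* Assumption: outcomeVar is 0/1 and groupVar3 takes the values 0, 1, 2
tab outcomeVar groupVar3, col chi

* With more than two levels, mhodds reports a test for trend and an
* odds ratio for a one-unit increase in groupVar3
mhodds outcomeVar groupVar3

* Odds ratio for a specific pair of levels: level 2 versus level 0
mhodds outcomeVar groupVar3, compare(2,0)

* Logistic regression with level 0 as the reference group
logit outcomeVar i.groupVar3, or

* Joint test that the odds are the same across all groups
testparm i.groupVar3
```

The logistic regression gives one odds ratio per non-reference group, while the default mhodds output summarises the trend across ordered groups, so the choice depends on whether the groups have a natural order.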