Epidemiology & Technology

Stata and ML resources

Some links to Stata and ML resources

Conference Articles / Presentations / Stata Journal

Stata Blog Articles from 2020

Other Resources

Seven Steps in developing a prediction model

Source: Towards better clinical prediction models: seven steps for development and an ABCD for validation – PMC (nih.gov)

  1. Problem definition and data inspection / research question
    • What is the precise research question?
    • How were patients selected?
    • What is already known about the predictors?
    • Define the Predictors
    • Were the predictors reliably and completely measured?
    • Define the outcomes of Interest
  2. Coding of predictors
    • Categorical predictors
    • Continuous predictors
  3. Model specification
    • Selection of main effects?
    • Assessment of assumptions?
    • Overfitting?
  4. Model estimation – Estimate model parameters
    • Shrinkage included?
  5. Model performance:
    • Calibration: calibration plot
      • A: alpha – Calibration-in-the-large – Intercept in plot; the agreement between observed endpoints and predictions
      • B: beta – Calibration slope – Regression slope in plot; related to shrinkage of regression coefficients
    • Discrimination: the ability of the model to distinguish a patient with the endpoint from a patient without
      • Concordance C-statistic
        • ROC curve – For a binary endpoint, c is identical to the area under the receiver operating characteristic (ROC) curve
        • For a time-to-event endpoint, such as survival, the calculation of c may be affected by the amount of incomplete follow-up (censoring)
        • Probability of correct classification for a pair of subjects with and without the endpoint
        • A better discriminating model has more spread between the predictions than a poorly discriminating model
    • Clinical usefulness:
      • D – Decision-curve analysis – Net true-positive classification rate by using a model over a range of thresholds
      • Net benefit (NB)
  6. Model validation
    • Internal validity
    • External validity
    • Techniques used: split sample, cross-validation, etc.
  7. Model presentation
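
The performance measures in step 5 (A, B, C and D) can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the source paper and not a Stata routine: the function names (`calibration`, `c_statistic`, `net_benefit`) and the toy data are hypothetical, predictions are assumed to lie strictly between 0 and 1, and calibration-in-the-large is taken to be the intercept of a logistic fit with the slope fixed at 1. Net benefit uses the usual form TP/n − (FP/n) × pt/(1 − pt) for a threshold pt.

```python
import math
from itertools import product


def logit(p):
    return math.log(p / (1 - p))


def expit(z):
    return 1 / (1 + math.exp(-z))


def calibration(y, p, iters=50):
    """Return (A, B): calibration-in-the-large and calibration slope.

    B is the slope of a logistic fit of y on logit(p) (Newton-Raphson).
    A is the intercept of a second fit with the slope fixed at 1.
    """
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = expit(a + b * xi)
            w = mu * (1 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    a_large = 0.0
    for _ in range(iters):
        mus = [expit(a_large + xi) for xi in x]
        grad = sum(yi - mu for yi, mu in zip(y, mus))
        hess = sum(mu * (1 - mu) for mu in mus)
        a_large += grad / hess
    return a_large, b


def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe, pn in product(events, nonevents):
        if pe > pn:
            score += 1.0
        elif pe == pn:
            score += 0.5
    return score / (len(events) * len(nonevents))


def net_benefit(y, p, threshold):
    """Net benefit of treating everyone with prediction >= threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: 8 patients, predictions of 0.1 or 0.9.
y_toy = [0, 0, 0, 1, 0, 1, 1, 1]
p_toy = [0.1] * 4 + [0.9] * 4
print(calibration(y_toy, p_toy))
print(c_statistic(y_toy, p_toy))
print(net_benefit(y_toy, p_toy, 0.5))
```

In this toy data the observed event rates are 25% and 75% where the model predicted 10% and 90%, so the calibration slope comes out near 0.5 (predictions too extreme), while the c-statistic is 0.75 and the net benefit at a threshold of 0.5 is 0.25. A decision curve would repeat `net_benefit` over a range of thresholds.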
