Clinical trials with missing data : a guide for practitioners


Where to find it

Health Sciences Library — Books (Basement)

Call Number
QV 771.4 O41c 2014
Status
Available

Authors, etc.

Names:
O'Kelly, Michael; Ratitch, Bohdana; Hughes, Sara; Davis, Sonia; Hernández, Belinda; Lipkovich, Ilya

Summary

This book provides practical guidance for statisticians, clinicians, and researchers involved in clinical trials in the biopharmaceutical industry and in medical and public health organisations. Academics and students needing an introduction to handling missing data will also find this book invaluable.

The authors describe how missing data can affect the outcome and credibility of a clinical trial, show by examples how a clinical team can work to prevent missing data, and present the reader with approaches to address missing data effectively.

The book is illustrated throughout with realistic case studies and worked examples, and presents clear and concise guidelines to enable good planning for missing data. The authors show how to handle missing data in a way that is transparent and easy to understand for clinicians, regulators and patients. New developments are presented to improve the choice and implementation of primary and sensitivity analyses for missing data. Many SAS code examples are included, giving the reader a toolbox for implementing analyses under a variety of assumptions.
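As a flavour of the SAS toolbox the book describes, the sketch below shows a generic mixed model for repeated measures (MMRM) analysis of change from baseline under a missing-at-random assumption, the approach covered in Chapter 5. This is not code reproduced from the book; the dataset and variable names (adeff, chg, base, trt, visit, subjid) are hypothetical placeholders.

    /* Minimal MMRM sketch (generic, not taken from the book): categorical time,
       treatment-by-visit interaction, unstructured covariance over visits.
       Dataset adeff and variables chg, base, trt, visit, subjid are hypothetical. */
    proc mixed data=adeff;
       class trt visit subjid;
       model chg = base trt visit trt*visit / ddfm=kr;   /* Kenward-Roger denominator df */
       repeated visit / subject=subjid type=un;          /* unstructured within-subject covariance */
       lsmeans trt*visit / diff cl;                      /* treatment contrasts at each visit */
    run;

Chapters 6 and 7 then use multiple imputation (typically via PROC MI and PROC MIANALYZE) to stress-test such MAR-based results under missing-not-at-random assumptions, for example delta adjustment and tipping point analyses.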

Contents

  • Preface p. xv
  • References p. xvii
  • Acknowledgments p. xix
  • Notation p. xxi
  • Table of SAS code fragments p. xxv
  • Contributors p. xxix
  • 1 What's the problem with missing data? p. 1 Michael O'Kelly and Bohdana Ratitch
  • 1.1 What do we mean by missing data? p. 2
  • 1.1.1 Monotone and non-monotone missing data p. 3
  • 1.1.2 Modeling missingness, modeling the missing value and ignorability p. 4
  • 1.1.3 Types of missingness (MCAR, MAR and MNAR) p. 4
  • 1.1.4 Missing data and study objectives p. 5
  • 1.2 An illustration p. 6
  • 1.3 Why can't I use only the available primary endpoint data? p. 7
  • 1.4 What's the problem with using last observation carried forward? p. 9
  • 1.5 Can we just assume that data are missing at random? p. 11
  • 1.6 What can be done if data may be missing not at random? p. 14
  • 1.7 Stress-testing study results for robustness to missing data p. 15
  • 1.8 How the pattern of dropouts can bias the outcome p. 15
  • 1.9 How do we formulate a strategy for missing data? p. 16
  • 1.10 Description of example datasets p. 18
  • 1.10.1 Example dataset in Parkinson's disease treatment p. 18
  • 1.10.2 Example dataset in insomnia treatment p. 23
  • 1.10.3 Example dataset in mania treatment p. 28
  • Appendix 1.A Formal definitions of MCAR, MAR and MNAR p. 33
  • References p. 34
  • 2 The prevention of missing data p. 36 Sara Hughes
  • 2.1 Introduction p. 36
  • 2.2 The impact of "too much" missing data p. 37
  • 2.2.1 Example from human immunodeficiency virus p. 38
  • 2.2.2 Example from acute coronary syndrome p. 38
  • 2.2.3 Example from studies in pain p. 39
  • 2.3 The role of the statistician in the prevention of missing data p. 39
  • 2.3.1 Illustrative example from HIV p. 41
  • 2.4 Methods for increasing subject retention p. 48
  • 2.5 Improving understanding of reasons for subject withdrawal p. 49
  • Acknowledgments p. 49
  • Appendix 2.A Example protocol text for missing data prevention p. 49
  • References p. 50
  • 3 Regulatory guidance - a quick tour p. 53 Michael O'Kelly
  • 3.1 International conference on harmonization guideline: Statistical principles for clinical trials: E9 p. 54
  • 3.2 The US and EU regulatory documents p. 55
  • 3.3 Key points in the regulatory documents on missing data p. 55
  • 3.4 Regulatory guidance on particular statistical approaches p. 57
  • 3.4.1 Available cases p. 57
  • 3.4.2 Single imputation methods p. 57
  • 3.4.3 Methods that generally assume MAR p. 59
  • 3.4.4 Methods that are used assuming MNAR p. 60
  • 3.5 Guidance about how to plan for missing data in a study p. 62
  • 3.6 Differences in emphasis between the NRC report and EU guidance documents p. 63
  • 3.6.1 The term "conservative" p. 63
  • 3.6.2 Last observation carried forward p. 63
  • 3.6.3 Post hoc analyses p. 63
  • 3.6.4 Non-monotone or intermittently missing data p. 63
  • 3.6.5 Assumptions should be readily interpretable p. 65
  • 3.6.6 Study report p. 65
  • 3.6.7 Training p. 65
  • 3.7 Other technical points from the NRC report p. 66
  • 3.7.1 Time-to-event analyses p. 66
  • 3.7.2 Tipping point sensitivity analyses p. 66
  • 3.8 Other US/EU/international guidance documents that refer to missing data p. 66
  • 3.8.1 Committee for medicinal products for human use guideline on anti-cancer products, recommendations on survival analysis p. 66
  • 3.8.2 US guidance on considerations when research supported by office of human research protections is discontinued p. 67
  • 3.8.3 FDA guidance on data retention p. 67
  • 3.9 And in practice? p. 67
  • References p. 69
  • 4 A guide to planning for missing data p. 71 Michael O'Kelly and Bohdana Ratitch
  • 4.1 Introduction p. 72
  • 4.1.1 Missing data may bias trial results or make them more difficult to generalize to subjects outside the trial p. 72
  • 4.1.2 Credibility of trial results when there is missing data p. 74
  • 4.1.3 Demand for better practice with regard to missing data p. 74
  • 4.2 Planning for missing data p. 76
  • 4.2.1 The case report form and non-statistical sections of the protocol p. 76
  • 4.2.2 The statistical sections of the protocol and the statistical analysis plan p. 81
  • 4.2.3 Using historic data to narrow the choice of primary analysis and sensitivity analyses p. 88
  • 4.2.4 Key points in choosing an approach for missing data p. 108
  • 4.3 Exploring and presenting missingness p. 113
  • 4.4 Model checking p. 114
  • 4.5 Interpreting model results when there is missing data p. 116
  • 4.6 Sample size and missing data p. 117
  • Appendix 4.A Sample protocol/SAP text for study in Parkinson's disease p. 119
  • Appendix 4.B A formal definition of a sensitivity parameter p. 125
  • References p. 126
  • 5 Mixed models for repeated measures using categorical time effects (MMRM) p. 130 Sonia Davis
  • 5.1 Introduction p. 131
  • 5.2 Specifying the mixed model for repeated measures p. 132
  • 5.2.1 The mixed model p. 132
  • 5.2.2 Covariance structures p. 135
  • 5.2.3 Mixed model for repeated measures versus generalized estimating equations p. 139
  • 5.2.4 Mixed model for repeated measures versus last observation carried forward p. 140
  • 5.3 Understanding the data p. 141
  • 5.3.1 Parkinson's disease example p. 141
  • 5.3.2 A second example showing the usefulness of plots: The CATIE study p. 144
  • 5.4 Applying the mixed model for repeated measures p. 145
  • 5.4.1 Specifying the model p. 146
  • 5.4.2 Interpreting and presenting results p. 150
  • 5.5 Additional mixed model for repeated measures topics p. 162
  • 5.5.1 Treatment by subgroup and treatment by site interactions p. 162
  • 5.5.2 Calculating the effect size p. 164
  • 5.5.3 Another strategy to model baseline p. 166
  • 5.6 Logistic regression mixed model for repeated measures using the generalized linear mixed model p. 168
  • 5.6.1 The generalized linear mixed model p. 168
  • 5.6.2 Specifying the model p. 170
  • 5.6.3 Interpreting and presenting results p. 173
  • 5.6.4 Other modeling options p. 181
  • References p. 182
  • Table of SAS Code Fragments p. 183
  • 6 Multiple imputation p. 185 Bohdana Ratitch
  • 6.1 Introduction p. 185
  • 6.1.1 How is multiple imputation different from single imputation? p. 186
  • 6.1.2 How is multiple imputation different from maximum likelihood methods? p. 187
  • 6.1.3 Multiple imputation's assumptions about missingness mechanism p. 188
  • 6.1.4 A general three-step process for multiple imputation and inference p. 189
  • 6.1.5 Imputation versus analysis model p. 190
  • 6.1.6 Note on notation use p. 192
  • 6.2 Imputation phase p. 192
  • 6.2.1 Missing patterns: Monotone and non-monotone p. 192
  • 6.2.2 How do we get multiple imputations? p. 195
  • 6.2.3 Imputation strategies: Sequential univariate versus joint multivariate p. 197
  • 6.2.4 Overview of the imputation methods p. 199
  • 6.2.5 Reusing the multiply-imputed dataset for different analyses or summary scales p. 212
  • 6.3 Analysis phase: Analyzing multiple imputed datasets p. 213
  • 6.4 Pooling phase: Combining results from multiple datasets p. 216
  • 6.4.1 Combination rules p. 216
  • 6.4.2 Pooling analyses of continuous outcomes p. 219
  • 6.4.3 Pooling analyses of categorical outcomes p. 222
  • 6.5 Required number of imputations p. 227
  • 6.6 Some practical considerations p. 231
  • 6.6.1 Choosing an imputation model p. 231
  • 6.6.2 Multivariate normality p. 235
  • 6.6.3 Rounding and restricting the range for the imputed values p. 238
  • 6.6.4 Convergence of Markov chain Monte Carlo p. 240
  • 6.7 Pre-specifying details of analysis with multiple imputation p. 244
  • Appendix 6.A Additional methods for multiple imputation p. 245
  • References p. 251
  • Table of SAS Code Fragments p. 255
  • 7 Analyses under missing-not-at-random assumptions p. 257 Michael O'Kelly and Bohdana Ratitch
  • 7.1 Introduction p. 258
  • 7.2 Background to sensitivity analyses and pattern-mixture models p. 259
  • 7.2.1 The purpose of a sensitivity analysis p. 259
  • 7.2.2 Pattern-mixture models as sensitivity analyses p. 261
  • 7.3 Two methods of implementing sensitivity analyses via pattern-mixture models p. 264
  • 7.3.1 A sequential method of implementing pattern-mixture models with multiple imputation p. 264
  • 7.3.2 Providing stress-testing "what ifs" using pattern-mixture models p. 266
  • 7.3.3 Two implementations of pattern-mixture models for sensitivity analyses p. 267
  • 7.3.4 Characteristics and limitations of the sequential modeling method of implementing pattern-mixture models p. 268
  • 7.3.5 Pattern-mixture models implemented using the joint modeling method p. 271
  • 7.3.6 Characteristics of the joint modeling method of implementing pattern-mixture models p. 279
  • 7.3.7 Summary of differences between the joint modeling and sequential modeling methods p. 281
  • 7.4 A "toolkit": Implementing sensitivity analyses via SAS p. 284
  • 7.4.1 Reminder: General approach using multiple imputation with regression p. 284
  • 7.4.2 Sensitivity analyses assuming withdrawals have trajectory of control arm p. 288
  • 7.4.3 Sensitivity analyses assuming withdrawals have distribution of control arm p. 292
  • 7.4.4 Baseline-observation-carried-forward-like and last-observation-carried-forward-like analyses p. 297
  • 7.4.5 The general principle of using selected subsets of observed data as the basis to implement "what if" stress tests p. 306
  • 7.4.6 Using a mixture of "what ifs," depending on reason for discontinuation p. 306
  • 7.4.7 Assuming trajectory of withdrawals is worse by some δ: Delta adjustment and tipping point analysis p. 308
  • 7.5 Examples of realistic strategies and results for illustrative datasets of three indications p. 320
  • 7.5.1 Parkinson's disease p. 320
  • 7.5.2 Insomnia p. 323
  • 7.5.3 Mania p. 330
  • Appendix 7.A How one could implement the neighboring case missing value assumption using visit-by-visit multiple imputation p. 335
  • Appendix 7.B SAS code to model withdrawals from the experimental arm, using observed data from the control arm p. 336
  • Appendix 7.C SAS code to model early withdrawals from the experimental arm, using the last-observation-carried-forward-like values p. 342
  • Appendix 7.D SAS macro to impose delta adjustment on a responder variable in the mania dataset p. 345
  • Appendix 7.E SAS code to implement tipping point via exhaustive scenarios for withdrawals in the mania dataset p. 346
  • Appendix 7.F SAS code to perform sensitivity analyses for the Parkinson's disease dataset p. 348
  • Appendix 7.G SAS code to perform sensitivity analyses for the insomnia dataset p. 351
  • Appendix 7.H SAS code to perform sensitivity analyses for the mania dataset p. 356
  • Appendix 7.I Selection models p. 358
  • Appendix 7.J Shared parameter models p. 362
  • References p. 365
  • Table of SAS Code Fragments p. 368
  • 8 Doubly robust estimation p. 369 Belinda Hernández, Ilya Lipkovich and Michael O'Kelly
  • 8.1 Introduction p. 370
  • 8.2 Inverse probability weighted estimation p. 370
  • 8.2.1 Inverse probability weighting estimators for estimating equations p. 372
  • 8.2.2 Summary of inverse probability weighting advantages p. 373
  • 8.2.3 Inverse probability weighting disadvantages p. 373
  • 8.3 Doubly robust estimation p. 374
  • 8.3.1 Doubly robust methods explained p. 375
  • 8.3.2 Advantages of doubly robust methods p. 376
  • 8.3.3 Limitations of doubly robust methods p. 376
  • 8.4 Vansteelandt et al. method for doubly robust estimation p. 377
  • 8.4.1 Theoretical justification for the Vansteelandt et al. method p. 378
  • 8.4.2 Implementation of the Vansteelandt et al. method for doubly robust estimation p. 379
  • 8.5 Implementing the Vansteelandt et al. method via SAS p. 383
  • 8.5.1 Mania dataset p. 383
  • 8.5.2 Insomnia dataset p. 390
  • Appendix 8.A How to implement Vansteelandt et al. method for mania dataset (binary response) p. 392
  • Appendix 8.B SAS code to calculate estimates from the bootstrapped datasets p. 400
  • Appendix 8.C How to implement Vansteelandt et al. method for insomnia dataset p. 401
  • References p. 408
  • Table of SAS Code Fragments p. 408
  • Bibliography p. 409
  • Index p. 423
