How much IPD is enough?
Rationale
What?
“How large should my sample be?”
This is one of the most common questions a statistician is asked. Sample size and statistical power (the probability of correctly rejecting the null hypothesis when there truly is a difference in outcome between treatment and control groups) are commonplace topics of discussion when planning primary studies. IPD meta-analyses are often conducted to answer research questions that primary studies did not consider or were not powered to answer. However, sample size and power are often ignored when planning an IPD meta-analysis. This may be because researchers assume that combining many primary studies will automatically deliver sufficient power. While combining studies does increase power relative to any single trial, an IPD meta-analysis does not guarantee sufficient statistical power to answer the research question of interest.
IPD meta-analyses are commonly undertaken to examine whether a patient-level characteristic modifies a treatment effect, in order to identify subgroups of patients who may derive greater benefit (or harm) than others. Such stratified medicine is a major interest of clinical decision-makers and pharmaceutical companies seeking to identify those populations in whom a treatment is more effective (or less harmful). A single trial is usually underpowered for this purpose. Brookes et al. show that if a single trial has 80% power to detect a particular treatment effect (across all patients), then its power to detect an interaction (with a binary covariate) of the same magnitude as the overall treatment effect is only about 29%. To ensure 80% power to detect the interaction, the sample size of the single trial must be increased approximately four-fold. Furthermore, to have 80% power to detect an interaction half the size of the overall treatment effect requires an approximately 16-fold increase in sample size.
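These figures follow directly from a normal approximation: splitting each arm by a binary covariate (with equal subgroup sizes) doubles the standard error of the interaction estimate relative to the overall treatment effect, and required sample size scales with the square of the effect-to-SE ratio. A minimal sketch reproducing the quoted numbers, assuming a two-sided 5% significance level (this is the standard calculation, not Brookes et al.'s exact derivation):

```python
from scipy.stats import norm

z_alpha = norm.ppf(0.975)             # two-sided 5% significance level
z_main = z_alpha + norm.ppf(0.80)     # effect/SE ratio giving 80% power (about 2.80)

# With a binary covariate splitting each arm into two equal subgroups, the
# interaction estimate has twice the standard error of the overall effect,
# so an interaction of the same magnitude gives effect/SE = z_main / 2:
power_interaction = norm.cdf(z_main / 2 - z_alpha)
print(f"Power for interaction of same magnitude: {power_interaction:.2f}")  # ~0.29

# Required sample size scales with the square of the SE inflation:
# 2**2 = 4-fold for an interaction of the same magnitude, and
# (2 * 2)**2 = 16-fold for an interaction half that size.
```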
Why?
IPD meta-analyses are both time-consuming and expensive to perform, requiring significant resources to obtain, clean and harmonise the IPD from relevant trials before then synthesising them; a process that can take months or even years. Therefore, before embarking on an IPD project, researchers and funders should ensure that it is likely to be worth the effort. In particular, how many studies are likely to provide their IPD and, based on this, what is the potential power of the planned IPD meta-analysis? In our experience, power calculations and sample size justifications are rarely reported in IPD meta-analysis protocols or publications. Researchers are perhaps grateful for whatever IPD can be obtained, and appeal to the notion that any IPD meta-analysis adds value over a single trial. However, if it were known in advance that IPD from a particular set of studies would only increase power to 50%, then researchers and funders might think twice before undertaking the IPD project. Conversely, if a potential IPD meta-analysis would increase power to over 80%, then funders can be reassured that the IPD project is worth resourcing.
How?
Sample size determination is never easy, and formal power calculations for an IPD meta-analysis are particularly difficult because they depend on many factors, which perhaps explains why they are currently neglected. The IPD cannot be treated as if they came from a single trial: sample size calculations must account for the clustering of patients within trials and the potential heterogeneity between trials (e.g. in baseline risk and treatment effects). The power also depends on the choice and specification of the analysis model (e.g. covariates included, number of parameters, magnitude of effects) and on the parameter estimation method, amongst other factors.
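As a simple illustration of why between-trial heterogeneity matters for power, consider the textbook random-effects result (a general expression, not tied to any particular method discussed below): the variance of the summary effect is

\[
\operatorname{Var}\!\left(\hat{\theta}\right) = \left( \sum_{i=1}^{S} \frac{1}{s_i^{2} + \tau^{2}} \right)^{-1},
\]

where \(s_i^{2}\) is the within-trial variance of the effect estimate in trial \(i\) of \(S\) trials, and \(\tau^{2}\) is the between-trial heterogeneity. Larger \(\tau^{2}\) inflates this variance, and hence reduces power for any given total sample size.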
Several approaches have been proposed to calculate the power to detect a subgroup effect in an IPD meta-analysis. These fall into two general categories: analytical solutions and simulation-based solutions. Simmonds et al., Kovalchik et al. and Riley et al. have each proposed algebraic solutions to estimate the power of an IPD meta-analysis. Such algebraic solutions are computationally fast, but are only tractable when simplifying assumptions are made. For this reason, Kontopantelis et al. and Ensor et al. have also proposed simulation-based approaches, in which IPD meta-analysis datasets are simulated many times from a chosen data-generating mechanism (specifying, for example, the number of studies, effect sizes and heterogeneity), a chosen IPD meta-analysis model is applied to each dataset, and the results (e.g. estimates and confidence intervals) are summarised over the repeated analyses. In particular, the proportion of simulations that give a p-value < 0.05 for the effect of interest provides an estimate of the power.
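To make the simulation-based idea concrete, here is a minimal sketch in Python. It is not the implementation of Kontopantelis et al. or Ensor et al.: the continuous outcome, the one-stage model with trial-specific intercepts, and all input values (numbers of trials and patients, effect sizes, heterogeneity SD) are illustrative assumptions.

```python
# Simulation-based estimate of the power to detect a treatment-covariate
# interaction in a one-stage IPD meta-analysis (continuous outcome).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2024)

def simulate_ipd(n_trials=10, n_per_trial=200, interaction=0.25, tau=0.1, sigma=1.0):
    """Simulate one IPD meta-analysis dataset from the assumed mechanism."""
    frames = []
    for i in range(n_trials):
        treat = rng.integers(0, 2, n_per_trial)    # 1:1 randomisation
        covar = rng.integers(0, 2, n_per_trial)    # binary effect modifier
        baseline = rng.normal(0.0, 0.5)            # baseline risk varies by trial
        trt_effect = rng.normal(0.5, tau)          # heterogeneous treatment effect
        y = (baseline + trt_effect * treat + 0.3 * covar
             + interaction * treat * covar + rng.normal(0.0, sigma, n_per_trial))
        frames.append(pd.DataFrame({"trial": i, "treat": treat, "covar": covar, "y": y}))
    return pd.concat(frames, ignore_index=True)

def estimate_power(n_sims=200, alpha=0.05, **kwargs):
    """Proportion of simulated meta-analyses whose interaction p-value < alpha."""
    hits = 0
    for _ in range(n_sims):
        ipd = simulate_ipd(**kwargs)
        # One-stage model: trial-specific intercepts, treatment, covariate, interaction
        fit = smf.ols("y ~ C(trial) + treat + covar + treat:covar", data=ipd).fit()
        hits += fit.pvalues["treat:covar"] < alpha
    return hits / n_sims

print(f"Estimated power: {estimate_power():.2f}")
```

In practice, the data-generating mechanism and analysis model would be tailored to the planned IPD meta-analysis (e.g. binary or time-to-event outcomes, random rather than fixed treatment effects), and enough simulations would be run for the Monte Carlo error in the power estimate to be acceptably small.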