Meta-analysis

  1. Introduction:

Meta-analysis is a systematic method for synthesizing the quantitative results of different empirical studies of the effect of an independent variable. Here we have data from 7 studies of corticosteroid therapy in premature labour and its effect on neonatal death, and we apply Bayesian meta-analysis to determine whether the treatment had a significant effect.

  • Assumptions:

The effect we want to measure is the overall mean difference between the treatment and control groups. The data are already well defined in terms of PICO (Participants, Interventions, Comparisons, Outcomes), which is the first step in a meta-analysis.

The next step in a meta-analysis would be to search bibliographic databases and screen each study for inclusion. We assume this step has already been completed and use all 7 studies.

We assume that all measures are on the same scale, so the effect size measures across these studies are comparable. There are three possible assumptions about the data: I) no heterogeneity, II) complete heterogeneity, and III) exchangeability. We adopt the exchangeable model, as it is a compromise between complete homogeneity and complete heterogeneity, and we assume there are no important covariates that might form the basis of a more complex model.

There are two modelling approaches we could use: a fixed effects model or a random effects model. Since the studies were carried out by different investigators, we cannot assume that they all draw from the same population or use the same method of intervention, and we do not know what procedure was followed in each study. We therefore assume that differences in samples, design, and conduct introduce more statistical heterogeneity between studies than can reasonably be attributed to random error within studies. In that case it is more reasonable to use a random effects model, which treats the studies as a sample from all possible studies and includes an additional variance component for the variation, or heterogeneity, between studies.
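For comparison only (the metafor package is not used in this report), the fixed effect versus random effects distinction can be illustrated with a frequentist fit applied to the log odds ratios Y and their standard errors V defined in the appendix:

library(metafor)

# Y = study log odds ratios, V = their standard errors (see appendix)
fe <- rma(yi = Y, sei = V, method = "FE")    # fixed (common) effect model
re <- rma(yi = Y, sei = V, method = "REML")  # random effects model
summary(fe)
summary(re)                                  # also reports tau^2, the between-study variance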

The next step is to consider possible biases in the studies, such as selection bias and publication bias.

Fig 1. Funnel plot

The funnel plot of the data suggests possible publication bias (publication bias is one kind of reporting bias).
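A funnel plot like Fig 1 can be drawn from the Y (log odds ratios) and V (standard errors) defined in the appendix. A minimal base-R sketch, plotting each study's estimate against its precision, with a reference line at the inverse-variance-weighted mean:

# Funnel plot sketch: study log odds ratios (Y) against precision (1/V)
plot(Y, 1 / V, xlab = "log odds ratio", ylab = "precision (1/SE)", pch = 19)
abline(v = sum(Y / V^2) / sum(1 / V^2), lty = 2)   # inverse-variance-weighted mean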

  • Methods:
  • The first level of the hierarchical model assumes Yj ~ N(θj, σj²).
  • For the exchangeability model we specify a second level, θj ~ N(μ, τ²), so the θj's are drawn from a normal distribution with hyperparameters μ and τ², the variance of the population-level effects. Each θj is therefore a compromise between the data (Yj) and the prior, as illustrated in the sketch after the formula below.
  • The priors are μ ~ N(0, 10^6), which translates to 95% certainty that the true value of μ lies between about -1.96 × 1000 and +1.96 × 1000 (1000 = sqrt(10^6)),
  • and τ ~ U(0, 10), a uniform prior on the between-study standard deviation, which cannot be negative.

Writing aj, bj for the deaths and survivors in one arm of study j and cj, dj for those in the other arm, Yj and σj² are the study's log odds ratio and its approximate sampling variance,

Yj = log( (aj · dj) / (bj · cj) ),      σj² = 1/aj + 1/bj + 1/cj + 1/dj,

as computed from the 2×2 tables in the appendix code.
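To make that compromise concrete: conditional on μ and τ², the posterior mean of each θj is a precision-weighted average of the study estimate Yj and the overall mean μ (the standard normal-normal result, Gelman et al., Chapter 5). A minimal R sketch with made-up input values:

shrink <- function(Y, sigma2, mu, tau2) {
  # Conditional posterior mean of theta_j: a precision-weighted average
  # of the study estimate Y_j and the overall mean mu
  w <- (1 / sigma2) / (1 / sigma2 + 1 / tau2)   # weight given to the data
  w * Y + (1 - w) * mu
}

shrink(Y = 1.2, sigma2 = 0.5, mu = 0.8, tau2 = 0.3)   # hypothetical values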

To understand τ² (the between-study heterogeneity) and σj² (the within-study variance), let us look at the forest plot.

Fig 2. Forest plot.

We can see the point estimate of each study and the weight given to each study in the overall estimate. The diamond at the bottom is the overall (combined) effect; it is heavily influenced by the Auckland study.

The x-axis gives the effect size scale, plotted at the top of the figure. Each row, except the bottom one, shows a study's effect size estimate as a point with a 95% confidence interval. The confidence interval of the combined effect size in Fig 2 does not include zero, i.e., at a confidence level of 95% the p-value is smaller than .05; in traditional terminology, this means that the meta-analytic effect is statistically significant.
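A basic version of such a forest plot can be drawn directly from the Y (log odds ratios) and V (standard errors) defined in the appendix. A minimal base-R sketch, assuming the study labels are stored in cochrane$name:

k  <- length(Y)                     # number of studies
lo <- Y - 1.96 * V                  # lower 95% confidence limits
hi <- Y + 1.96 * V                  # upper 95% confidence limits
plot(Y, k:1, xlim = range(lo, hi), pch = 15,
     yaxt = "n", xlab = "log odds ratio", ylab = "")
segments(lo, k:1, hi, k:1)          # confidence intervals
axis(2, at = k:1, labels = cochrane$name, las = 1)
abline(v = 0, lty = 2)              # line of no effect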

  • Computation:

The model was fitted in R using the rjags package; the JAGS model code is given in the appendix. The model was run with the following specifications: burn-in = 10000, 2 chains, 10000 iterations per chain, and no thinning. The model converged well, as judged from the trace plots (see Fig 3).
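Trace and density plots such as Fig 3 can be produced directly from the coda samples object returned by coda.samples in the appendix:

library(coda)

plot(samples)            # trace and density plots for every monitored parameter
effectiveSize(samples)   # effective sample sizes as an additional convergence check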

The second part of the analysis is a simulation study to evaluate the frequentist Type I error of the method (code in the appendix). For each iteration the same configuration is used for the JAGS model (burn-in = 10000, 2 chains, 10000 iterations). A variable point_est flags, for each simulated dataset, whether the interval estimate for the overall effect covers zero, and from these flags we compute the proportion of datasets in which a difference between the treatment and control groups would be declared.

Fig 3. Diagnostic plot for variable OR.

We also look at the DIC to assess how well the model fits; lower DIC values indicate a better fit. The DIC shows a mean deviance of 14.77 with a penalty of 2.932, which suggests a good fit. Gelman and Rubin's convergence diagnostic was used to check convergence, and all values were below 1.2.
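Both diagnostics can be computed with standard rjags/coda functions applied to the model and samples objects from the appendix:

gelman.diag(samples)                 # Gelman-Rubin potential scale reduction factors
dic.samples(model, n.iter = 10000)   # reports mean deviance and penalty (pD)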

  • Result:

Below are the results of the JAGS model. Although there are several meta-analytic outcomes we could look at, for this analysis we focus on the point estimates. The parameters d and delta[j] are on the log odds (logit) scale. OR is the odds ratio, obtained by exponentiating d; since prob.OR1 is about 0.99, the 95% credible interval for d excludes 0 (equivalently, the interval for OR lies above 1), indicating a treatment effect.

Variable Name   Mean     SD       Naive SE    Time-series SE
OR              2.4912   1.2324   0.0087147   0.024413
d               0.8289   0.3944   0.002789    0.008488
delta[1]        0.6862   0.3697   0.0026141   0.007632
delta[2]        1.0146   0.6033   0.0042661   0.012662
delta[3]        0.9713   0.5153   0.003644    0.010262
delta[4]        0.6393   0.4356   0.0030805   0.00825
delta[5]        0.8779   0.5153   0.0036439   0.008662
delta[6]        1.0523   0.6109   0.0043197   0.013568
delta[7]        0.5495   0.4963   0.0035094   0.01056
prob.OR1        0.986    0.1175   0.0008308   0.001212
tau             0.5297   0.4567   0.0032293   0.017661

Table 1. Results of the JAGS model.

We can also look at the parameter prob.OR1, which gives the probability that the synthesized estimate lies above (or below) some value, in this case the probability that OR > 1. Here it is close to 1 (0.986).
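The same summaries can be read directly from the coda samples; a short sketch:

post <- as.matrix(samples)                    # pool the draws from both chains
mean(post[, "prob.OR1"])                      # posterior probability that OR > 1
quantile(post[, "OR"], c(0.025, 0.5, 0.975))  # median and 95% credible interval for OR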

The second part of the results concerns the simulation study. The number of iterations in which a difference between the treatment groups is declared is counted and converted to a proportion. The value obtained is 0.059, which is the estimated Type I error of this method. This lies close to the nominal 5% level, and it is up to the researcher to decide whether it is acceptable.
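Because the appendix code sets point_est[sim] to 1 when the interval for the overall effect covers zero (i.e., no difference is declared), the rejection proportion can be computed as its complement; a one-line sketch, assuming this is how the 0.059 was obtained:

type1 <- 1 - mean(point_est)   # proportion of null data sets in which a difference is declared
type1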

  • References:
  • Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. Bayesian Data Analysis (third edition), Chapter 5, Section 6: Hierarchical modeling applied to a meta-analysis.
  • Harrer, M., Cuijpers, P., & Ebert, D. D. (2019). Doing Meta-Analysis in R: A Hands-on Guide. https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/

Appendix

JAGS Code:

library(rjags)
library(rmeta)             # provides the cochrane corticosteroid data set

data(cochrane)
Nstud <- nrow(cochrane)    # number of studies

# Study log odds ratios and their standard errors from the 2x2 tables
Y <- log(cochrane[,4]/(cochrane[,5]-cochrane[,4])) - log(cochrane[,2]/(cochrane[,3]-cochrane[,2]))
V <- sqrt((1/cochrane[,4]) + (1/(cochrane[,5]-cochrane[,4])) + (1/cochrane[,2]) + (1/(cochrane[,3]-cochrane[,2])))

model_string <- textConnection("model {
  # Likelihood
  for (j in 1:n) {
    P[j] <- 1/V[j]              # Calculate precision
    Y[j] ~ dnorm(delta[j], P[j])
    delta[j] ~ dnorm(d, prec)
  }

  # Priors
  d ~ dnorm(0, 1.0E-6)
  prec <- 1/tau.sq
  tau.sq <- tau*tau             # tau.sq = between-study variance
  tau ~ dunif(0, 10)            # Uniform on SD
  OR <- exp(d)                  # exponentiate to get back to OR scale
  prob.OR1 <- step(d)           # calc prob of OR > 1
}")

model <- jags.model(model_string, data = list(Y = Y, V = V, n = Nstud),
                    n.chains = 2, quiet = TRUE)
update(model, 10000, progress.bar = "none")

params  <- c("d", "tau", "OR", "prob.OR1", "delta")
samples <- coda.samples(model, variable.names = params,
                        n.iter = 10000, progress.bar = "none")

Simulation Study:

library(rjags)

nsims <- 1000                        # number of simulated data sets
m         <- rep(0, nsims)           # posterior mean of d
s_d       <- rep(0, nsims)           # posterior SD of d
t         <- rep(0, nsims)           # posterior mean of tau
point_est <- rep(0, nsims)           # 1 if the interval for d covers 0

Y <- matrix(0, nrow = nsims, ncol = 7)
V <- matrix(0, nrow = nsims, ncol = 7)

for (sim in 1:nsims) {

  # Simulate event counts under the null: both arms have event probability 0.1
  samp1 <- rep(0, 7)
  samp0 <- rep(0, 7)
  for (i in 1:7) {
    samp1[i] <- rbinom(1, cochrane[i,5], 0.1) + 0.5   # 0.5 continuity correction
    samp0[i] <- rbinom(1, cochrane[i,3], 0.1) + 0.5
  }

  Y[sim,] <- log(samp1/(cochrane[,5]-samp1)) - log(samp0/(cochrane[,3]-samp0))
  V[sim,] <- sqrt((1/samp1) + (1/(cochrane[,5]-samp1)) + (1/samp0) + (1/(cochrane[,3]-samp0)))

  model_string <- textConnection("model {
    # Likelihood
    for (j in 1:n) {
      P[j] <- 1/V[j]          # Calculate precision
      Y[j] ~ dnorm(delta[j], P[j])
      delta[j] ~ dnorm(d, prec)
    }

    # Priors
    d ~ dnorm(0, 1.0E-6)
    prec <- 1/tau.sq
    tau.sq <- tau*tau         # tau.sq = between-study variance
    tau ~ dunif(0, 10)        # Uniform on SD
    OR <- exp(d)              # exponentiate to get back to OR scale
    prob.OR1 <- step(d)       # calc prob of OR > 1
  }")

  model <- jags.model(model_string, data = list(Y = Y[sim,], V = V[sim,], n = 7),
                      n.chains = 2, quiet = TRUE)
  update(model, 10000, progress.bar = "none")

  params  <- c("d", "tau", "OR", "prob.OR1")
  samples <- coda.samples(model, variable.names = params,
                          n.iter = 10000, progress.bar = "none")

  m[sim]   <- summary(samples)[[1]][2, ][1]   # posterior mean of d
  s_d[sim] <- summary(samples)[[1]][2, ][2]   # posterior SD of d
  t[sim]   <- summary(samples)[[1]][4, ][1]   # posterior mean of tau

  if (m[sim] - s_d[sim] < 0 && m[sim] + s_d[sim] > 0) {
    # Interval covers 0: no difference declared for this simulated data set
    point_est[sim] <- 1
  }
}
