Combining p-values by Stouffer's (preferred) and Fisher's (legacy) methods

Combining p-values by Stouffer's method

The following MATLAB code may be used to perform Stouffer's method.

function pcomb = stouffer(p)
% Stouffer et al's (1949) unweighted method for combination of 
% independent p-values via z's 
    if isempty(p)
        error('stouffer was passed an empty array of p-values')
    else
        pcomb = (1-erf(sum(sqrt(2) * erfinv(1-2*p))/sqrt(2*length(p))))/2;
    end
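
For reference, the one-line expression in this function can be restated in terms of the standard normal distribution function $$\Phi$$. The symbols $$z_i$$ and $$Z$$ are notation introduced here and do not appear in the code:

$$z_i = \Phi^{-1}(1-p_i) = \sqrt{2}\,\mathrm{erfinv}(1-2p_i), \qquad Z = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} z_i, \qquad p_{comb} = 1-\Phi(Z) = \frac{1}{2}\left(1-\mathrm{erf}\left(\frac{Z}{\sqrt{2}}\right)\right)$$

Since $$Z/\sqrt{2} = \sum z_i / \sqrt{2n}$$, this matches the argument passed to erf in the code.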

The R code below performs Stouffer's method, assuming the p-values have been entered into a vector p, e.g. p <- c(0.1,0.2,0.01).

erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
erfinv <- function(x) qnorm((x + 1) / 2) / sqrt(2)
pcomb <- function(p) (1 - erf(sum(sqrt(2) * erfinv(1 - 2 * p)) / sqrt(2 * length(p)))) / 2
# length() returns 0 (not NA) for an empty vector
if (length(p) == 0) {
  res <- "There was an empty array of p-values"
} else {
  res <- pcomb(p)
}
print(res)
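
The same calculation can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name stouffer and the example p-values are illustrative, the p-values are assumed to be independent and one-sided, and each must lie strictly between 0 and 1 because the normal quantile is infinite at the endpoints.

```python
from math import sqrt
from statistics import NormalDist


def stouffer(pvals):
    """Unweighted Stouffer combination of independent one-sided p-values."""
    if len(pvals) == 0:
        raise ValueError("stouffer was passed an empty sequence of p-values")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert each p-value to a z-score; inv_cdf requires 0 < p < 1.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in pvals]
    # Under the null hypothesis the scaled sum is again standard normal.
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    # Upper-tail probability of the combined z-score.
    return 1.0 - std_normal.cdf(z_combined)


print(stouffer([0.1, 0.2, 0.01]))
```

A single p-value is returned unchanged (up to rounding error), which gives a quick sanity check.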

A spreadsheet can also be used to compute Fisher's and Stouffer's combined p.

Combining p-values by Fisher's method

The basic idea is that if $$p_i$$ ($$i=1, \ldots, n$$) are the one-sided $$p$$-values for $$n$$ independent statistics then, under the null hypothesis, $$-2 \sum_{i=1}^{n} \log(p_i)$$ follows a $$\chi^2$$ distribution with $$2n$$ degrees of freedom. A large value of this statistic indicates that the combined $$p$$-values are smaller than would be expected if they were Uniform(0,1) variates.

The following MATLAB code evaluates the p-value of this statistic.

function p = pfast(p)
% Fisher's (1925) method for combination of independent p-values
% Code adapted from Bailey and Gribskov (1998)
    product=prod(p);
    n=length(p);
    if n<=0
        error('pfast was passed an empty array of p-values')
    elseif n==1
        p = product;
        return
    elseif product == 0
        p = 0;
        return
    else
        x = -log(product);
        t=product;
        p=product;
        for i = 1:n-1
            t = t * x / i;
            p = p + t;
        end
    end  
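
The loop in this function relies on a closed form for the upper tail of a $$\chi^2$$ distribution with an even number of degrees of freedom. Writing $$x = -\log \prod_{i=1}^{n} p_i$$, as in the code, Fisher's statistic is $$2x$$ and its $$p$$-value is

$$P\left(\chi^2(2n) > 2x\right) = e^{-x} \sum_{i=0}^{n-1} \frac{x^i}{i!} = \left(\prod_{i=1}^{n} p_i\right) \sum_{i=0}^{n-1} \frac{x^i}{i!}$$

The variable t holds the current term $$e^{-x} x^i / i!$$ and p accumulates the sum, so no $$\chi^2$$ distribution function is needed.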

Let's try it out:

>> pvals=[0.1 0.01 0.01 0.7 0.3 0.1];
>> pfast(pvals)

ans =

    0.0021

That is, the combined $$p$$-value is 0.0021 for this array of six $$p$$-values.
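
As a cross-check, the calculation can be sketched in Python using only the standard library. This is a minimal sketch, not the code of Bailey and Gribskov: the function name fisher_combined is hypothetical, and, unlike pfast, it sums the logs of the p-values instead of forming their product first and returns Fisher's statistic along with the combined p-value.

```python
from math import exp, inf, log


def fisher_combined(pvals):
    """Return (Fisher's statistic, combined p-value) for independent p-values."""
    n = len(pvals)
    if n == 0:
        raise ValueError("fisher_combined was passed an empty sequence of p-values")
    if any(p == 0 for p in pvals):
        # log(0) is undefined; the statistic is infinite and the p-value is 0.
        return inf, 0.0
    x = -sum(log(p) for p in pvals)  # half of Fisher's statistic
    # Upper tail of chi-squared with 2n degrees of freedom:
    # exp(-x) * sum_{i=0}^{n-1} x**i / i!
    term = exp(-x)
    tail = term
    for i in range(1, n):
        term *= x / i
        tail += term
    return 2.0 * x, tail


statistic, p_value = fisher_combined([0.1, 0.01, 0.01, 0.7, 0.3, 0.1])
print(f"{statistic:.2f} {p_value:.4f}")  # prints "30.75 0.0021"
```

The statistic 30.75 is referred to a $$\chi^2$$ distribution with 12 degrees of freedom, and the resulting p-value agrees with the MATLAB output.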

Further investigations suggest that Fisher's method has inappropriate behaviour. [examples to be included]

This method may also be performed using R code.

van Assen, van Aert and Wicherts (2015) give a formula, based upon Fisher's method, for summing p-values from the studies in a meta-analysis and comparing this sum to a Gamma distribution to assess publication bias.

Manolov and Solanas (2012) suggest performing a binomial test to see if more statistically significant results (p < alpha) occur than would be expected assuming the null probability of a significant result is alpha (e.g. 0.05). The test is routinely available in most statistical packages.

For example, suppose, as in Manolov and Solanas (2012, p.505), that 3 out of 10 studies, or cases, have a p-value less than 0.05 for the same test statistic. We can input a single column (count) with 3 '1's and 7 '0's and run a binomial test on it with P(a significant result)=0.05. This probability is set by TESTVALUE in the SPSS syntax below, overriding the default of 0.5.

NPTESTS 
  /ONESAMPLE TEST (count) BINOMIAL(TESTVALUE=0.05 SUCCESSCATEGORICAL=FIRST SUCCESSCONTINUOUS=CUTPOINT(MIDPOINT)) 
  /MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE 
  /CRITERIA ALPHA=0.05 CILEVEL=95.

This gives p=0.012, agreeing with the value on page 505 of Manolov and Solanas (2012). That is, the probability of three or more of the 10 p-values falling below 0.05, when each does so with probability 0.05 under the null hypothesis, is 0.0115, which rounds to 0.012.
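
The reported value can be checked by hand as an upper-tail probability of a Binomial(10, 0.05) distribution. The following Python sketch is only an arithmetic check and makes no claim about how SPSS computes its result internally; the function name binomial_upper_tail is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, prob):
    """P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(
        comb(n, j) * prob**j * (1.0 - prob) ** (n - j)
        for j in range(k, n + 1)
    )


# 3 significant results out of 10, each with null probability 0.05
p_value = binomial_upper_tail(3, 10, 0.05)
print(round(p_value, 4))  # prints 0.0115
```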

References

Bailey TL and Gribskov M (1998). Combining evidence using p-values: application to sequence homology searches. Bioinformatics, 14(1) 48-54.

Fisher RA (1925). Statistical methods for research workers (13th edition). London: Oliver and Boyd.

Manolov R and Solanas A (2012). Assigning and combining probabilities in single-case studies. Psychological Methods 17(4) 495-509. Describes various methods for combining p-values including Stouffer and Fisher and the binomial test.

Stouffer SA, Suchman EA, DeVinney LC, Star SA and Williams RM Jr (1949). Studies in Social Psychology in World War II: The American Soldier. Vol. 1, Adjustment During Army Life. Princeton: Princeton University Press.

van Assen MALM, van Aert RCM and Wicherts JM (2015). Meta-analysis using effect size distributions of only statistically significant studies. Psychological Methods 20(3) 293-309.