Computing the semi-partial coefficient for a predictor in a multiple regression

The semi-partial correlation for a predictor in a regression is the signed square root of the difference in R²s between a model with and without that term provided the term of interest is not included in any higher order interactions.

For example consider a model using prestige score, gender and income to predict suicide rates. We fit all main effects and the prestige by gender and prestige by income interactions.

The semi-partial r for prestige by gender is the square root of R² for the full model minus the R² for the model minus the prestige by gender interaction. (Similarly for the income by gender interaction). The semi-partial r for income is the square root of R² for the model containing all three main effects but no interactions minus the R² for the model containing prestige and group only. (Similarly for the other main effects).

The R program below will compute the signed semi-partial correlation for terms in a multiple regressions containing any number of continuous predictors and any number of two-way interactions involving the factor and each continuous predictor. Thanks to Adam Wagner for the program. Semi-part (also called part) correlations may also be obtained in SPSS by running a linear regression and choosing 'Part and partial correlations' after clicking on the 'Statistics' button.

############################################################################################
# Function calculates r from partial r squared # Assumes:
# - stringr package loaded
# - There are no factors with more than two levels... and ALL two-way interactions involving one factor are fitted
# Not sure what is coded works in the presence of interactions of greater order than 2 

library(stringr)
calcR <- function(mod, 
                  check=T
# Check that labels line up correctly){
  # Stops the function if any factor has more than two levels
  if(any(sapply(mod$xlevels, length)>2)){
    stop("Some factor in the model has more than 2 levels; this function won't work properly")
  }
  # Get the r.squared of a model
  r2 <- function(modR2)summary(modR2)$r.squared
  # Extract the names of the terms in the model
  termLabels <- attr(terms(mod), "term.labels")
  termSigns <- ifelse(coef(mod)[2:length(coef(mod))]>0, 1, -1)
    
  # Refit the model length(termLabels) times, where each fit excludes
  #   one term (and any interactions associated with that term)
  modDroppedTerms <- lapply(termLabels, 
                            function(term){
                              termsToDrop <- termLabels[str_detect(termLabels, term)]                              
                              update(mod, paste(".~.", paste("-", termsToDrop, sep="", collapse="")))})
  
  # Find comparison models for looking at change in proportion variance explained
  #   Essentially, Where there are interactions involving the term being tested, drop them from the model
  compModels <- lapply(termLabels, 
                       function(term){
                         intsToDrop <- c(termLabels[str_detect(termLabels, paste(term, ":", sep=""))],
                                         termLabels[str_detect(termLabels, paste(":", term, sep=""))])
                         if(length(intsToDrop)==0) return(mod)
                         else return(update(mod, 
                                            paste(".~.", paste("-", intsToDrop, sep="", collapse=""))))
                       })
  
  # Find the R^2 associated with each term
  modDroppedR2 <- sapply(modDroppedTerms, r2)
  compModelR2  <- sapply(compModels, r2)
    
  # Calculate the R^2 associated with each term
  termsR        <- sqrt(compModelR2-modDroppedR2)
  names(termsR) <- termLabels
  
  # Determine if r is +ve or -ve
  # Note: I'm not sure the terms will always line up, so worth checking
  if(check){
    cat("Check the names line up\n")
    print(data.frame(termLabels, names(termSigns)))
  }
  
  termsR <- termsR*termSigns
  termsR

MRC CBU Wiki

Quick Links

Search Wiki

Page Tools

Computing the semi-partial coefficient for a predictor in a multiple regression