FAQ/semiprinR - CBU statistics Wiki

Please enter your password of your account at the remote wiki below.
/!\ You should trust both wikis because the password could be read by the particular administrators.

Clear message
location: FAQ / semiprinR

Computing the semi-partial coefficient for a predictor in a multiple regression

The semi-partial correlation for a predictor in a regression is the signed square root of the difference in R2 s between a model with and without that term provided the term of interest is not included in any higher order interactions.

For example consider a model using prestige score, gender and income to predict suicide rates. We fit all main effects and the prestige by gender and prestige by income interactions.

The semi-partial r for prestige by gender is the square root of R2 for the full model minus the R2 for the model minus the prestige by gender interaction. (Similarly for the income by gender interaction). The semi-partial r for income is the square root of R2 for the model containing all three main effects but no interactions minus the R2 for the model containing prestige and group only. (Similarly for the other main effects).

The R program below will compute the signed semi-partial correlation for terms in a multiple regressions containing any number of continuous predictors and any number of two-way interactions involving the factor and each continuous predictor. Thanks to Adam Wagner for the program. Semi-part (also called part) correlations may also be obtained in SPSS by running a linear regression and choosing 'Part and partial correlations' after clicking on the 'Statistics' button.

############################################################################################
# Function calculates r from partial r squared # Assumes:
# - stringr package loaded
# - There are no factors with more than two levels... and ALL two-way interactions involving one factor are fitted
# Not sure what is coded works in the presence of interactions of greater order than 2 

library(stringr)
calcR <- function(mod, 
                  check=T
# Check that labels line up correctly){
  # Stops the function if any factor has more than two levels
  if(any(sapply(mod$xlevels, length)>2)){
    stop("Some factor in the model has more than 2 levels; this function won't work properly")
  }
  # Get the r.squared of a model
  r2 <- function(modR2)summary(modR2)$r.squared
  # Extract the names of the terms in the model
  termLabels <- attr(terms(mod), "term.labels")
  termSigns <- ifelse(coef(mod)[2:length(coef(mod))]>0, 1, -1)
    
  # Refit the model length(termLabels) times, where each fit excludes
  #   one term (and any interactions associated with that term)
  modDroppedTerms <- lapply(termLabels, 
                            function(term){
                              termsToDrop <- termLabels[str_detect(termLabels, term)]                              
                              update(mod, paste(".~.", paste("-", termsToDrop, sep="", collapse="")))})
  
  # Find comparison models for looking at change in proportion variance explained
  #   Essentially, Where there are interactions involving the term being tested, drop them from the model
  compModels <- lapply(termLabels, 
                       function(term){
                         intsToDrop <- c(termLabels[str_detect(termLabels, paste(term, ":", sep=""))],
                                         termLabels[str_detect(termLabels, paste(":", term, sep=""))])
                         if(length(intsToDrop)==0) return(mod)
                         else return(update(mod, 
                                            paste(".~.", paste("-", intsToDrop, sep="", collapse=""))))
                       })
  
  # Find the R^2 associated with each term
  modDroppedR2 <- sapply(modDroppedTerms, r2)
  compModelR2  <- sapply(compModels, r2)
    
  # Calculate the R^2 associated with each term
  termsR        <- sqrt(compModelR2-modDroppedR2)
  names(termsR) <- termLabels
  
  # Determine if r is +ve or -ve
  # Note: I'm not sure the terms will always line up, so worth checking
  if(check){
    cat("Check the names line up\n")
    print(data.frame(termLabels, names(termSigns)))
  }
  
  termsR <- termsR*termSigns
  termsR