FAQ/cbsplines - CBU statistics Wiki

Interpolation using cubic splines to find the location of a changepoint

Splines are piecewise polynomials which are joined together at one or more changepoints (knots) to form a smooth curve across the range of the time interval in the data. The flexibility of the curve is specified using df in the R code below: the more wiggly the curve, the more df you will need to capture all its twists and turns. The spline terms are then used as predictors in a linear regression to produce estimates of the response and a fitted curve. Natural cubic splines are particularly popular and are often fitted in this way. This is done using the ns function from the splines package in R 2.10.

The ns function (an example of which is below) generates a B-spline basis matrix for a natural cubic spline. B-splines are splines that form basis vectors (hence the 'B') representing distinct aspects of the relationship between x and y. Linear combinations of these are used to predict y: the more basis vectors that are used, the more closely the fitted curve follows the observed y, and if as many coefficients are estimated as there are data points (with distinct x values), the observed y values are reproduced exactly.
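As a minimal sketch of the two points above, assuming the same seven-point x and y used in the example on this page, the following checks that the fitted values are a linear combination of the basis columns and that a fit with as many coefficients as data points reproduces y. The names fm_small, fm_full, X and by_hand are illustrative only.

```r
library(splines)

# Same toy data as the worked example on this page
x <- c(0, 1, 2, 3, 4, 5, 6)
y <- c(2, 3, 6, 12, 15, 13, 11)

# Basis matrix: one row per observation, one column per basis vector
B <- ns(x, df = 2)
dim(B)

# The fitted values are the design matrix (intercept plus basis columns)
# multiplied by the estimated regression coefficients
fm_small <- lm(y ~ ns(x, df = 2))
X <- model.matrix(fm_small)
by_hand <- as.vector(X %*% coef(fm_small))
all.equal(by_hand, as.vector(fitted(fm_small)))

# With df = 6 plus the intercept there are 7 coefficients for 7 data points,
# so the residuals should be zero up to rounding error
fm_full <- lm(y ~ ns(x, df = 6))
max(abs(residuals(fm_full)))
```

A fit that reproduces every observation leaves no residual degrees of freedom, so it is useful only as an illustration of the limiting case and not as a model for locating a changepoint.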

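library(stats)
library(graphics)
library(splines)

# data must be defined before they are plotted
x <- c(0,1,2,3,4,5,6)
y <- c(2,3,6,12,15,13,11)
plot(x,y)

# natural cubic spline basis with 2 df
ns(x, df=2)
# regress y on the spline basis
summary(fm1 <- lm(y ~ ns(x, df=2)) )
# add the fitted curve on a grid of 200 points between 1 and 6
ht <- seq(1,6, length.out=200)
lines(ht,predict(fm1, data.frame(x=ht)))

# predicted y at each of the 200 grid points
predict(fm1, data.frame(x=ht))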

Specifying a df of 2 allows just enough flexibility to capture the rise and levelling off of the curve. The fitted curve flattens out where the change between successive predictions (which approximates the first derivative of the fitted function) equals zero. The predict function above evaluates the fitted response, y, on a grid of equally spaced x values (in our case the 200 points between times 1 and 6 stored in ht). The change between points 143 and 144 is 0.0001, which suggests the levelling off occurs near ht[144] = 1 + 143 * (6-1)/199, i.e. at about x = 4.59.
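The grid search described above can also be done programmatically. The sketch below assumes the same data and the same 2 df model, and compares the grid-based answer with a direct search of the fitted curve using optimize from the stats package. The names pred, i_max, i_step, fitted_curve, x_grid and x_opt are illustrative only.

```r
library(splines)

x <- c(0, 1, 2, 3, 4, 5, 6)
y <- c(2, 3, 6, 12, 15, 13, 11)
fm1 <- lm(y ~ ns(x, df = 2))

# Predictions on the same 200-point grid between 1 and 6
ht <- seq(1, 6, length.out = 200)
pred <- as.numeric(predict(fm1, data.frame(x = ht)))

# Grid approach: the point with the largest predicted y
i_max <- which.max(pred)
x_grid <- ht[i_max]

# Successive differences approximate the first derivative;
# i_step is the step (from point i_step to i_step + 1) with the smallest change
i_step <- which.min(abs(diff(pred)))

# Continuous approach: search the fitted curve directly for its maximum
fitted_curve <- function(t) as.numeric(predict(fm1, data.frame(x = t)))
x_opt <- optimize(fitted_curve, interval = c(1, 6), maximum = TRUE)$maximum

c(grid = x_grid, optimize = x_opt)
```

The two estimates should agree to within about one grid step (5/199, roughly 0.025). The optimize result does not depend on how fine the grid is, which may be preferable when the location of the changepoint itself is the quantity of interest.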

None: FAQ/cbsplines (last edited 2013-03-08 10:17:23 by localhost)