FAQ/FVars - CBU statistics Wiki

Variance of a transformed mean

It is sometimes necessary to transform data before performing any analysis, for example to downweight the influence of outliers. Taking the reciprocal of reaction times is one transformation used for this purpose.

Suppose the data are transformed by F, and the transformed values have mean m' and variance $$s'^2$$ in a sample of size n. The back-transformed mean, $$F^{-1}(m')$$, then has the variance on the original scale given below, obtained using the delta method.

|| F(m) || $$F^{-1}(m')$$ || Variance of $$F^{-1}(m')$$ ||
|| $$\ln(m)$$ || $$e^{m'}$$ || $$e^{2m'} s'^{2} / n$$ ||
|| $$1/m$$ || $$1/m'$$ || $$s'^{2} / (m'^{4} n)$$ ||
|| $$\sqrt{m}$$ || $$m'^{2}$$ || $$(2 m' s')^{2} / n$$ ||
|| $$2 \arcsin \sqrt{m}$$ || $$\sin^{2}(m'/2)$$ || $$\cos^{2}(m'/2) \sin^{2}(m'/2) s'^{2} / n$$ ||
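Each entry in the variance column is consistent with the first-order delta method. As a sketch, write g for the back-transformation $$F^{-1}$$ (the symbol g is introduced here only for brevity) and assume s'2 is the variance of the transformed observations, so that the variance of the mean m' is s'2 / n. The reciprocal row is worked through as an example:

```latex
\operatorname{Var}\bigl(g(m')\bigr) \approx \bigl[g'(m')\bigr]^{2} \, \frac{s'^{2}}{n}

% Example: reciprocal transform, g(m') = 1/m'
g'(m') = -\frac{1}{m'^{2}}
\quad\Longrightarrow\quad
\operatorname{Var}\!\left(\frac{1}{m'}\right) \approx \frac{1}{m'^{4}} \cdot \frac{s'^{2}}{n} = \frac{s'^{2}}{m'^{4}\, n}
```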

Note that for the arcsine transform, packages such as SPSS perform the calculation in radians rather than degrees (the default on calculators).
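The four back-transformations and their delta-method variances can be sketched in Python as follows. The function name `back_transformed_variance` and its argument names are hypothetical, chosen for illustration. The sketch assumes that `s2_t` is the variance of the transformed observations, and that values for the arcsine transform are in radians (which is what Python's `math` functions use).

```python
import math


def back_transformed_variance(transform, m_t, s2_t, n):
    """Return (back-transformed mean, approximate variance on the original scale).

    transform: one of "ln", "reciprocal", "sqrt", "arcsine"
    m_t:  mean of the transformed values (m' in the text)
    s2_t: variance of the transformed values (s'^2 in the text)
    n:    sample size
    """
    if transform == "ln":
        estimate = math.exp(m_t)
        derivative = math.exp(m_t)
    elif transform == "reciprocal":
        estimate = 1.0 / m_t
        derivative = -1.0 / m_t ** 2
    elif transform == "sqrt":
        estimate = m_t ** 2
        derivative = 2.0 * m_t
    elif transform == "arcsine":
        # m_t is the mean of 2*asin(sqrt(p)) and must be in radians
        estimate = math.sin(m_t / 2.0) ** 2
        derivative = math.sin(m_t / 2.0) * math.cos(m_t / 2.0)
    else:
        raise ValueError(f"unknown transform: {transform!r}")

    # First-order delta method: (derivative of back-transform)^2 * Var(m')
    variance = derivative ** 2 * s2_t / n
    return estimate, variance


# Hypothetical numbers, for illustration only
estimate, variance = back_transformed_variance("reciprocal", m_t=2.0, s2_t=0.5, n=10)
print(estimate, variance)
```

A result computed in degrees would need converting with `math.radians` before being passed to the `"arcsine"` branch.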

Ordinarily, when using power transforms, we transform before taking the mean: for example, we take logs of the raw data and then average the logged values, rather than averaging the raw data first and logging the resulting mean (see the Exploratory Data Analysis Graduate Statistics talk here).
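The order of operations matters because the mean of the logs is generally not the log of the mean. A minimal sketch with made-up data (the values in `raw` are hypothetical):

```python
import math
from statistics import mean

raw = [1.0, 10.0, 100.0]  # hypothetical raw scores

# Usual approach: transform each value first, then average
mean_of_logs = mean(math.log(x) for x in raw)

# The other order: average the raw values first, then transform
log_of_mean = math.log(mean(raw))

# Back-transforming the mean of the logs gives the geometric mean
geometric_mean = math.exp(mean_of_logs)

print(mean_of_logs, log_of_mean, geometric_mean)
```

For positive data that are not all equal, the log of the mean is larger than the mean of the logs, so the two orders give different answers.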

Note that we usually use ln (log to the base e) which is preferred to log10 for interpretability - see here. If this link is broken the details are reproduced here.

As an example of the ease of interpreting the ln (natural log) function, suppose we use reading score to predict ln(writing score), and reading score is found to have a regression coefficient of 0.0066305. This indicates that for a ten-unit increase in reading score we expect to see about a 6.9% increase in writing score, since exp(0.0066305*10) = 1.0685526.
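The arithmetic in this example can be checked with a short sketch. The helper name `percent_change` is hypothetical, and the model is assumed to have ln(outcome) as the response:

```python
import math


def percent_change(coef, delta_x=1.0):
    """Percent change in the outcome for a delta_x increase in the predictor,
    when the response in the regression is ln(outcome)."""
    return 100.0 * (math.exp(coef * delta_x) - 1.0)


# Coefficient from the reading/writing example, ten-unit increase
change = percent_change(0.0066305, delta_x=10)
print(round(change, 2))
```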

A simpler example of interpretation of ln scores in a regression of ln(y) on x

Suppose we regress ln(y) on x and find x has a regression coefficient of 0.06. Then, ignoring the intercept (which cancels in the difference),

predicted ln(y2) = 0.06*(x+1) and predicted ln(y1) = 0.06*x, so the difference between predicted ln(y2) and ln(y1) equals 0.06.

It then follows that ln(y2) - ln(y1) = ln(y2/y1) = 0.06 and so y2/y1 = exp(0.06) = 1.06 (to two decimal places). We therefore have the useful result that a regression coefficient of 0.06 for x in a regression of ln(y) on x corresponds to about a 6% increase in y for a unit increase in x. This is not the case using log10, since 10^0.06 is approximately 1.15 rather than 1.06.
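The difference between the two bases can be checked numerically. In this sketch the same coefficient, 0.06, is back-transformed under each base; the relevant quantity for a log10 response is 10 raised to the coefficient.

```python
import math

b = 0.06  # regression coefficient from the example

ratio_ln = math.exp(b)   # ratio y2/y1 when the response is ln(y)
ratio_log10 = 10 ** b    # ratio y2/y1 when the response is log10(y)

print(round(ratio_ln, 4), round(ratio_log10, 4))
```

With ln, the ratio is close to 1 + b for small b, which is why the coefficient reads directly as an approximate proportional change; with log10 the same coefficient implies a noticeably larger ratio.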