Diff for "FAQ/ssq" - CBU statistics Wiki
Differences between revisions 10 and 32 (spanning 22 versions)
Revision 10 as of 2012-05-31 11:01:33
Size: 2085
Editor: PeterWatson
Comment:
Revision 32 as of 2013-03-08 10:17:14
Size: 2835
Editor: localhost
Comment: converted to 1.6 markup

What does 's' denote in describing a General Linear Model (GLM) and a note on Generalized Linear Models in SPSS?

The GLM terminology is described here (http://www.fil.ion.ucl.ac.uk/~mgray/) in relation to using SPM.

Examples of GLMs include linear regression and analysis of variance. They take the form:

Y = XB + error
or, in words,
Response = Prediction + residual

s is, therefore, the residual standard deviation. In an analysis of variance, for example, it corresponds to the square root of the mean square error term.
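As a minimal sketch of this definition, the Python below fits a straight line by least squares and returns s as the square root of the mean square error. The function name residual_sd and the data values are invented for illustration, and the divisor n - 2 assumes a simple regression with two estimated coefficients (intercept and slope).

```python
import math

def residual_sd(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope, s)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Response = Prediction + residual, so residual = Response - Prediction
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in residuals)   # residual sum of squares
    mse = rss / (n - 2)                    # mean square error (2 coefficients estimated)
    return intercept, slope, math.sqrt(mse)

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, s = residual_sd(x, y)
print("intercept:", intercept, "slope:", slope, "s:", s)
```

With more predictors the divisor becomes n minus the number of estimated regression coefficients.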

Use of s in the Generalized Linear Models procedure in SPSS

Using the Mean Square Error (the square of s) is equivalent to choosing the 'Deviance' Scale Parameter Method with the default linear model (for a continuous response) under the 'Estimation' tab of the Generalized Linear Models procedure (under 'Analyze' in SPSS). The default Scale Parameter Method in this procedure is actually 'Maximum likelihood estimate', which yields a statistic whose critical value follows a chi-square distribution. This statistic is not usually used directly for regression/ANOVA models with continuous responses but is incorporated as the denominator of an F ratio. It is, however, used directly for categorical responses, which is why we tend to quote chi-square values, rather than F values, when assessing the influence of predictor variables on group responses. The 'Scale Parameter' represents the formula used for assessing the error sum of squares for the model and can take various forms, all of which SPSS outputs by default in the 'Goodness of Fit' box.

Some of the terms used by SPSS are explained further here. They all relate to defining lack of fit and are then used in constructing the standard errors of the regression estimates:

Deviance = Pearson Chi-square = $$\sum_{i} (\mbox{i-th residual})^{2}$$ which is usually used in constructing F statistics for a continuous outcome and is equivalent to the Residual Sum of Squares (RSS).

Scaled Pearson = $$\sum_{i} \frac{(\mbox{i-th residual})^{2}}{\mbox{i-th predicted value}}$$ which is usually quoted for a group outcome.

The 'Maximum likelihood estimate' option uses the Log-likelihood Chi-square (given in the 'Omnibus Test' box) = 2 x (difference in log-likelihoods with and without predictors). For a continuous outcome the log-likelihood equals $$-\frac{n}{2} \ln\left(\frac{2 \pi \mbox{RSS}}{n}\right) - \frac{n}{2}$$ where n is the total sample size, RSS is the Residual Sum of Squares defined above and $$\pi \approx 3.14159$$.
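The formula above can be sketched in Python as follows. The function names and the two RSS values are hypothetical, chosen only to illustrate the arithmetic: rss_null stands for the RSS of a model without predictors (intercept only) and rss_full for the RSS of the model with predictors. This is not a reproduction of SPSS's internal computation.

```python
import math

def gaussian_loglik(rss, n):
    """Log-likelihood for a continuous outcome: -n/2 ln(2 pi RSS / n) - n/2."""
    return -n / 2 * math.log(2 * math.pi * rss / n) - n / 2

def loglik_chi_square(rss_null, rss_full, n):
    """2 x (log-likelihood with predictors - log-likelihood without predictors)."""
    return 2 * (gaussian_loglik(rss_full, n) - gaussian_loglik(rss_null, n))

# Hypothetical values, for illustration only
n = 5
rss_null = 39.708   # RSS without predictors
rss_full = 0.107    # RSS with predictors
chi_square = loglik_chi_square(rss_null, rss_full, n)
print("log-likelihood chi-square:", chi_square)
```

Because the remaining terms cancel in the difference, the statistic reduces algebraically to n ln(RSS without predictors / RSS with predictors).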

Although the log-likelihood chi-square is thus defined for a continuous response, it is usually quoted only for a group outcome. For a continuous response it can still be incorporated and quoted as a goodness-of-fit measure in the form of information criteria (see the Correlation and Regression Grad Talk).
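The page does not say which information criteria are meant. As an assumption, the sketch below uses the two most common ones, AIC = 2k - 2 x log-likelihood and BIC = k ln(n) - 2 x log-likelihood, where k is the number of estimated parameters. Whether k counts the error variance differs between packages, so the count here is only an illustrative choice, as are the function names and input values.

```python
import math

def gaussian_loglik(rss, n):
    """Log-likelihood for a continuous outcome: -n/2 ln(2 pi RSS / n) - n/2."""
    return -n / 2 * math.log(2 * math.pi * rss / n) - n / 2

def aic(loglik, k):
    """Akaike information criterion; smaller values indicate better fit."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion; penalises extra parameters more as n grows."""
    return k * math.log(n) - 2 * loglik

# Hypothetical values: RSS from a fitted model, sample size, and parameter count
n = 20
rss = 12.5
k = 3   # assumed: intercept, slope and error variance
ll = gaussian_loglik(rss, n)
print("AIC:", aic(ll, k), "BIC:", bic(ll, k, n))
```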

None: FAQ/ssq (last edited 2013-03-08 10:17:14 by localhost)