FAQ/mixtures - CBU statistics Wiki

Fitting finite Normal mixture models using SPSS and BMDP

Normal mixture models may be used to identify the locations and spreads of suspected multiple peaks in a distribution, for a number of hypothesised Normal distributions that is fixed a priori. The example below examines the possibility of data arising from a mixture of two Normal distributions.

The bimodal example data (consisting of one variable) are given in an SPSS data file here and clearly show two bumps when viewed as a histogram. The data may be saved from SPSS into a file called bimodal2.dat (excluding the variable name), which can then be read into the statistical package BMDP.

This facility is not available in most statistical packages, but it is supported by maximum likelihood routines in BMDP and STATA (Haughton 1997). Neither is currently available at CBSU; however, you can use SPSS. A BMDP run using the syntax below will fit two Normal distributions with the two mixing proportions fixed at 0.5. This syntax needs to be saved in a file, say mlm.bmdp.

 / input         file= 'bimodal2.dat'.
                 variables=1.
                 format=free.

 / variable      names = bdat.

 / estimate      parameters=4.

 / parameter     names=mu, sigmasq, mu2, sigmasq2.
                 initial = 2, 0.5, 7, 2.

 / density       f = 0.5*exp(-(bdat-mu)**2/(2*sigmasq))/
                     sqrt(6.2832*sigmasq) +
                     0.5*exp(-(bdat-mu2)**2/(2*sigmasq2))/
                     sqrt(6.2832*sigmasq2).

 / end
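
For reference, the density paragraph in the job above corresponds to the following equal-weight mixture density, written here as a sketch in conventional notation. The subscripted symbols are introduced only for illustration: x stands for bdat, mu and sigmasq map to the first component, mu2 and sigmasq2 to the second, and the constant 6.2832 approximates 2π.

```latex
f(x) = \frac{1}{2}\,\frac{1}{\sqrt{2\pi\sigma_1^2}}
       \exp\!\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right)
     + \frac{1}{2}\,\frac{1}{\sqrt{2\pi\sigma_2^2}}
       \exp\!\left(-\frac{(x-\mu_2)^2}{2\sigma_2^2}\right),
\qquad
\ell(\mu_1,\sigma_1^2,\mu_2,\sigma_2^2) = \sum_{i=1}^{n} \log f(x_i)
```

The four parameters are chosen to maximise the log-likelihood; the mixing proportions are not estimated in this job because both are written into the density as 0.5.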

To run the job, type the following command, which assumes the syntax is in the file mlm.bmdp. Output is sent to a newly created file called mlm.out.

bmdp le mlm.bmdp mlm.out

The file mlm.out contains the estimated means and variances of the two Normal distributions which "best" explain the data. In this example the best-fitting Normal distributions have means of 1.99 and 6.72, with respective variances of 0.24 and 4.73 (standard deviations of about 0.49 and 2.17).

The log-likelihood from the above fit may be compared with that obtained assuming just one peak, and the estimated density functions may be plotted to assess the fit graphically.
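
The comparison of one peak against two can be sketched in Python. This is only an illustration under stated assumptions: it uses simulated data rather than the bimodal2.dat file, it fits the equal-weight two-component density with a simple EM loop rather than BMDP's own maximum likelihood routine, and all function names are hypothetical. The starting values are those of the BMDP job (2, 0.5, 7, 2).

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a Normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_loglik(data, mu1, var1, mu2, var2):
    """Log-likelihood of a two-component Normal mixture with weights 0.5 and 0.5."""
    return sum(
        math.log(0.5 * normal_pdf(x, mu1, var1) + 0.5 * normal_pdf(x, mu2, var2))
        for x in data
    )


def single_normal_loglik(data):
    """Maximised log-likelihood of a single Normal (ML mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return sum(math.log(normal_pdf(x, mu, var)) for x in data)


def fit_two_normals(data, mu1, var1, mu2, var2, iterations=200):
    """EM iterations with both mixing proportions held fixed at 0.5."""
    for _ in range(iterations):
        # E-step: probability that each observation came from component 1.
        resp = []
        for x in data:
            p1 = normal_pdf(x, mu1, var1)
            p2 = normal_pdf(x, mu2, var2)
            resp.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted means and variances.
        w1 = sum(resp)
        w2 = len(data) - w1
        mu1 = sum(r * x for r, x in zip(resp, data)) / w1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / w2
        var1 = sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / w1
        var2 = sum((1 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / w2
    return mu1, var1, mu2, var2


# Simulated stand-in for the example data (NOT the values in bimodal2.dat).
random.seed(0)
data = [random.gauss(2.0, 0.5) for _ in range(200)]
data += [random.gauss(7.0, 1.5) for _ in range(200)]

# Starting values taken from the "initial" line of the BMDP job.
mu1, var1, mu2, var2 = fit_two_normals(data, 2.0, 0.5, 7.0, 2.0)
ll_two = mixture_loglik(data, mu1, var1, mu2, var2)
ll_one = single_normal_loglik(data)

print("component 1: mean", round(mu1, 2), "variance", round(var1, 2))
print("component 2: mean", round(mu2, 2), "variance", round(var2, 2))
print("log-likelihood, two components:", round(ll_two, 1))
print("log-likelihood, one component: ", round(ll_one, 1))
```

A clearly higher log-likelihood for the two-component fit than for the single Normal is consistent with two peaks, bearing in mind that the two-component model has two more free parameters. The fitted densities can then be overlaid on a histogram of the data to judge the fit by eye.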

References

BMDP Statistical Software Manual, Volume 2 (1992). BMDP Statistical Software Inc.

Haughton, D. (1997). Packages for Estimating Finite Mixtures: A Review. The American Statistician, 51, 194-205.