Confidence intervals for proportions using SPSS syntax

*  ============================================================= .
*  File:        CI_for_proportion.SPS .
*  Date:         19-Nov-2012 .
*  Author:  Bruce Weaver, bweaver@lakeheadu.ca .
*  ============================================================= .

* Get confidence interval for a binomial proportion using:
   - Wald method
   - Adjusted Wald method (Agresti & Coull, 1998)
   - Wilson score method (identical to Ghosh's 1979 method)
   - Jeffreys method
.
* The data used here are from Table I in Newcombe (1998), Statistics
   in Medicine, Vol 17, 857-872.

DATA LIST LIST /x(f8.0) n(f8.0) confid(f5.3) .
BEGIN DATA.
81 263 .95
15 148 .95
0   20 .95
1   29 .95
81 263 .90
15 148 .90
0   20 .90
1   29 .90
81 263 .99
15 148 .99
0   20 .99
1   29 .99
16  48 .95
16  48 .99
END DATA.

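* Quantities shared by all four methods: alpha is the two-sided error
   rate, p the observed proportion, q = 1 - p, and z the standard normal
   quantile for the requested confidence level .
compute alpha = 1 - confid.
compute p = x/n.
compute q = 1-p.
compute z = probit(1-alpha/2).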
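
* Illustrative check, not part of the original file: z should come out
   near 1.645, 1.960 and 2.576 for confid = .90, .95 and .99
   IDF.NORMAL with mean 0 and SD 1 is assumed to give the same value;
   the variable name z_alt is hypothetical and can be deleted afterwards .
compute z_alt = idf.normal(1-alpha/2, 0, 1).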

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Wald method (i.e., the usual normal approximation).

compute #se = SQRT(p*q/n).
compute Lower1 = p - z*#se.
if Lower1 LT 0 Lower1 = 0.
compute Upper1 = p + z*#se.
if Upper1 GT 1 Upper1 = 1.
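
* Worked example added for illustration (hand arithmetic, rounded):
   for x = 81, n = 263, confid = .95 we get p = .3080,
   se = SQRT(.3080*.6920/263) = .02847 and z*se = 1.960*.02847 = .0558,
   so the Wald limits are about .2522 and .3638
   For x = 1, n = 29 the raw lower limit is .0345 - .0664 = -.0319, which
   the IF line above resets to 0; for x = 0 the standard error is 0, so
   the interval collapses to the single point 0 .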

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Adjusted Wald method due to Agresti & Coull (1998).

compute #p = (x + z**2/2) / (n + z**2).
compute #q = 1 - #p.
compute #se = SQRT(#p*#q/(n+z**2)).
compute Lower2 = #p - z*#se.
if Lower2 LT 0 Lower2 = 0.
compute Upper2 = #p + z*#se.
if Upper2 GT 1 Upper2 = 1.
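
* Worked example added for illustration (hand arithmetic, rounded):
   for x = 0, n = 20, confid = .95 we have z**2 = 3.8415, so
   #p = 1.9207/23.8415 = .0806 and #se is about .0557, giving
   z*#se = .1092
   The raw limits are therefore about -.029 and .190, and the lower
   limit is then reset to 0 by the IF line above .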

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Wilson score method (Method 3 in Newcombe, 1998) .
* Code adapted from Robert Newcombe's code posted here:
    http://archive.uwcm.ac.uk/uwcm/ms/Robert2.html .

* The method of Ghosh (1979), as described in Glass & Hopkins
   (1996, p 326), is identical to Wilson's method.
* Glass & Hopkins describe it as the "method of choice for all values
   of p and n" .

COMPUTE #x1 = 2*n*p+z**2 .
COMPUTE #x2 = z*(z**2+4*n*p*(1-p))**0.5 .
COMPUTE #x3 = 2*(n+z**2) .
COMPUTE Lower3 = (#x1 - #x2) / #x3 .
COMPUTE Upper3 = (#x1 + #x2) / #x3 .
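
* Notes added for illustration, not part of the original file:
   the Wilson limits always lie between 0 and 1, so no truncation lines
   are needed here, and their midpoint #x1/#x3 equals the adjusted Wald
   centre (x + z**2/2)/(n + z**2)
   Hand check for x = 0, n = 20, confid = .95: p = 0, so #x1 = #x2 = 3.8415
   and #x3 = 2*23.8415 = 47.6829, giving Lower3 = 0 and
   Upper3 = 7.6829/47.6829 = .1611 .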

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Jeffreys method shown on the IBM-SPSS website at
* http://www-01.ibm.com/support/docview.wss?uid=swg21474963 .

compute Lower4 = idf.beta(alpha/2,x+.5,n-x+.5).
compute Upper4 = idf.beta(1-alpha/2,x+.5,n-x+.5).

*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

* Format variables and list the results of all methods .

* Width 6 leaves room for a limit of exactly 1.0000 .
formats p q Lower1 to Upper4 (f6.4).
sort cases by p confid.

list var x n confid p Lower1 to Upper4 .

* Method 1:  Wald method (i.e., the usual normal approximation) .
* Method 2:  Adjusted Wald method (adding z**2/2 to x and z**2 to n,
   rather than the fixed constants 2 and 4) .
* Method 3:  Wilson score method (from Newcombe paper), identical to
   Ghosh (1979) .
* Method 4:  Jeffreys method
   (http://www-01.ibm.com/support/docview.wss?uid=swg21474963) .

* Data from Newcombe (1998), Table I.

variable labels
 x "Successes"
 n "Trials"
 p "p(Success)"
 confid "Confidence Level"
 Lower1 "Wald: Lower"
 Upper1 "Wald: Upper"
 Lower2 "Adj Wald: Lower"
 Upper2 "Adj Wald: Upper"
 Lower3 "Wilson score/Ghosh: Lower"
 Upper3 "Wilson score/Ghosh: Upper"
 Lower4 "Jeffreys: Lower"
 Upper4 "Jeffreys: Upper"
.

SUMMARIZE
  /TABLES=x n p confid Lower1 Upper1 Lower2 Upper2 Lower3 Upper3
          Lower4 Upper4
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Confidence Intervals for Binomial Proportions'
  /MISSING=VARIABLE
  /CELLS=NONE.

*  ============================================================= .