Obtaining frequencies using string variables in SPSS
Suppose we have a list of messages that can be categorised by type and by the rate at which they are sent, with rates abbreviated to D (per day), W (per week), M (per month) and Y (per year), and we wish to tabulate the weekly frequency of each message type. There are also one-off messages.
Type | Rate
Appointment | 3-W
appointment | 2-D
chore | 1Y
Medication | 2-M
Notice that type names start with either capital or lower case letters (which SPSS would regard as separate categories) and that rates are written both with and without hyphens.
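To try the syntax that follows on the four example rows, the table can be read in as inline data. This is only a sketch: the rate column is called FREQUENCY because that is the name the later syntax uses, and the A20 widths are an assumption. The width matters, because search strings such as "D " rely on the trailing blanks that pad a string value shorter than its variable's width.

```spss
* Hypothetical test data mirroring the example table.
* The A20 widths are assumed, not taken from the original file.
DATA LIST LIST /TYPE (A20) FREQUENCY (A20).
BEGIN DATA
Appointment 3-W
appointment 2-D
chore 1Y
Medication 2-M
END DATA.
LIST.
```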
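The first thing we need to do is make sure that types with the same spelling are all treated as the same group regardless of case. This can be done using the STRING command together with the SUBSTR and UPCASE functions, which here store the first five characters of each type, in upper case, in a new variable TYPE2.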
STRING TYPE2 (A10).
COMPUTE TYPE2 = UPCASE(SUBSTR(TYPE,1,5)).
EXE.
We then need to create a single column representing the number of times a particular message was sent in a week. The first step is to separate the leading number from each rate (held in the variable FREQUENCY in the syntax below) and save it as a numeric variable.
STRING COUNT (A1).
COMPUTE COUNT = SUBSTR(FREQUENCY,1,1).
EXE.
RECODE COUNT (CONVERT) INTO COUNTN.
EXE.
The number of times each message is sent per week can then be computed by looking for a particular character string in the rate and multiplying the number we have just obtained by a factor that depends on this string. For example, a message with the rate '4-D' is sent 4 x 7 = 28 times per week.
This is achieved using the DO IF command to create the number of times each message is sent per week (NPW). I also use the RND function to round NPW to the nearest integer. Without this, SPSS can appear to add column totals incorrectly, or produce subgroups whose frequencies do not sum to the total group frequency. For example, the number of messages recalled by people using a mobile prompter plus the number recalled by people using a pager may not equal the total number of messages, even though every message was recorded via either pager or mobile. This anomaly is a rounding effect: SPSS reports the frequency for each message type and the total frequency only to the nearest integer, so two subgroup counts of 1.5 would each be shown as 2 while their combined total is shown as 3.
DO IF INDEX(UPCASE(FREQUENCY),"D ")>0.
COMPUTE NPW = RND(COUNTN*7).
COMPUTE DAILY = 1.
ELSE IF INDEX(UPCASE(FREQUENCY),"W ")>0.
COMPUTE NPW = RND(COUNTN).
COMPUTE DAILY = 0.
ELSE IF INDEX(UPCASE(FREQUENCY),"-Y ")>0.
COMPUTE NPW = RND(COUNTN/52).
COMPUTE DAILY = 0.
ELSE IF INDEX(UPCASE(FREQUENCY),"YEAR ")>0.
COMPUTE NPW = RND(COUNTN/52).
COMPUTE DAILY = 0.
ELSE IF INDEX(UPCASE(FREQUENCY),"M ")>0.
COMPUTE NPW = RND(COUNTN/4).
COMPUTE DAILY = 0.
ELSE IF INDEX(UPCASE(FREQUENCY),"OFF")>0.
COMPUTE NPW = RND(COUNTN/52).
COMPUTE DAILY = 0.
END IF.
EXE.
There may be a few cases where a rate has been entered in a non-standard way (such as 2-D Mon-Thurs), and the weekly counts for these will need to be entered manually. They can be identified using the SORT CASES command: sorting by the number per week (NPW) variable places the cases with non-standard rates at the top of the file.
SORT CASES BY NPW.
EXE.
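As a complement to scrolling through the sorted file, the cases for which none of the DO IF branches matched can be listed directly, since NPW is left system-missing for them. This is a minimal sketch using the variable names from the syntax above. It will not catch a rate that matched a branch but still needs adjusting (for instance, 2-D Mon-Thurs contains "D " and is treated as a plain daily rate), so such rows still have to be checked by eye.

```spss
* List only the cases whose rate did not match any DO IF branch.
* TEMPORARY limits the selection to the LIST procedure that follows.
TEMPORARY.
SELECT IF MISSING(NPW).
LIST VARIABLES=TYPE FREQUENCY.
```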
We need to do one final thing before we can obtain frequencies for each message type. If we ask for frequencies we will only obtain the number of different messages of a particular type. To take into account repeated messages we use the weight command to tell SPSS how many times per week each message is repeated.
WEIGHT BY NPW.
FREQUENCIES VARS=TYPE2.
EXE.
We can also obtain the number of messages sent per week per client in the same way.
WEIGHT BY NPW.
FREQUENCIES VARS=Client_ID.
EXE.
When we have finished we switch off the weight so that each row once again counts as a single message.
WEIGHT OFF.
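Because a weight that is accidentally left on affects every later procedure, it can be worth confirming its status before moving on. One way, sketched here, is the SHOW command, which reports the variable currently used to weight cases, if any.

```spss
* Report the current weighting status in the output.
SHOW WEIGHT.
```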
* DEFINE PAGER OR MOBILE MESSAGE USER.
We can also obtain two-way frequency classifications for subgroups of the data (e.g. for mobile users and pager users) using the SORT CASES and SPLIT FILE commands. Once the file is sorted and split by a grouping variable, SPSS repeats whatever procedures we request separately for each group. As with the WEIGHT command, we need to switch this facility off when we have finished using it.
SORT CASES BY PAGER_CLIENT.
EXE.
SPLIT FILE BY PAGER_CLIENT.
WEIGHT BY NPW.
FREQUENCIES VARS=TYPE2.
EXE.
WEIGHT OFF.
SPLIT FILE OFF.
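If a single two-way table is preferred to a separate frequency table for each group, the same weighted counts could be requested with CROSSTABS instead. This is an alternative sketch rather than the method used above. It assumes the same TYPE2, PAGER_CLIENT and NPW variables and does not need the file to be sorted or split.

```spss
* Alternative sketch: one message type by user group table.
WEIGHT BY NPW.
CROSSTABS /TABLES=TYPE2 BY PAGER_CLIENT /CELLS=COUNT.
WEIGHT OFF.
```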