stat2dat: Convert summary dataset to raw data equivalent
The input dataset contains one observation for each group. Supply the names of the variables containing the N, MEAN, and standard deviation (STD) for each group (see argument list below). The mean square error (MSE) from a reported ANOVA can be supplied instead of individual STD values. If all groups are the same size, the sample size per cell can be supplied as a constant rather than as a dataset variable.
The output dataset can then be used with PROC GLM or PROC ANOVA (balanced designs). It contains all variables from the input dataset plus a constructed dependent variable ('Y' by default) and a constructed frequency variable ('FREQ' by default).
   %stat2dat(data=inputdataset, out=outputdataset, ...,
             depvar=Y, freq=freq)
   proc glm data=outputdataset;
      class classvars;
      freq freq;
      model Y = modelterms;
   run;
   title 'Effect of feedback on long-term recall';
   * Keppel, 1991, p.434;
   data learning;
      do grade = 5, 12;
         do words = 'low ', 'high';
            do feedback = 'Control', 'Pos', 'Neg';
               input mean @@;
               output;
               end;
            end;
         end;
   lines;
   8.8 8.0 7.6   8.0 4.4 3.8
   9.0 8.4 8.0   7.8 7.4 7.2
   ;

To reproduce Keppel's analysis, use STAT2DAT to generate equivalent raw data. Note that both N and MSE are given as constants, while MEAN refers to the dataset variable.
   %stat2dat(data=learning, class=grade words feedback,
             out=raw,
             mean=mean, n=5, MSE=1.75,
             depvar=recall);
   proc print data=raw;
   run;

The output dataset contains two "pseudo-observations" for each group, calculated so they have the same mean and standard deviation as the raw data.
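A short derivation shows how two weighted values can carry a group's mean and standard deviation. The macro's internals are not shown here, so this is only one construction that is consistent with the RECALL and FREQ values in the listing that follows, not necessarily the macro's own code. The symbols n, \bar{y}, s, and a are introduced for illustration: place n - 1 observations at \bar{y} + a and one observation at \bar{y} - (n-1)a, with s^2 set to the MSE when that is what was supplied.

```latex
\begin{align*}
\text{mean:}\quad
  & \frac{(n-1)(\bar{y} + a) + \bigl(\bar{y} - (n-1)a\bigr)}{n} = \bar{y} \\[4pt]
\text{variance:}\quad
  & \frac{(n-1)a^2 + (n-1)^2 a^2}{n-1} = n a^2 = s^2
    \quad\Longrightarrow\quad a = \frac{s}{\sqrt{n}} \\[4pt]
\text{here:}\quad
  & n = 5,\; s^2 = 1.75
    \;\Longrightarrow\; a = \sqrt{0.35} \approx 0.59161,\quad (n-1)a \approx 2.36643 \\[4pt]
\text{cell mean } 8.8:\quad
  & 8.8 + 0.59161 = 9.39161 \;(\text{FREQ} = 4), \qquad
    8.8 - 2.36643 = 6.43357 \;(\text{FREQ} = 1)
\end{align*}
```

These two values agree with the first two rows of the listing, and the same offsets applied to the other cell means give the remaining rows.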
   OBS    GRADE    WORDS    FEEDBACK    MEAN     RECALL    FREQ

     1       5     low      Control      8.8    9.39161      4
     2       5     low      Control      8.8    6.43357      1
     3       5     low      Pos          8.0    8.59161      4
     4       5     low      Pos          8.0    5.63357      1
     5       5     low      Neg          7.6    8.19161      4
     6       5     low      Neg          7.6    5.23357      1
     7       5     high     Control      8.0    8.59161      4
     8       5     high     Control      8.0    5.63357      1
     9       5     high     Pos          4.4    4.99161      4
    10       5     high     Pos          4.4    2.03357      1
    11       5     high     Neg          3.8    4.39161      4
    12       5     high     Neg          3.8    1.43357      1
    13      12     low      Control      9.0    9.59161      4
    14      12     low      Control      9.0    6.63357      1
    15      12     low      Pos          8.4    8.99161      4
    16      12     low      Pos          8.4    6.03357      1
    17      12     low      Neg          8.0    8.59161      4
    18      12     low      Neg          8.0    5.63357      1
    19      12     high     Control      7.8    8.39161      4
    20      12     high     Control      7.8    5.43357      1
    21      12     high     Pos          7.4    7.99161      4
    22      12     high     Pos          7.4    5.03357      1
    23      12     high     Neg          7.2    7.79161      4
    24      12     high     Neg          7.2    4.83357      1

The analysis is carried out with PROC GLM as follows:
   proc glm data=raw;
      class feedback words grade;
      model recall = feedback|words|grade;
      freq freq;
   run;
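As a sanity check on the generated data, the cell statistics can be recomputed from the pseudo-observations. The step below is a minimal sketch, not part of the macro. It assumes the dataset and variable names from the example above (raw, recall, freq) and relies on the FREQ statement of PROC MEANS to weight each pseudo-observation by its frequency.

```sas
* Sketch: recompute N, mean, and standard deviation per cell;
* from the pseudo-observations produced by the example above;
proc means data=raw n mean std maxdec=4;
   class grade words feedback;   * the three design factors;
   var recall;                   * constructed dependent variable;
   freq freq;                    * each row counts FREQ times;
run;
```

If the data were generated as described, each of the 12 cells should report N = 5, a mean equal to the value in the MEAN column, and a standard deviation of about 1.3229, the square root of the supplied MSE of 1.75.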