SAS Macro Programs: power
Michael Friendly
York University
Power calculations for general linear models
The power macro carries out retrospective power analysis
(using the results from a given model fit to actual data)
and/or prospective power analysis
(using parameters specified by the user)
for any general linear model which can be fit by the GLM procedure.
It calculates the following power-related measures:
- Effect size
- Power for an effect test
- Adjusted power and confidence limits
- Least significant number - the smallest sample size required
for the effect to be significant.
- Power for least significant number
The power macro was written by Kristin Latour,
and modified somewhat to make it more consistent with
my rpower macro.
Her original documentation is available as a PostScript
file, PowerMacro.ps,
and as a PDF file, PowerMacro.pdf.
Method
The program reads as input an OUTSTAT=
data set from PROC GLM, containing the sums of squares and
degrees of freedom for all effects in the model.
For each effect specified in the EFFECT= parameter,
the program calculates the effect size (DELTA),
which in turn is used to calculate power.
If values are specified for the N=, SIGMA=, and DELTA=
parameters, all combinations of values of these are appended
to those in the data, and power values are calculated for these
as well.
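The phrase "all combinations" can be pictured as a full crossing of the three lists. The DATA step below only illustrates that crossing; it is not the macro's own code, and the data set name GRID and the values in the lists are made up for the example.

```sas
/* Illustrative only: cross hypothetical N, SIGMA and DELTA lists */
data grid;
   do n = 20, 40, 60;            /* hypothetical sample sizes        */
      do sigma = 1.0, 1.5;       /* hypothetical standard deviations */
         do delta = 0.5, 1.0;    /* hypothetical effect sizes        */
            output;              /* one row per combination          */
         end;
      end;
   end;
run;

proc print data=grid;
run;
```

With three, two and two values, the crossing has 3 x 2 x 2 = 12 rows, each of which would receive its own power value.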
Usage
power is a macro program. A value must be supplied for the EFFECT=
parameter.
The arguments may be listed within parentheses in any order, separated
by commas. For example:
%power(data=inputdataset, out=outputdataset, ..., )
Parameters
Default values (if any) are shown after the name of each parameter.
- DATA=_LAST_
- OUTSTAT= data set from GLM. If not specified, the most
recently created dataset is used.
- OUT=_POWER_
- The name of the output dataset. If not specified, the new dataset is named _POWER_.
- EFFECT=
- Specifies the name of one or more effects,
given in the _SOURCE_ variable of the input dataset. Specifying
EFFECT=_ALL_ will calculate power for all effects in the OUTSTAT=
data set.
- CALCS=POWER LSN ADJPOW
- Specifies the calculations to report, any one or more of the
following:
- POWER - the nominal (unadjusted) power of the test
- ADJPOW - power, adjusted for bias
- POWCI - lower and upper confidence limits for the power
- LSN - least significant N, that is, the smallest sample size required
for the effect to be significant at significance level ALPHA
- SS=SS3
- the type of sums of squares to use - either SS1 or SS3
- ALPHA=.05
- list of significance levels.
- N=
- list of sample sizes, in addition to that contained
in the OUTSTAT= data set.
- SIGMA=
- list of standard deviations, in addition to that contained
in the OUTSTAT= data set.
- DELTA=
- list of effect sizes, in addition to that contained
in the OUTSTAT= data set.
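As a sketch of how these parameters combine, the call below requests retrospective results for one effect together with prospective results for some extra values. It assumes an OUTSTAT= data set named STATS that contains an effect called SUGAR (as in the Example section), and it assumes that list-valued parameters take blank-separated values. The output data set name POWOUT and all the numbers are arbitrary. Check the macro source or the original documentation for the exact list syntax.

```sas
/* Hypothetical call. The values and the blank-separated list syntax  */
/* are assumptions made for illustration.                             */
/*   data=   OUTSTAT= data set from PROC GLM                          */
/*   out=    output data set (name is arbitrary)                      */
/*   effect= must match a value of _SOURCE_ in the input data set     */
/*   calcs=  subset of the CALCS= keywords                            */
/*   n=, sigma=, delta=  extra values to add to those in the data     */
%power(data=stats,
       out=powout,
       effect=SUGAR,
       calcs=POWER LSN,
       ss=SS3,
       alpha=.01 .05,
       n=16 32 64,
       sigma=1.0 1.5,
       delta=0.5 1.0);
```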
Details
Definitions of terms that are relevant to the %POWER macro:
- Prospective power analysis:
- Used in the planning phase of a designed
experiment to determine how large the sample size must be to detect
an effect of a given size (such as the minimum difference between
treatment effects that is of practical value).
- Retrospective power analysis:
- Used after the analysis of an
experiment to determine the power of the conducted test.
- Power:
- Is the probability that a false null hypothesis will be rejected.
Ideally you would design your experiment to be as powerful as possible
at detecting hypotheses of interest. Values of power range from 0 to 1,
where values near 0 are low power and values near 1 are high power.
Power is a function of the sample size (N), the effect size (delta), the
root mean square error (sigma), and the significance level (alpha). The
power tells you how likely your experiment is to detect a given
difference, delta, at a given significance level, alpha. Power has the
following characteristics:
- If the true value of the parameter is the hypothesized value, the
power should be alpha. You do not want to reject the null hypothesis
when it is true.
- If the true value of the parameter is not the hypothesized value,
you want the power to be as large as possible.
- The power increases with the sample size. The power increases as
variance decreases. The power increases as the true parameter gets
farther from the hypothesized value.
- Adjusted Power:
- Is for retrospective power analyses. The adjusted power
is smaller than the power, as it removes the bias associated with the
noncentrality parameter. The noncentrality parameter is biased for
any value other than zero. Because power is a function of
population quantities that are not known, the usual practice is to
substitute sample estimates in power calculations. If you regard
these sample estimates as random, you can adjust them to have a
more proper expectation. You can also construct a confidence
interval for this adjusted power, though it is often very wide. The
adjusted power and confidence interval can only be computed for your
observed effect size, delta.
- The Least Significant Number (LSN)
- is the number of observations needed
to reduce the variance of the estimates enough to achieve a significant
result with the given values of alpha, sigma, and delta. If you need
more data to achieve significance, the LSN helps tell you how many more.
The LSN has the following characteristics:
- If the LSN is less than the actual sample size N, then the
effect is significant. This means that you have more data than
you need to detect the significance at the given alpha level.
- If the LSN is greater than the actual sample size N, the effect
is not significant. In this case, if you believe that more data
will show the same variance and structural results as the current
sample, the LSN suggests how much data you would need to achieve
significance.
- If the LSN is equal to N, then the p-value is equal to the
significance level, alpha. The test is on the border of significance.
- Power calculated when N=LSN is always greater than or equal to 0.5.
- Power when N=LSN
- represents the power associated with using the N
recommended by the LSN.
- The noncentrality parameter, lambda,
- is N*delta^2/sigma^2, where N is
the total sample size, delta is the effect size, and sigma^2 is the mean
square error. Note that the noncentrality parameter is zero when the
null hypothesis is true, that is, when the effect size is zero.
- The Effect Size, delta,
- is estimated from the data as
sqrt[ SS(Hypothesis)/N ]. The effect size can be thought of as the
minimum difference in means that you want to detect divided by the total
sample size.
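The definitions of delta, lambda, power and LSN above can be tied together in a short DATA step. This is a minimal sketch based on those definitions and on the standard noncentral F formulation; it is not the macro's own code. The sums of squares and degrees of freedom are those of the SUGAR and ERROR lines in the Example section, and the total sample size is taken as the 32 observations read by that example's DATA step. The assumption that the number of model parameters stays fixed as N changes is mine, and all variable names are made up. The results need not agree with the macro's listing, which may use different conventions for N and the error degrees of freedom.

```sas
data _null_;
   /* Inputs copied from the OUTSTAT= listing in the Example section */
   ss_h  = 28.680;  df_h = 3;     /* SUGAR, Type III SS    */
   ss_e  = 47.760;  df_e = 28;    /* ERROR                 */
   n_obs = 32;                    /* 4 groups x 8 subjects */
   alpha = 0.01;

   sigma  = sqrt(ss_e / df_e);            /* root mean square error  */
   delta  = sqrt(ss_h / n_obs);           /* effect size             */
   lambda = n_obs * delta**2 / sigma**2;  /* noncentrality parameter */

   /* Power: probability that a noncentral F exceeds the central     */
   /* critical value at level alpha                                  */
   fcrit = finv(1 - alpha, df_h, df_e);
   power = 1 - probf(fcrit, df_h, df_e, lambda);

   /* LSN: smallest N whose implied F ratio reaches the critical     */
   /* value. Assumes the number of model parameters stays fixed.     */
   n_parms = n_obs - df_e;
   lsn = .;
   do n = n_parms + 1 to 10000 until (lsn ne .);
      f_n = n * delta**2 / (df_h * sigma**2);
      if f_n >= finv(1 - alpha, df_h, n - n_parms) then lsn = n;
   end;

   put sigma= delta= lambda= power= lsn=;
run;
```

The LSN loop stops at the first N for which the test would be significant, which is the sense in which a p-value equal to alpha corresponds to N = LSN.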
Limitations
The %POWER macro is appropriate for fixed-effect linear models fit by
PROC GLM only. IT IS NOT APPROPRIATE FOR PROC GLM MODELS USING THE
RANDOM, TEST, REPEATED, OR MANOVA STATEMENTS.
The %POWER macro does not accept a given power value as input and report
the required sample size. However, using the N= parameter you can
try several different sample sizes to see the effect on power.
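One way to approach the second limitation outside the macro is a simple search over N. The DATA step below is a sketch under stated assumptions and is not part of %POWER. It assumes a one-way layout in which the error degrees of freedom equal N minus the number of groups, uses the standard noncentral F power formula, and takes the target power, DELTA and SIGMA as made-up planning values. All variable names are hypothetical.

```sas
data _null_;
   alpha  = 0.05;    /* significance level (assumed)               */
   target = 0.80;    /* desired power (assumed)                    */
   groups = 4;       /* number of groups in the one-way design     */
   sigma  = 1.3;     /* planning value for the root MSE (assumed)  */
   delta  = 0.5;     /* planning value for the effect size         */

   df_h  = groups - 1;
   n_req = .;
   do n = groups + 1 to 5000 until (n_req ne .);
      df_e   = n - groups;                 /* one-way error df      */
      lambda = n * delta**2 / sigma**2;    /* noncentrality         */
      power  = 1 - probf(finv(1 - alpha, df_h, df_e), df_h, df_e, lambda);
      if power >= target then do;
         n_req     = n;                    /* first N that suffices */
         power_req = power;
      end;
   end;

   if n_req ne . then put 'Smallest N reaching the target: ' n_req= power_req=;
   else put 'Target power not reached for N up to 5000';
run;
```

The same idea works with the macro itself: pass a range of values through N= and read off the first sample size whose reported power meets the target.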
Example
%include macros(power); *-- or include in an autocall library;
Title 'Sucrose Data: Speed to traverse a Runway';
data sucrose;
   label SUGAR = 'Sucrose Concentration (%)'
         SPEED = 'Speed in Runway (ft/sec)';
   input SUGAR @ ;
   do SUBJECT = 1 to 8;
      input SPEED @;
      output;
   end;
cards;
8 1.4 2.0 3.2 1.4 2.3 4.0 5.0 4.7
16 3.2 6.8 5.0 2.5 6.1 4.8 4.6 4.2
32 6.2 3.1 3.2 4.0 4.5 6.4 4.5 4.1
64 5.8 6.6 6.5 5.9 5.9 3.0 5.9 5.6
;
*-- Do the ANOVA, obtain OUTSTAT= data set;
proc GLM data=SUCROSE outstat=STATS;
   class SUGAR;
   model SPEED = SUGAR / SS3;
   contrast 'Linear' SUGAR -3 -1  1  3;
   contrast 'Quad  ' SUGAR  1 -1 -1  1;
   contrast 'Cubic ' SUGAR -1  3 -3  1;
run;
%power(data=stats, alpha=.01, effect=_all_);
The following output is produced:
Power values and estimated sample sizes
IF population means = sample means

OBS  _NAME_  _SOURCE_  _TYPE_    DF      SS        F     PROB
  1  SPEED   ERROR     ERROR     28  47.760        .        .
  2  SPEED   SUGAR     SS3        3  28.680   5.6047  0.00386
  3  SPEED   Linear    CONTRAST   1  24.336  14.2673  0.00076
  4  SPEED   Quad      CONTRAST   1   0.500   0.2931  0.59250
  5  SPEED   Cubic     CONTRAST   1   3.844   2.2536  0.14450

Power Calculation for effect(s) SUGAR LINEAR QUAD CUBIC

          Type I   Total  Root Mean                                    Least    Power
           Error  Sample     Square  Effect  Power of  Adjusted  Significant     when
Source      Rate    Size      Error    Size      Test     Power       Number    N=LSN
SUGAR       0.01      35       1.31  0.9052   0.72851   0.55453           30  0.60884
LINEAR      0.01      35       1.31  0.8339   0.83254   0.75935           22  0.53215
QUAD        0.01      35       1.31  0.1195   0.02038   0.01000          802  0.50085
CUBIC       0.01      35       1.31  0.3312   0.12164   0.05524          108  0.50338
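
As a quick check on how the two listings relate, the SUGAR row of the power table can be reproduced by hand from the definitions in the Details section. The calculation uses the total sample size of 35 printed in the listing. The DATA step above reads 4 x 8 = 32 observations, but 35 is the value that reproduces the printed effect sizes.

```latex
\begin{aligned}
\hat\sigma  &= \sqrt{\mathrm{SS(Error)}/\mathrm{DF(Error)}} = \sqrt{47.760/28} \approx 1.31,\\
\hat\delta  &= \sqrt{\mathrm{SS(Hypothesis)}/N} = \sqrt{28.680/35} \approx 0.9052,\\
\hat\lambda &= N\,\hat\delta^{2}/\hat\sigma^{2} = \frac{28.680}{47.760/28} \approx 16.81 = 3 \times 5.6047 .
\end{aligned}
```

The last equality shows that the estimated noncentrality parameter is simply the effect's degrees of freedom times its F statistic.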
See also
fpower Power computations for ANOVA designs
meanplot Plotting means for factorial designs
mpower Retrospective power analysis for multivariate GLMs
rpower Retrospective power analysis for univariate GLMs
stat2dat Convert summary dataset to raw data equivalent