power: Power calculations for general linear models

SAS Macro Programs: power

Michael Friendly
York University



The power macro (download: power.sas)

Power calculations for general linear models

The power macro carries out retrospective power analysis (using the results from a given model fit to actual data) and/or prospective power analysis (using parameters specified by the user) for any general linear model that can be fit by the GLM procedure. It calculates the following power-related measures, each defined under Details: the power of the test, the adjusted power, the least significant number (LSN), and the power when N=LSN.

The power macro was written by Kristin Latour and modified somewhat to make it more consistent with my rpower macro. Her original documentation is available as a PostScript file, PowerMacro.ps, and as a PDF file, PowerMacro.pdf.

Method

The program reads as input an OUTSTAT= data set from PROC GLM, containing the sums of squares and degrees of freedom for all effects in the model. For each effect specified in the EFFECT= parameter, the program calculates the effect size (DELTA), which in turn is used to calculate power. If values are specified for the N=, SIGMA=, and DELTA= parameters, all combinations of values of these are appended to those in the data, and power values are calculated for these as well.
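
As a rough illustration of the calculation described above, the following DATA step computes the effect size, noncentrality parameter, and power for the SUGAR effect, using the sums of squares and degrees of freedom from the Example section. It is a sketch of the standard noncentral F power calculation with the FINV and PROBF functions, not the macro's actual code; the data set name POWER_SKETCH and all variable names are made up for illustration.

```sas
*-- Sketch only: power for one effect from OUTSTAT-style quantities;
data power_sketch;
   ss_h  = 28.680;    /* hypothesis sum of squares (SUGAR, SS3)        */
   df_h  = 3;         /* hypothesis degrees of freedom                 */
   ss_e  = 47.760;    /* error sum of squares                          */
   df_e  = 28;        /* error degrees of freedom                      */
   n     = 35;        /* Total Sample Size as printed in the output    */
   alpha = 0.01;      /* significance level                            */

   sigma  = sqrt(ss_e / df_e);            /* root mean square error    */
   delta  = sqrt(ss_h / n);               /* effect size               */
   lambda = n * delta**2 / sigma**2;      /* noncentrality parameter   */

   fcrit = finv(1 - alpha, df_h, df_e);           /* central F critical value */
   power = 1 - probf(fcrit, df_h, df_e, lambda);  /* noncentral F tail area   */
run;

proc print data=power_sketch noobs;
   var sigma delta lambda fcrit power;
run;
```

With N=35 as printed in the Total Sample Size column, DELTA and SIGMA come out near 0.905 and 1.31, the values in the Effect Size and Root Mean Square Error columns. In the retrospective case N cancels, so LAMBDA is simply SS(Hypothesis) divided by the mean square error.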

Usage

power is a macro program. A value must be supplied for the EFFECT= parameter.

The arguments may be listed within parentheses in any order, separated by commas. For example:

   %power(data=inputdataset, out=outputdataset, ..., )

Parameters

Default values (if any) are shown after the name of each parameter.
DATA=_LAST_
OUTSTAT= data set from GLM. If not specified, the most recently created dataset is used.
OUT=_POWER_
The name of the output dataset. If not specified, the new dataset is named _POWER_.
EFFECT=
Specifies the name of one or more effects, given in the _SOURCE_ variable of the input dataset. Specifying EFFECT=_ALL_ will calculate power for all effects in the OUTSTAT= data set.
CALCS=POWER LSN ADJPOW
Specifies the calculations to report: any one or more of POWER, LSN, and ADJPOW.
SS=SS3
the type of sums of squares to use - either SS1 or SS3
ALPHA=.05
list of significance levels.
N=
list of sample sizes, in addition to that contained in the OUTSTAT= data set.
SIGMA=
list of standard deviations, in addition to that contained in the OUTSTAT= data set.
DELTA=
list of effect sizes, in addition to that contained in the OUTSTAT= data set.
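
A call using several of these parameters might look like the following sketch. It assumes the STATS data set created in the Example section; the output data set name SUGARPOW is hypothetical, and each parameter is given a single value because the separator expected in list-valued parameters is not stated here.

```sas
*-- Sketch only: retrospective power for one effect, saved to a named data set;
%power(data=stats,        /* OUTSTAT= data set from PROC GLM        */
       out=sugarpow,      /* hypothetical output data set name      */
       effect=SUGAR,      /* must match a value of _SOURCE_         */
       ss=SS3,            /* use Type III sums of squares           */
       alpha=.05);

proc print data=sugarpow;
run;
```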

Details

Definitions of terms that are relevant to the %POWER macro:
Prospective power analysis:
Used in the planning phase of a designed experiment to determine how large the sample size must be to detect an effect of a given size (such as the minimum difference between treatment effects that is of practical value).
Retrospective power analysis:
Used after the analysis of an experiment to determine the power of the conducted test.
Power:
Is the probability that a false null hypothesis will be rejected. Ideally you would design your experiment to be as powerful as possible at detecting effects of interest. Values of power range from 0 to 1, where values near 0 indicate low power and values near 1 indicate high power. Power is a function of the sample size (N), the effect size (delta), the root mean square error (sigma), and the significance level (alpha). The power tells you how likely your experiment is to detect a given difference, delta, at a given significance level, alpha.
Adjusted Power:
Is for retrospective power analyses. The adjusted power is smaller than the power, as it removes the bias associated with the estimated noncentrality parameter. The estimate of the noncentrality parameter is biased for any value other than zero. Because power is a function of population quantities that are not known, the usual practice is to substitute sample estimates in power calculations. If you regard these sample estimates as random, you can adjust them to have a more proper expectation. You can also construct a confidence interval for this adjusted power, though it is often very wide. The adjusted power and confidence interval can only be computed for your observed effect size, delta.
The Least Significant Number (LSN)
is the number of observations needed to reduce the variance of the estimates enough to achieve a significant result with the given values of alpha, sigma, and delta. If you need more data to achieve significance, the LSN helps tell you how many more.
Power when N=LSN
represents the power associated with using the N recommended by the LSN.
The noncentrality parameter, lambda,
is N*delta^2/sigma^2, where N is the total sample size, delta is the effect size, and sigma^2 is the mean square error. Note that the noncentrality parameter is zero when the null hypothesis is true, that is, when the effect size is zero.
The Effect Size, delta,
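is estimated from the data as sqrt[ SS(Hypothesis)/N ], where N is the total sample size. The effect size can be thought of as the typical size of the effect per observation, in the units of the response. For example, with two groups of equal size whose means differ by d, SS(Hypothesis) = N*d^2/4, so delta = d/2.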
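
The definitions of lambda and the LSN above can be tied together with a small search. The sketch below steps N upward until the F statistic implied by the given delta and sigma reaches the critical value, then reports the power at that N. It rests on assumptions that are not taken from the macro: the error degrees of freedom are set to N - df_h - 1, as for a one-way layout, and N is restricted to whole numbers. The data set and variable names are hypothetical.

```sas
*-- Sketch only: integer search for a least significant number;
data lsn_sketch;
   delta = 0.9052;    /* effect size (SUGAR row of the example output) */
   sigma = 1.31;      /* root mean square error                        */
   df_h  = 3;         /* hypothesis degrees of freedom                 */
   alpha = 0.01;      /* significance level                            */

   do n = df_h + 2 to 5000;
      df_e   = n - df_h - 1;                /* assumed error df          */
      lambda = n * delta**2 / sigma**2;     /* noncentrality parameter   */
      f      = lambda / df_h;               /* F implied by delta, sigma */
      fcrit  = finv(1 - alpha, df_h, df_e); /* central F critical value  */
      if f >= fcrit then do;
         /* first N at which the test would be significant */
         power_at_lsn = 1 - probf(fcrit, df_h, df_e, lambda);
         output;
         leave;
      end;
   end;
run;

proc print data=lsn_sketch noobs;
   var n lambda f fcrit power_at_lsn;
run;
```

Because the test is only just significant at this N, the power there is typically not far above one half, which is the pattern in the "Power when N=LSN" column of the example output.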

Limitations

The %POWER macro is appropriate for fixed-effect linear models fit by PROC GLM only. IT IS NOT APPROPRIATE FOR PROC GLM MODELS USING THE RANDOM, TEST, REPEATED, OR MANOVA STATEMENTS.

The %POWER macro does not accept a given power value as input and report the required sample size. However, using the N= parameter you can try several different sample sizes to see the effect on power.
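
If a required sample size for a target power is needed, one workaround outside the macro is to step through candidate values of N in a DATA step. The following is a minimal sketch using the noncentral F distribution; the target of 0.80, the error degrees of freedom N - df_h - 1, and all names are assumptions chosen for illustration, not part of the macro.

```sas
*-- Sketch only: smallest N reaching a target power for fixed delta and sigma;
data n_for_power;
   delta  = 0.9052;   /* effect size to detect                 */
   sigma  = 1.31;     /* assumed root mean square error        */
   df_h   = 3;        /* hypothesis degrees of freedom         */
   alpha  = 0.01;     /* significance level                    */
   target = 0.80;     /* desired power (an assumption)         */

   do n = df_h + 2 to 5000;
      df_e   = n - df_h - 1;                 /* assumed error df        */
      lambda = n * delta**2 / sigma**2;      /* noncentrality parameter */
      power  = 1 - probf(finv(1 - alpha, df_h, df_e), df_h, df_e, lambda);
      if power >= target then do;
         output;                             /* keep the first N that qualifies */
         leave;
      end;
   end;
run;

proc print data=n_for_power noobs;
   var n lambda power;
run;
```

If no N up to the loop limit reaches the target, the output data set is empty; raise the upper bound in that case.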

Example

%include macros(power);        *-- or include in an autocall library;
Title 'Sucrose Data: Speed to traverse a Runway';
data sucrose;
     label SUGAR = 'Sucrose Concentration (%)'
           SPEED = 'Speed in Runway (ft/sec)';
     input SUGAR @ ;
     do SUBJECT = 1 to 8;
        input SPEED @;
        output;
        end;
cards;
 8   1.4 2.0 3.2 1.4 2.3 4.0 5.0 4.7
16   3.2 6.8 5.0 2.5 6.1 4.8 4.6 4.2
32   6.2 3.1 3.2 4.0 4.5 6.4 4.5 4.1
64   5.8 6.6 6.5 5.9 5.9 3.0 5.9 5.6
;
*-- Do the ANOVA, obtain OUTSTAT= data set;
 
proc GLM data=SUCROSE outstat=STATS;
     class SUGAR;
     model SPEED = SUGAR / SS3;
     contrast 'Linear' SUGAR  -3  -1   1  3;
     contrast 'Quad  ' SUGAR   1  -1  -1  1;
     contrast 'Cubic ' SUGAR  -1   3  -3  1;
run;

%power(data=stats, alpha=.01, effect=_all_);

The following output is produced:

                    Power values and estimated sample sizes
                       IF population means = sample means

  OBS    _NAME_    _SOURCE_    _TYPE_      DF      SS         F         PROB

   1     SPEED      ERROR      ERROR       28    47.760      .         .     
   2     SPEED      SUGAR      SS3          3    28.680     5.6047    0.00386
   3     SPEED      Linear     CONTRAST     1    24.336    14.2673    0.00076
   4     SPEED      Quad       CONTRAST     1     0.500     0.2931    0.59250
   5     SPEED      Cubic      CONTRAST     1     3.844     2.2536    0.14450

            Power Calculation for effect(s) SUGAR LINEAR QUAD CUBIC                                         

         Type I  Total Root Mean                             Least     Power
          Error Sample   Square  Effect Power of Adjusted Significant   when
  Source  Rate   Size    Error    Size    Test     Power     Number    N=LSN

  SUGAR   0.01    35      1.31   0.9052  0.72851  0.55453      30     0.60884
  LINEAR  0.01    35      1.31   0.8339  0.83254  0.75935      22     0.53215
  QUAD    0.01    35      1.31   0.1195  0.02038  0.01000     802     0.50085
  CUBIC   0.01    35      1.31   0.3312  0.12164  0.05524     108     0.50338

See also

fpower Power computations for ANOVA designs
meanplot Plotting means for factorial designs
mpower Retrospective power analysis for multivariate GLMs
rpower Retrospective power analysis for univariate GLMs
stat2dat Convert summary dataset to raw data equivalent