SAS Macro Programs: power
$Version: 
Michael Friendly
York University
 
Power calculations for general linear models
The power macro carries out retrospective power analysis
(using the results from a given model fit to actual data)
and/or prospective power analysis
(using parameters specified by the user)
for any general linear model which can be fit by the GLM procedure.
It calculates the following power-related measures:
- Effect size
 - Power for an effect test
 - Adjusted power and confidence limits
 - Least significant number, the smallest sample size required
for the effect to be significant
 - Power for least significant number
 
The power macro was written by Kristin Latour,
and modified somewhat to make it more consistent with
my rpower macro.
Her original documentation is available as a PostScript
file, PowerMacro.ps,
and as a PDF file, PowerMacro.pdf.
Method
The program reads as input an OUTSTAT=
data set from PROC GLM, containing the sums of squares and
degrees of freedom for all effects in the model.
For each effect specified in the EFFECT= parameter,
the program calculates the effect size (DELTA),
which in turn is used to calculate power.
If values are specified for the N=, SIGMA=, and DELTA=
parameters, all combinations of values of these are appended
to those in the data, and power values are calculated for these
as well.
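The calculation described above can be sketched in a short DATA step. This is an illustration only, not the macro's source: it assumes an OUTSTAT= data set named STATS from a fixed-effects PROC GLM run, takes the total sample size and significance level from two hypothetical macro variables (N and ALPHA), and uses the standard SAS functions FINV and PROBF for the central and noncentral F distributions.

```sas
*-- Sketch only, not the macro source. STATS is assumed to be an
    OUTSTAT= data set from PROC GLM for a fixed-effects model;
%let n     = 32;     *-- assumed total number of observations;
%let alpha = 0.05;   *-- assumed significance level;

data power_sketch;
   set stats;
   retain sse dfe;
   if _TYPE_ = 'ERROR' then do;     *-- the error row comes first in OUTSTAT=;
      sse = ss;
      dfe = df;
      delete;
   end;
   if _TYPE_ = 'SS3';               *-- keep the Type III effect rows only;
   sigma  = sqrt(sse / dfe);        *-- root mean square error;
   delta  = sqrt(ss / &n);          *-- effect size, sqrt of SS(Hypothesis)/N;
   lambda = &n * delta**2 / sigma**2;            *-- noncentrality parameter;
   fcrit  = finv(1 - &alpha, df, dfe);           *-- critical F under the null;
   power  = 1 - probf(fcrit, df, dfe, lambda);   *-- P(noncentral F exceeds fcrit);
   keep _SOURCE_ df sigma delta lambda fcrit power;
run;

proc print data=power_sketch noobs;
run;
```

The sketch covers only the nominal power; the adjusted power, its confidence limits, and the least significant number reported by the macro are not reproduced here.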
Usage 
power is a macro program. Values must be supplied for the EFFECT=
parameter.
The arguments may be listed within parentheses in any order, separated
by commas. For example:
   %power(data=inputdataset, out=outputdataset, ..., )
Parameters
Default values (if any) are shown after the name of each parameter.
- DATA=_LAST_	
 - OUTSTAT= data set from GLM.  If not specified, the most
           recently created dataset is used.
 - OUT=_POWER_   
 - The name of the output dataset.  If not specified, the new dataset is named _POWER_.
 - EFFECT=  
 - Specifies the name of one or more effects,
given in the _SOURCE_ variable of the input dataset.   Specifying
EFFECT=_ALL_ will calculate power for all effects in the OUTSTAT=
data set.
         
 - CALCS=POWER LSN ADJPOW  
 - Specifies the calculations to report, any one or more of the
following:
- POWER - the nominal (unadjusted) power of the test
 - ADJPOW - power, adjusted for bias
 - POWCI - lower and upper confidence limits for the power
 - LSN - least significant N, that is, the smallest sample size required
for the effect to be significant at significance level ALPHA
 
       
 - SS=SS3  
 - the type of sums of squares to use - either SS1 or SS3      
 - ALPHA=.05  
 - list of significance levels
 - N=  
 - list of sample sizes, in addition to that contained
in the OUTSTAT= data set.
 - SIGMA=  
 - list of standard deviations, in addition to that contained
in the OUTSTAT= data set.
 - DELTA=  
 - list of effect sizes, in addition to that contained
in the OUTSTAT= data set.    
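
As an illustration of how these parameters combine, the call below requests power, confidence limits, and the least significant number for one effect at two significance levels and three additional sample sizes. It is a hypothetical call: the data set names STATS and POWOUT and the effect name SUGAR are placeholders, and the use of blanks to separate list values is an assumption based on the form of the CALCS= default.

```sas
*-- Hypothetical call. STATS is assumed to be an OUTSTAT= data set
    from an earlier PROC GLM step that contains an effect named SUGAR;
%power(data=stats,
       out=powout,
       effect=SUGAR,
       calcs=POWER POWCI LSN,
       ss=SS3,
       alpha=.01 .05,
       n=40 60 80);
```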
 
Details
    Definitions of terms that are relevant to the %POWER macro:
- Prospective power analysis: 
 - Used in the planning phase of a designed
    experiment to determine how large the sample size must be to detect
    an effect of a given size (such as the minimum difference between
    treatment effects that is of practical value).
 - Retrospective power analysis: 
 -  Used after the analysis of an
    experiment to determine the power of the conducted test.
 - Power: 
 -  Is the probability that a false null hypothesis will be rejected.
    Ideally you would design your experiment to be as powerful as possible
    at detecting hypotheses of interest. Values of power range from 0 to 1,
    where values near 0 are low power and values near 1 are high power.
    Power is a function of the sample size (N), the effect size (delta), the
    root mean square error (sigma), and the significance level (alpha). The
    power tells you how likely your experiment is to detect a given
    difference, delta, at a given significance level, alpha. Power has the
    following characteristics:
	 
    - If the true value of the parameter is the hypothesized value, the
      power should be alpha. You do not want to reject the null hypothesis
      when it is true.
    
 - If the true value of the parameter is not the hypothesized value,
      you want the power to be as large as possible.
    
 - The power increases with the sample size. The power increases as
      variance decreases. The power increases as the true parameter gets
      farther from the hypothesized value.
	
 
 - Adjusted Power: 
 -  Is for retrospective power analyses. The adjusted power
    is smaller than the power, as it removes the bias associated with the
    estimated noncentrality parameter. The estimate of the noncentrality
    parameter is biased for any value other than zero.  Because power is a function of
    population quantities that are not known, the usual practice is to
    substitute sample estimates in power calculations.  If you regard
    these sample estimates as random, you can adjust them to have a
    more proper expectation.  You can also construct a confidence
    interval for this adjusted power, though it is often very wide.  The
    adjusted power and confidence interval can only be computed for your
    observed effect size, delta.
 - The Least Significant Number (LSN) 
 -  is the number of observations needed
    to reduce the variance of the estimates enough to achieve a significant
    result with the given values of alpha, sigma, and delta. If you need
    more data to achieve significance, the LSN helps tell you how many more.
    The LSN has the following characteristics:
	 
    - If the LSN is less than the actual sample size N, then the
      effect is significant. This means that you have more data than
      you need to detect the significance at the given alpha level.
    
 - If the LSN is greater than the actual sample size N, the effect
      is not significant. In this case, if you believe that more data
      will show the same variance and structural results as the current
      sample, the LSN suggests how much data you would need to achieve
      significance.
    
 - If the LSN is equal to N, then the p-value is equal to the
      significance level, alpha. The test is on the border of significance.
    
 - Power calculated when N=LSN is always greater than or equal to 0.5.
	 
 
 - Power when N=LSN 
 -  represents the power associated with using the N
     recommended by the LSN. 
 - The noncentrality parameter, lambda, 
 -  is N*delta^2/sigma^2, where N is
    the total sample size, delta is the effect size, and sigma^2 is the mean
    square error. Note that the noncentrality parameter is zero when the
    null hypothesis is true, that is, when the effect size is zero.
 - The Effect Size, delta,  
 - is estimated from the data as
    sqrt[ SS(Hypothesis)/N ]. The effect size can be thought of as the
    minimum difference in means that you want to detect divided by the total
    sample size.
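
The definitions above can be collected into formulas. The following is a summary sketch rather than a transcription of the macro source. The symbols df_h (hypothesis degrees of freedom), df_e (error degrees of freedom), p (number of model parameters, so that the error degrees of freedom at sample size n are n - p), and F' (a noncentral F variate) are notation assumed here, and the LSN expression is one way of writing the verbal definition, not necessarily the macro's exact rule.

```latex
\begin{align*}
\delta  &= \sqrt{SS(\text{Hypothesis}) / N}, \qquad
\lambda  = \frac{N\,\delta^{2}}{\sigma^{2}} \\
\text{Power} &= \Pr\!\left[\, F'(df_h,\, df_e,\, \lambda) > F_{1-\alpha}(df_h,\, df_e) \,\right] \\
\text{LSN} &= \min\left\{\, n : \frac{n\,\delta^{2}}{df_h\,\sigma^{2}} \ge F_{1-\alpha}(df_h,\, n - p) \,\right\}
\end{align*}
```

With the observed delta and sigma, lambda reduces to SS(Hypothesis)/MSE, which is df_h times the observed F statistic. For the SUGAR effect in the Example section, 28.680 / (47.760/28) is about 16.81, which equals 3 x 5.6047.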
 
Limitations
    The %POWER macro is appropriate for fixed-effect linear models fit by
    PROC GLM only. IT IS NOT APPROPRIATE FOR PROC GLM MODELS USING THE
    RANDOM, TEST, REPEATED, OR MANOVA STATEMENTS.
    The %POWER macro does not accept a given power value as input and report
    the required sample size. However, using the N= parameter you can
	 try several different sample sizes to see the effect on power.
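
If a required sample size is wanted for a target power, one workaround is a direct search outside the macro. The DATA step below is a minimal sketch of that idea and is not part of the power macro. All input values (ALPHA, DELTA, SIGMA, DFH, P, TARGET) are assumptions chosen for illustration, and the error degrees of freedom are taken to be N minus the number of model parameters.

```sas
*-- Sketch only, not part of the power macro. All input values are
    assumptions chosen for illustration;
data n_search;
   alpha  = 0.05;    *-- significance level;
   delta  = 0.9;     *-- effect size to detect;
   sigma  = 1.3;     *-- root mean square error;
   dfh    = 3;       *-- hypothesis degrees of freedom;
   p      = 4;       *-- number of model parameters, including the intercept;
   target = 0.80;    *-- desired power;
   found  = 0;
   do n = p + 1 to 10000;
      dfe    = n - p;                          *-- error degrees of freedom;
      lambda = n * delta**2 / sigma**2;        *-- noncentrality parameter;
      power  = 1 - probf(finv(1 - alpha, dfh, dfe), dfh, dfe, lambda);
      if power >= target then do;
         found = 1;                            *-- first n that reaches the target;
         leave;
      end;
   end;
   output;
run;

proc print data=n_search noobs;
   var found n dfe lambda power;
run;
```

When FOUND is 1, N holds the smallest sample size in the search range whose nominal power reaches TARGET. When FOUND is 0, no value up to the upper bound was sufficient.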
	 
Example
%include macros(power);        *-- or include in an autocall library;
Title 'Sucrose Data: Speed to traverse a Runway';
data sucrose;
     label SUGAR = 'Sucrose Concentration (%)'
           SPEED = 'Speed in Runway (ft/sec)';
     input SUGAR @ ;
     do SUBJECT = 1 to 8;
        input SPEED @;
        output;
        end;
cards;
 8   1.4 2.0 3.2 1.4 2.3 4.0 5.0 4.7
16   3.2 6.8 5.0 2.5 6.1 4.8 4.6 4.2
32   6.2 3.1 3.2 4.0 4.5 6.4 4.5 4.1
64   5.8 6.6 6.5 5.9 5.9 3.0 5.9 5.6
;
*-- Do the ANOVA, obtain OUTSTAT= data set;
 
proc GLM data=SUCROSE outstat=STATS;
     class SUGAR;
     model SPEED = SUGAR / SS3;
     contrast 'Linear' SUGAR  -3  -1   1  3;
     contrast 'Quad  ' SUGAR   1  -1  -1  1;
     contrast 'Cubic ' SUGAR  -1   3  -3  1;
run;
%power(data=stats, alpha=.01, effect=_all_);
The following output is produced:
                    Power values and estimated sample sizes
                       IF population means = sample means
  OBS    _NAME_    _SOURCE_    _TYPE_      DF      SS         F         PROB
   1     SPEED      ERROR      ERROR       28    47.760      .         .     
   2     SPEED      SUGAR      SS3          3    28.680     5.6047    0.00386
   3     SPEED      Linear     CONTRAST     1    24.336    14.2673    0.00076
   4     SPEED      Quad       CONTRAST     1     0.500     0.2931    0.59250
   5     SPEED      Cubic      CONTRAST     1     3.844     2.2536    0.14450
            Power Calculation for effect(s) SUGAR LINEAR QUAD CUBIC                                         
         Type I  Total Root Mean                             Least     Power
          Error Sample   Square  Effect Power of Adjusted Significant   when
  Source  Rate   Size    Error    Size    Test     Power     Number    N=LSN
  SUGAR   0.01    35      1.31   0.9052  0.72851  0.55453      30     0.60884
  LINEAR  0.01    35      1.31   0.8339  0.83254  0.75935      22     0.53215
  QUAD    0.01    35      1.31   0.1195  0.02038  0.01000     802     0.50085
  CUBIC   0.01    35      1.31   0.3312  0.12164  0.05524     108     0.50338
See also
fpower Power computations for ANOVA designs
meanplot Plotting means for factorial designs
mpower Retrospective power analysis for multivariate GLMs
rpower Retrospective power analysis for univariate GLMs
stat2dat Convert summary dataset to raw data equivalent