SAS Macro Programs: meanplot
Version: 1.5 (17 Mar 2003)
Michael Friendly
York University
Plot means for factorial designs
The meanplot macro produces 1-way, 2-way, or 3-way plots of means for
a factorial design with any number of factor variables. One-way plots
show main effect means (and optional standard error bars) for each level
of the factor variable. Two-way plots show the means for two factors
as a set of curves for one variable, plotted against the other variable.
Three-way means are displayed as a collection of two-way panels, optionally
including one additional panel showing the means collapsed over the panel
variable. The macro can draw both line-printer (PROC PLOT) plots and
high-resolution (PROC GPLOT) plots; the PPLOT= and GPLOT= arguments
control which of these are produced.
For a 3-factor design, if the factors are A, B, C, the macro, by default,
uses variable A as the horizontal (XVAR=) variable in the plots, plots
separate curves for each value of variable B, and produces separate panels
for each level of variable C, as well as one additional panel for the average
of means over variable C. You can obtain different views of the means
by reordering the CLASS= variables, or assigning variables to particular
roles with the XVAR=, CVAR=, and PANELS= arguments.
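As an illustration of the two approaches, the calls below sketch a default view and two ways of requesting an alternative view. They assume the recall data set and its variables score, Group, Gender, and Order from the Usage section; only argument names documented on this page are used.

```sas
/* Default roles: Group is the horizontal variable, Gender defines
   the curves, and Order defines the panels */
%meanplot(data=recall, response=score, class=Group Gender Order);

/* Same factors, different view, by assigning roles explicitly:
   Order on the horizontal axis, one curve per Group,
   one panel per Gender */
%meanplot(data=recall, response=score, class=Group Gender Order,
          xvar=Order, cvar=Group, panels=Gender);

/* Reordering CLASS= should give the same view as the call above */
%meanplot(data=recall, response=score, class=Order Group Gender);
```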
Version 1.3 provides an ADJUST= parameter to give true confidence-interval error bars, possibly adjusted for multiple comparisons using the Bonferroni or Tukey studentized range procedures.
The error bar lengths are calculated as one-half the width of the
confidence interval for the difference between two means in a given
effect. Therefore, the error bars for two means in a given effect
will overlap iff those means do not differ at the specified
ALPHA= value.
When sample sizes are unequal for any effect, the macro
uses the geometric mean of the sample sizes for that effect
as a common value in the calculation of the adjusted error bar lengths.
The Tukey adjustment uses the PROBMC function, and works only in SAS
Version 6.09 or later.
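The calculation described above can be sketched in a DATA step. This is not the macro's own code. The values of MSE, DFE, the number of means, and the sample sizes are made up for illustration, and "one-half the width" is read here as one-half of the critical difference between two means, which is the reading under which two bars overlap exactly when the means do not differ.

```sas
data _null_;
   alpha = 0.05;                 /* ALPHA= error rate                    */
   mse   = 1.75;                 /* error mean square (illustrative)     */
   dfe   = 48;                   /* error degrees of freedom (illustr.)  */
   g     = 3;                    /* number of means in the effect        */
   ncomp = g * (g - 1) / 2;      /* number of pairwise comparisons       */

   /* unequal sample sizes are replaced by their geometric mean */
   n = geomean(18, 20, 22);
   se_diff = sqrt(2 * mse / n);  /* std. error of a difference of means  */

   /* critical values for the three kinds of ADJUST= method */
   c_lsd   = tinv(1 - alpha/2, dfe);                    /* T or LSD     */
   c_bon   = tinv(1 - alpha/(2*ncomp), dfe);            /* BON          */
   c_tukey = probmc('RANGE', ., 1 - alpha, dfe, g)
             / sqrt(2);                                 /* TUKEY or HSD */

   /* error bar length = one-half of the critical difference */
   bar_lsd   = c_lsd   * se_diff / 2;
   bar_bon   = c_bon   * se_diff / 2;
   bar_tukey = c_tukey * se_diff / 2;

   put bar_lsd= bar_bon= bar_tukey=;
run;
```

The unadjusted (LSD) bars are the shortest of the three, since both adjustments enlarge the critical value to allow for all pairwise comparisons.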
Method
The meanplot macro uses PROC SUMMARY to calculate
the means and standard errors for all 1-way, 2-way, and
3-way margins of the response variable in the data (up to the number
of CLASS= variables specified).
If an ADJUST= method is specified, the macro uses PROC GLM
to determine the MSE and DFE in the full (up to) 3-way model.
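The two steps can be sketched roughly as follows. This is a minimal illustration of the approach, not the macro's actual code; the data set names means and stats are hypothetical, and the input data set and variables are borrowed from the Usage section.

```sas
/* Means, standard errors, and cell sizes for every margin.
   Without the NWAY option, PROC SUMMARY writes one row per
   combination of levels for each subset of the CLASS variables,
   identified by _TYPE_ (the grand mean row has _TYPE_=0). */
proc summary data=recall;
   class Group Gender Order;
   var score;
   output out=means mean=mean stderr=stderr n=n;
run;

/* Full 3-way model; OUTSTAT= saves the sums of squares and df */
proc glm data=recall outstat=stats noprint;
   class Group Gender Order;
   model score = Group|Gender|Order;
run; quit;

/* Pick out the error line: DFE is DF, MSE is SS/DF */
data _null_;
   set stats;
   where _SOURCE_ = 'ERROR';
   mse = ss / df;
   put 'DFE=' df 'MSE=' mse;
run;
```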
Usage
MEANPLOT is a macro program. To plot the means for a factorial design, specify the name of the
RESPONSE= variable, and the CLASS= factor variables. All other
options have default values.
The arguments may be listed within parentheses in any order, separated
by commas. For example:
%meanplot(data=recall, response=score, class=Group Gender Order);
To combine the panels in a single plot, use the %panels macro:
%panels(rows=1, cols=4);
If you are only interested in the combined plot, you may suppress the
display of the individual panels using GOPTIONS NODISPLAY
(or equivalent, depending on your output device)
before calling meanplot and GOPTIONS DISPLAY
before calling panels.
goptions nodisplay;
%meanplot(data=recall, response=score, class=Group Gender Order);
goptions display;
%panels(rows=1, cols=4);
Arguments
- DATA=
- The name of the SAS dataset to be plotted. If DATA= is not
specified, the most recently created data set is used.
- RESPONSE=
- Name of the response (dependent) variable in the input data
set. Must be numeric.
- CLASS=
- The names of 1-3 factor variables, separated by spaces.
- FREQ=
- An optional frequency variable. Each observation is treated as if it
were repeated as many times as the value of the FREQ=
variable. Observations with a FREQ= value less than 1
are omitted from the calculation. Fractional values
are not allowed.
You must specify the FREQ= variable when input comes from summary
data (means, standard deviations) processed by the
%stat2dat macro.
- XVAR=
- Name of the factor variable to be used as the horizontal
variable in plots. If XVAR= is not specified, the first
CLASS= variable is used.
- CVAR=
- Name of the factor variable to be used as the curve
variable in plots. If CVAR= is not specified, the second
CLASS= variable is used in 2-way and 3-way plots.
- PANELS=
- Name of the factor variable to be used to define multiple
panels in plots. If PANELS= is not specified, the third
CLASS= variable is used in 3-way plots.
- PFMT=
- Format for the PANELS= variable. The default is BEST. if
the panels variable is numeric, and $16. otherwise.
- CMEAN=
- Specifies whether an additional curve, representing the
average over the levels of the CVAR= variable, is added
to each panel.
- Z=
- Standard error multiple for confidence intervals or error
bars drawn in the GPLOT versions of the plots. The default,
Z=1, shows one standard error for each mean plotted. Use
Z=1.96 for approximate individual 95% CIs, or Z=0 for plots without error
bars. The Z= parameter is ignored if you specify an ADJUST=
method.
- ADJUST=
- Specifies whether to calculate confidence-interval
error bars, and whether these are
adjusted for multiple comparisons.
In this case, error bars will overlap iff a given pair of means
do not differ significantly.
ADJUST=T or LSD provides standard t-value error bars, unadjusted
for multiple comparisons; ADJUST=BON provides Bonferroni-adjusted
error bars for all pairwise comparisons; ADJUST=TUKEY or HSD
provides Tukey-test (studentized range) adjustments for all pairwise comparisons.
- ALPHA=.05
- Specifies the error rate for comparisons made with the
ADJUST= option.
- PPLOT=
- Specifies whether line printer plots are to be done. Default: NO
- GPLOT=
- Specifies whether high-res plots are to be done. Default: YES
- PRINT=
- Specifies whether to print the means. Default: YES
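A few calls illustrating how the error bar and output arguments combine are sketched below. The data set and variable names are those of the Usage section, and the YES/NO values follow the defaults listed above.

```sas
/* Error bars of +/- 1.96 standard errors
   (approximate individual 95% confidence intervals) */
%meanplot(data=recall, response=score, class=Group Gender Order,
          z=1.96);

/* Tukey-adjusted confidence-interval error bars at the 1% level;
   any Z= value would be ignored in this call */
%meanplot(data=recall, response=score, class=Group Gender Order,
          adjust=TUKEY, alpha=.01);

/* Line-printer plots and printed means only, no high-res plots */
%meanplot(data=recall, response=score, class=Group Gender Order,
          pplot=YES, gplot=NO);
```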
Additional options for high-res PROC GPLOT plots
- ANNO=
- Name of an additional input Annotate data set. If specified,
this is appended to the Annotate data set used to draw error bars in
the plot.
- SYMBOLS=
- List of SAS/GRAPH symbols for the levels of the CVAR= variable.
There should be as many symbols as there are distinct values of
the CVAR= variable.
- COLORS=
- List of SAS/GRAPH colors (for the GPLOT version) for the levels
of the CVAR= variable. There should be as many colors as there
are distinct values of the CVAR= variable.
- LINES=
- List of SAS/GRAPH line styles (for the GPLOT version) for the
levels of the CVAR= variable. There should be as many lines as
there are distinct values of the CVAR= variable.
- HAXIS=
- Axis statement for custom horizontal axis, e.g., HAXIS=AXIS2
- VAXIS=
- Axis statement for custom response axis, e.g., VAXIS=AXIS1. If
no axis statement is defined, the program uses
AXIS1 LABEL=(a=90).
Error bar calculations may cause some annotations to fall outside the
vertical plot range, producing the message:
DATA SYSTEM REQUESTED, BUT VALUE IS NOT IN GRAPH 'Y'
To correct this, specify the ORDER= option in an AXIS statement
to extend the range of the Y axis suitably, for example:
AXIS1 LABEL=(a=90) ORDER=(0 to 30 by 5).
- LEGEND=
- Legend statement for custom CVAR legend, e.g., LEGEND=LEGEND1.
If no legend is specified, the program uses
LEGEND1 POSITION=(BOTTOM CENTER INSIDE) OFFSET=(0,1)
MODE=SHARE FRAME;
- PLOC=
- Specifies the location for panel labels identifying the
level of the PANELS= variable. Specify two numbers giving the screen
coordinates in percent of the left edge of the panel label.
The default is PLOC=5 95.
- GOUT=
- Name of output graphics catalog.
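The options above might be combined as in the following sketch. The particular symbols, colors, line styles, and axis range are arbitrary choices, and the lists are written as space-separated values on the assumption that they follow the same convention as CLASS= and PLOC=. Two values are given in each list on the assumption that the curve variable (Gender) has two levels.

```sas
/* Response axis: rotated label, with the range widened so that
   error bars stay inside the plot */
axis1 label=(a=90) order=(0 to 30 by 5);

/* Horizontal axis: offsets keep the end points off the frame */
axis2 offset=(4,4);

/* Legend for the curve variable, drawn inside the plot frame */
legend1 position=(bottom center inside) offset=(0,1) mode=share frame;

%meanplot(data=recall, response=score, class=Group Gender Order,
          vaxis=axis1, haxis=axis2, legend=legend1,
          symbols=circle square,
          colors=black red,
          lines=1 20,
          ploc=5 95);
```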
Features
- The program does not specify the heights or fonts for any labels
in the plots. You should use the GOPTIONS HTEXT= FTEXT= options
to define these.
- Variables are labelled in the plots using their variable label
if a label has been defined in the input data set; otherwise,
the variable name is used as the variable label.
- Plots for the mean of the curve variable are labelled according to
the convention of PROC SUMMARY, where a missing value (. for numeric
and ' ' for character variables) represents the fact that that
variable has been averaged over. Thus, a panel which represents
the means of factors A and C averaged over factor B will be labelled
'B = .' or 'B ='.
- When your goal is to produce a combined plot showing the panels together,
you should probably use an empty TITLE statement to suppress the
generation of a title in each panel.
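For instance, the first and last points above might be handled as follows before calling the macro. The height and font values are arbitrary, and the data set is the one assumed in the Usage section.

```sas
/* Label height and font are not set by the macro */
goptions htext=1.8 ftext=swiss;

/* An empty TITLE statement cancels all titles, so the individual
   panels are drawn without one */
title;

%meanplot(data=recall, response=score, class=Group Gender Order);
```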
Example
The following example reads means for a 3-way design and uses the
stat2dat macro to create an equivalent raw data set having the
same means and within-group standard deviations.
data learning;
   do grade = 5, 12;
      do words = 'low ', 'high';
         do feedback = 'Control', 'Pos', 'Neg';
            input mean @@;
            output;
         end;
      end;
   end;
lines;
8.8 8.0 7.6
8.0 4.4 3.8
9.0 8.4 8.0
7.8 7.4 7.2
;
%stat2dat(data=learning, class=grade words feedback,
out=raw, mean=mean, n=5, MSE=1.75, depvar=recall);
The means and standard errors are plotted by the
call to meanplot below, with Grade as the panel variable.
Three plots are produced: one for the average response
over grade [Grade (mean)], and one for each grade separately.
The three panels are combined using the %panels macro.
%include goptions;
goptions htext=1.8 htitle=2.3;
%meanplot(data=raw,
freq=freq,
response=recall, class=Feedback Words Grade, cmean=NO, ploc=45 95);
%panels(rows=1, cols=3, equate=Y);
See also
boxplot Box-and-whisker plots
catplot
panels Display a set of plots in rectangular panels
stat2dat Convert summary dataset to raw data equivalent