SAS Macro Programs for Statistical Graphics: BIPLOT

Version: 1.9 (17 Dec 2003)
Michael Friendly
York University

Updated 08/03/2018 17:12:28

BIPLOT macro (get biplot.sas)

The BIPLOT macro uses PROC IML to carry out the calculations for the biplot display described in "Section 8.7". The program produces

Usage

The original version of this macro required that the columns of the data table be stored as a set of variables in the input dataset. In this arrangement, use the VAR= argument to specify this list of variables and the ID= argument to specify an additional variable whose values are labels for the rows.

Assume a dataset of reaction times to 4 topics in 3 experimental tasks, in a SAS dataset like this:

     TASK     TOPIC1   TOPIC2   TOPIC3   TOPIC4
     Easy       2.43     3.12     3.68     4.04
     Medium     3.41     3.91     4.07     5.10
     Hard       4.21     4.65     5.87     5.69

For this arrangement, the macro would be invoked as follows:
   %biplot(var=topic1-topic4, id=task);
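
As a minimal sketch, the table-form dataset shown above could be created with a DATA step and then passed to the macro. The dataset name rt is an assumption made for illustration, and the macro must already be available (through %include or an autocall library).

```sas
/* Hypothetical dataset name RT; the values are those shown in the table above */
data rt;
   input task $ topic1-topic4;
datalines;
Easy    2.43  3.12  3.68  4.04
Medium  3.41  3.91  4.07  5.10
Hard    4.21  4.65  5.87  5.69
;
run;

/* Columns of the table are variables; TASK labels the rows */
%biplot(data=rt, var=topic1-topic4, id=task);
```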

The present version also allows the dataset to contain all response values in a single variable, with two (or more) additional variables that specify the row and column class variables, as is done with PROC GLM in the univariate (non-repeated measures) format. In this case, DO NOT specify an ID= variable. Instead, use the VAR= argument to specify the row and column class variables, and give the name of the response variable as RESPONSE=.

The same data in this format would have 12 observations, and look like:

     TASK    TOPIC     RT
     Easy      1      2.43
     Easy      2      3.12
     Easy      3      3.68
     ...
     Hard      4      5.69

For this arrangement, the macro would be invoked as follows:
   %biplot(var=topic task, response=RT);
In this arrangement, the order of the VAR= variables does not matter. The columns of the two-way table are determined by the variable which varies most rapidly in the input dataset (TOPIC, in the example).
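
One way to build the GLM-form dataset is sketched below, reading each task's four values from one input line and writing one observation per topic. The dataset name rtlong is hypothetical. Because the DO loop over TOPIC is innermost, TOPIC varies most rapidly and therefore defines the columns of the two-way table.

```sas
/* Hypothetical dataset name RTLONG: 3 tasks x 4 topics = 12 observations */
data rtlong;
   input task $ @;            /* read the row label and hold the input line */
   do topic = 1 to 4;
      input rt @;             /* read the next response value on the line   */
      output;                 /* one observation per task-topic cell        */
   end;
datalines;
Easy    2.43  3.12  3.68  4.04
Medium  3.41  3.91  4.07  5.10
Hard    4.21  4.65  5.87  5.69
;
run;

/* No ID= here: the class variables go in VAR=, the response in RESPONSE= */
%biplot(data=rtlong, var=topic task, response=rt);
```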

Parameters

DATA=_LAST_
Name of the input data set for the biplot.
VAR=_NUM_
Variables for biplot, when the data is in table form, or list of factor variables, when the data is in GLM form. The list of variables may use any of the SAS abbreviated forms for variable lists (e.g., X1-Xn).
ID=
Name of a character variable used to label the rows (observations) in the biplot display. (Only specify an ID= variable when the data is in table form.)
RESPONSE=
Name of response variable (GLM input form)
DIM=2
Number of biplot dimensions. (Only two-dimensional plots are produced, even if DIM>2.)
FACTYPE=SYM
Biplot factor type: GH, SYM, JK, or COV. FACTYPE=COV gives the GH scaling, with observation vectors multiplied by sqrt(N-1), and variable vectors divided by the same factor.
SCALE=1
Scale factor for variable vectors. The coordinates for the variables are multiplied by this value. Setting SCALE=0 causes the macro to compute the scale factor to equate the maximum distance from the origin of the variable and observation markers.
POWER=1
Power to which the data values are transformed (POWER=0 means log(y)).
OUT=BIPLOT
Output data set containing biplot coordinates.
ANNO=BIANNO
Output data set containing Annotate labels.
STD=MEAN
Specifies how to standardize the data matrix before the singular value decomposition is computed. If STD=NONE, only the grand mean is subtracted from each value in the data matrix. This option is typically used when row and column means are to be represented in the plot, as in the diagnosis of two-way tables ("Section 7.6.3"). If STD=MEAN, the mean of each column is subtracted. This is the default, and assumes that the variables are measured on commensurable scales. If STD=STD, the column means are subtracted and each column is standardized to unit variance.
COLORS=BLUE RED
Colors used for OBS and VARS.
SYMBOLS=NONE NONE
Symbols used for OBS and VARS. Because the points are usually labeled, symbols are often superfluous.
INTERP=NONE VEC
Interpolation option used for OBS and VARS. In addition to the standard interpolation options provided by the SYMBOL statement, the BIPLOT macro also understands the option VEC to mean a vector from the origin to the row or column point. [Default: INTERP=NONE VEC]
LINES=33 20
Line styles used for OBS and VARS interpolation options.
PLOTREQ=DIM2 * DIM1
Specifies the dimensions to be plotted.
GPLOT=YES
Produce a GPLOT plot? If GPLOT=YES, the two dimensions specified in PLOTREQ= are plotted.
PPLOT=NO
Produce printer plot? If PPLOT=YES, the two dimensions specified in PLOTREQ= are plotted.
HAXIS=
The name of an AXIS statement for the horizontal axis. If neither HAXIS= nor VAXIS= is specified, the program calls the EQUATE macro to produce AXIS statements in which the axes are equated. This creates the axis statements AXIS98 and AXIS99, whether or not a graph is produced. In this case, you should examine the values used for the INC=, XEXTRA=, and YEXTRA= parameters.
VAXIS=
The name of an AXIS statement for the vertical axis.
VTOH=2
The vertical to horizontal aspect ratio (height of one character divided by the width of one character) of the printer device, used to equate axes for a printer plot, when PPLOT=YES. [Default: VTOH=2]
INC=0.5 0.5
X, Y axis tick increments (for the EQUATE macro). Ignored if HAXIS= and VAXIS= are specified. [Default: INC=0.5 0.5]
XEXTRA=0 0
The number of extra X axis tick marks at left and right. Use to allow extra space for labels. [Default: XEXTRA=0 0]
YEXTRA=0 0
The number of extra Y axis tick marks at bottom and top. [Default: YEXTRA=0 0]
M0=0.5
Length of origin marker, in data units. If the axes have been properly equated, the lengths of the horizontal and vertical segments should be equal. [Default: M0=0.05]
DIMLAB=
Prefix for dimension labels [Default: DIMLAB=Dimension when DIM=2, otherwise, DIMLAB=Dim]
NAME=
Name of the graphics catalog entry [Default: NAME=biplot]
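
To illustrate how several of these parameters combine, the following sketch standardizes each column, requests the GH factorization, lets the macro choose the scale factor for the variable vectors, and names the output data sets explicitly. The dataset scores, the variables x1-x5 and name, and the output names bicoord and bilabels are all hypothetical.

```sas
%biplot(data=scores,      /* hypothetical input dataset                     */
        var=x1-x5,        /* hypothetical numeric variables (table form)    */
        id=name,          /* hypothetical character variable for row labels */
        std=STD,          /* center columns and scale to unit variance      */
        factype=GH,       /* GH factor type                                 */
        scale=0,          /* macro computes the variable-vector scale       */
        out=bicoord,      /* biplot coordinates are saved here              */
        anno=bilabels);   /* Annotate labels are saved here                 */
```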

The OUT= data set

The results from the analysis are saved in the OUT= data set. This data set contains two character variables (_TYPE_ and _NAME_) which identify the observations and numeric variables (DIM1, DIM2, ...) which give the coordinates of each point.

The value of the _TYPE_ variable is 'OBS' for the observations that contain the coordinates for the rows of the data set, and is 'VAR' for the observations that contain the coordinates for the columns. The _NAME_ variable contains the value of ID= variable for the row observations and the variable name for the column observations in the output data set.
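
As a brief sketch of how the OUT= data set might be inspected, the step below lists the coordinates of the column (variable) points only. It assumes the macro was run with the default OUT=BIPLOT and DIM=2.

```sas
/* Assumes the defaults OUT=BIPLOT and DIM=2 were used in the %biplot call */
proc print data=biplot;
   where _type_ = 'VAR';     /* use 'OBS' to list the row points instead */
   id _name_;
   var dim1 dim2;
run;
```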

GOPTIONS

The height and font used for point labels may be set using the GOPTIONS statement (HTEXT= and FTEXT=) before calling the macro.
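
For instance, a call might be preceded by a GOPTIONS statement such as the one sketched here. The height and font values are illustrative only, the availability of the swiss font depends on the SAS/GRAPH installation, and the dataset rt with variables topic1-topic4 and task is the hypothetical table-form dataset from the Usage section.

```sas
/* Illustrative values: larger label text in the SAS/GRAPH swiss font */
goptions htext=1.5 ftext=swiss;

/* Hypothetical table-form dataset RT, as in the Usage section */
%biplot(data=rt, var=topic1-topic4, id=task);
```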

Missing data

The program makes no provision for missing values on any of the variables to be analyzed.
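
One simple workaround, sketched here under the assumption that listwise deletion is acceptable, is to drop incomplete observations in a DATA step before calling the macro. The dataset scores and the variables x1-x5 and name are hypothetical.

```sas
/* Hypothetical dataset SCORES with analysis variables X1-X5 */
data complete;
   set scores;
   if nmiss(of x1-x5) = 0;   /* keep only rows with no missing values */
run;

%biplot(data=complete, var=x1-x5, id=name);
```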

Example

%include data(AUTO) ;
*include macros(biplot);           /* or store in autocall library */
title h=1.6 'Biplot of Automobiles data';
data auto;
   set auto;
   if rep77 ^= . and rep78 ^= .;      /* keep rows with rep77 and rep78 nonmissing */
   model = origin || scan(model,1);   /* label: ORIGIN plus first word of MODEL   */
run;
goptions htext=1.5;                   /* set label text height */
%biplot( data= auto,
         var = gratio  turn  rep77  rep78  price    mpg
               hroom rseat  trunk weight length displa,
         id=model,  scale=.8,
         factype=SYM, std = STD );