SAS Macro Programs: mvinfluence

Version: 1.0-1 (19 Jan 2012 16:02:01)
Michael Friendly
York University


The mvinfluence macro (download: mvinfluence.sas)

Influence measures for multivariate regression

The MVINFLUENCE macro calculates and plots measures of influence in a multivariate regression model. These measures, described by Barrett & Ling (1992), generalize the usual influence measures for univariate regression. This is an initial, experimental implementation designed to explore these methods.

Three types of plots are provided, selected by the PLOTS= argument:

COOKD:
scatterplot of generalized Cook's D against generalized leverage
STRES:
bubble plot of generalized squared studentized residual against generalized leverage using Cook's D as bubble size
LR:
bubble plot of log residual factor against log leverage factor using Cook's D as bubble size. In this plot, diagonal lines with slope -1 are contours of equal influence.
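
The equal-influence property of the LR plot can be illustrated with the familiar univariate case. This is only a sketch: the multivariate leverage and residual factors are defined in Barrett & Ling (1992) and are not reproduced here, and the symbols below ($h_i$ for leverage, $r_i$ for the internally studentized residual, $p$ for the number of regression parameters) are notation assumed for this illustration. In univariate regression, Cook's D is the product of a residual factor and a leverage factor, so its logarithm is a sum:

```latex
\begin{align*}
D_i &= \frac{r_i^2}{p} \cdot \frac{h_i}{1 - h_i} \\
\log D_i &= \log\!\left(\frac{h_i}{1 - h_i}\right) + \log\!\left(r_i^2\right) - \log p \\
\log\!\left(r_i^2\right) &= \left(\log D_i + \log p\right) - \log\!\left(\frac{h_i}{1 - h_i}\right)
\end{align*}
```

For a fixed value of $D_i$, the last line is a straight line with slope -1 in the plane of log leverage factor (horizontal) against log residual factor (vertical). Constant multipliers in either factor only shift these lines; they do not change the slope.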

Usage

The MVINFLUENCE macro is defined with keyword parameters. The arguments may be listed within parentheses in any order, separated by commas. For example:

  %mvinfluence(data=rohwer, x=n s ns na ss, y=SAT PPVT Raven)
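
Because the parameters are keywords, the same call can be written in a different order and extended with optional arguments. The following is a minimal sketch using only parameters documented below; it assumes that multiple plot names in PLOTS= are separated by spaces, which this page does not state explicitly.

```sas
/* Same model as above, with the keyword arguments in a different order. */
/* PLOTS= requests two of the three plot types (space-separated list is  */
/* an assumption); LABEL=ALL labels every observation.                   */
%mvinfluence(y=SAT PPVT Raven,
             x=n s ns na ss,
             data=rohwer,
             plots=STRES LR,
             label=ALL);
```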

Parameters

DATA=
The name of the input data set [Default: DATA=_LAST_]
Y=
Names of response variables
X=
Names of predictor variables
ID=
The name of an observation ID variable
BUBBLE=
Quantity to which bubble size is proportional. COOKD is the only choice in this implementation [Default: BUBBLE=COOKD]
OUT=
The name of the output data set [Default: OUT=COOKD]
PRINT=
Print the OUT= data set? Not yet implemented. [Default: PRINT=NONE]
PLOTS=
Which plots to produce? NONE, ALL, or one or more of COOK, STRES, LR [Default: PLOTS=ALL]
LABEL=
Points to label: ALL, NONE, or INFL [Default: LABEL=INFL]
INFL=
Criterion for flagging an observation as influential [Default: INFL=%STR(((HAT>&HCRIT)|COOKD>.5))]
LSIZE=
Obs label size. The height of other text is controlled by the HTEXT= goption [Default: LSIZE=1.6]
LCOLOR=
Obs label color [Default: LCOLOR=BLACK]
LPOS=
Obs label position [Default: LPOS=5]
LFONT=
Obs label font
BSIZE=
Bubble size scale factor [Default: BSIZE=10]
BSCALE=
Bubble size proportional to AREA or RADIUS [Default: BSCALE=AREA]
BCOLOR=
Bubble color [Default: BCOLOR=RED]
BFILL=
Bubble fill? SOLID or GRADIENT
REFCOL=
Color of reference lines [Default: REFCOL=BLACK]
REFLIN=
Line style for reference lines; 0->NONE [Default: REFLIN=33]
NAME=
The name of the graph in the graphic catalog [Default: NAME=MVINFL]
GOUT=
The name of the graphics catalog
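
Several of these parameters can be combined in one call. The sketch below is illustrative rather than authoritative: INFL_OUT is a hypothetical data set name, the threshold 0.25 is an arbitrary value, and the INFL= expression assumes that COOKD can be referenced by name in the same way as in the default criterion. Since PRINT= is not yet implemented, the OUT= data set is listed with PROC PRINT instead.

```sas
/* Sketch only: INFL_OUT and the 0.25 cutoff are illustrative choices. */
%mvinfluence(data=rohwer, x=n s ns na ss, y=SAT PPVT Raven,
             out=infl_out,
             infl=%str(COOKD > 0.25),
             bcolor=blue,
             bscale=radius,
             plots=LR);

/* PRINT= is not yet implemented, so list the saved results directly. */
proc print data=infl_out;
run;
```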

Dependencies

%gskip - Device-independent macro for multiple plots
%labels - Create an Annotate dataset to label observations

References

* Barrett, B. E. & Ling, R. F. (1992). General classes of influence measures for multivariate regression. Journal of the American Statistical Association, 87(417), 184-191.

* Barrett, B. E. (2003). Understanding influence in multivariate regression. Communications in Statistics - Theory and Methods, 32(3), 667-680.

* Some code from Timm & Mieczkowski, Univariate and Multivariate General Linear Models, http://ftp.sas.com/samples/A55809, Program 5_6.sas, was used in the initial implementation.

Examples

  %include data(rohwer);               *-- load the rohwer data set;
  data rohwer;
     set rohwer;
     where group=2;                    *-- keep only group 2;
     drop subjno;
     case = _n_;                       *-- observation number, used as the ID;
  run;
  %include macros(mvinfluence);        *-- or include in an autocall library;
  %mvinfluence(data=rohwer, x=n s ns na ss, y=SAT PPVT Raven, id=case,
               bfill=gradient, plots=ALL);

With plots=ALL, this call produces all three types of plots described above.

See also

gskip Device-independent macro for multiple plots
inflglim Influence plots for generalized linear models
inflogis Influence plot for logistic regression models
inflplot Influence plots for regression models
labels Create an Annotate dataset to label observations