SAS Macro Programs for Statistical Graphics: OUTLIER
Version: 1.5 (02 Oct 2003)
Michael Friendly
York University
The OUTLIER macro calculates robust Mahalanobis distances for each
observation in a data set. The results are robust in that
potential outliers do not contribute to the distance of any other
observations. When the squared distances (DSQ) are plotted against
the corresponding chi-square quantiles (EXPECTED), the points for a
multivariate normal sample will lie close to a straight line of unit
slope; outliers will have squared distances well above the line.
A high-resolution plot may be constructed from the
output data set; see the examples in "Section 9.3".
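
As a rough illustration of such a plot, the sketch below overlays the observed points on a unit-slope reference line. It is not taken from the referenced examples. It assumes that the macro has already been run with OUT=CHIPLOT, that SAS/GRAPH is licensed, and that the symbol settings and axis labels are free choices.

```sas
/* Hypothetical follow-up step: assumes %OUTLIER has already created
   CHIPLOT, which contains the variables DSQ and EXPECTED. */
symbol1 value=dot  interpol=none color=black;   /* observed points         */
symbol2 value=none interpol=join color=red;     /* reference line, slope 1 */

proc gplot data=chiplot;
   plot dsq      * expected = 1
        expected * expected = 2 / overlay frame;
   label dsq      = 'Robust squared distance'
         expected = 'Chi-square quantile';
run;
quit;
```

Points that sit far above the red line are the candidates for outliers.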
The macro makes one or more passes through the data. Each pass
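assigns 0 weight to observations whose DSQ value has
Prob(chi²) < PVALUE, that is, observations whose squared distance is
improbably large under the chi-square reference distribution. The number of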
passes should be determined empirically so that no new observations
are trimmed on the last step.
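
The trimming rule for a single pass might be sketched as the DATA step below. This is an illustration only, not the macro's source code. It assumes that the probability is the upper-tail chi-square probability with degrees of freedom equal to the number of analysis variables. The data set DIST and the macro variable NVAR are hypothetical names.

```sas
/* Illustrative only: not taken from the macro source. Assumes a
   hypothetical data set DIST that already holds DSQ for each observation. */
%let nvar   = 3;     /* hypothetical: number of variables listed in VAR= */
%let pvalue = .05;   /* same value as the PVALUE= default                */

data trimmed;
   set dist;
   prob     = 1 - probchi(dsq, &nvar);   /* upper-tail chi-square probability  */
   _weight_ = (prob >= &pvalue);         /* 0 = trimmed on this pass, 1 = kept */
run;
```

On the next pass, observations with _WEIGHT_ = 0 would be left out when the distances are recomputed, so they cannot distort the distances of the remaining observations.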
Parameters
- DATA=_LAST_
  Name of the data set to analyze.
- VAR=_NUMERIC_
  List of input variables.
- ID=
  Name of an optional ID variable to identify observations.
- OUT=CHIPLOT
  Name of the output data set for plotting.
  The robust squared distances are named DSQ.
  The corresponding theoretical quantiles are named EXPECTED.
  The variable _WEIGHT_ has the value 0 for observations identified
  as possible outliers.
- PVALUE=.05
  Probability value of the chi² statistic used to trim observations.
- PASSES=2
  Number of passes of the iterative trimming procedure.
- PRINT=YES
  Print the OUT= data set?
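
A call using these parameters might look like the sketch below. The data set MYDATA, the variables X1-X3, and the ID variable SUBJECT are hypothetical. The sketch also assumes that the OUTLIER macro is already available in the session, for example through an autocall library.

```sas
/* Hypothetical input: data set MYDATA with numeric variables X1-X3
   and an ID variable SUBJECT. */
%outlier(data=mydata,
         var=x1 x2 x3,
         id=subject,
         out=chiplot,
         pvalue=.05,
         passes=2,
         print=NO);

/* Count the observations flagged as possible outliers (_WEIGHT_ = 0). */
proc freq data=chiplot;
   tables _weight_;
run;
```

One way to choose PASSES= empirically is to rerun the call with PASSES=3 and compare the counts: if the number of observations with _WEIGHT_ = 0 does not change, two passes were enough.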