outlier - Robust multivariate outlier detection

SAS Macro Programs: outlier

Version: 1.5-2 (01 Aug 2008)
Michael Friendly
York University



OUTLIER macro (source file: outlier.sas)

The OUTLIER macro calculates robust squared Mahalanobis distances (DSQ) for each observation in a data set. The results are robust in that observations flagged as potential outliers are given zero weight, so they do not contribute to the distance of any other observation. The distances are meant to be plotted against the corresponding chi² quantiles: for a multivariate normal sample, the points will lie on a straight line of unit slope, and outliers will have squared distances well above the line. A high-resolution plot may be constructed from the output data set; see the examples in "Section 9.3".
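
For reference, the quantities described above can be written out as follows. This is only a sketch based on the standard definition of the squared Mahalanobis distance. The weighted estimates \bar{x}_w and S_w, the number of variables p, and the plotting position (i - 1/2)/n are notation and assumptions introduced here; they are not taken from the macro source.

```latex
% Squared Mahalanobis distance of observation x_i, with the mean vector and
% covariance matrix estimated only from observations that have nonzero weight
\[
  D_i^2 = (x_i - \bar{x}_w)^{\top} S_w^{-1} (x_i - \bar{x}_w)
\]

% For a p-variate normal sample, D^2 is approximately chi-square with p df
\[
  D_i^2 \sim \chi^2_p \quad \text{(approximately)}
\]

% Expected quantile paired with the i-th smallest distance D^2_{(i)}
% (the plotting position (i - 1/2)/n is an assumed convention)
\[
  q_i = F^{-1}_{\chi^2_p}\!\left( \frac{i - 1/2}{n} \right),
  \qquad i = 1, \dots, n
\]
```

If the data are multivariate normal, plotting the ordered distances D^2_{(i)} against q_i should give points close to the line of unit slope through the origin.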

The macro makes one or more passes through the data. Each pass assigns a weight of 0 to observations whose DSQ value has Prob(chi²) < PVALUE, so that they do not influence the distances computed on the next pass. The number of passes should be determined empirically so that no new observations are trimmed on the last pass.
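
The trimming rule for a single pass can be illustrated with a short DATA step. This is a minimal sketch, not the macro's own code: the data set DISTANCES, its DSQ values, and the macro variables NVAR and PVALUE are hypothetical, and it assumes that the chi² degrees of freedom equal the number of variables analyzed.

```sas
/* Hypothetical squared distances for five observations (made-up values) */
data distances;
   input case dsq;
   datalines;
1 1.2
2 3.4
3 7.0
4 9.5
5 15.0
;
run;

%let nvar   = 3;     /* assumed number of variables in the analysis */
%let pvalue = .05;   /* same value as the PVALUE= default           */

/* One trimming step: upper-tail chi-square probability, then a 0/1 weight */
data trimmed;
   set distances;
   prob     = 1 - probchi(dsq, &nvar);   /* Prob(chi-square > DSQ)      */
   _weight_ = (prob >= &pvalue);         /* 0 marks a trimmed candidate */
run;

proc print data=trimmed;
run;
```

With 3 degrees of freedom the .05 critical value of chi² is about 7.81, so in this made-up example cases 4 and 5 receive _WEIGHT_ = 0 and the other three keep a weight of 1.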

Parameters

DATA=_LAST_
    Name of the data set to analyze
VAR=_NUMERIC_
    List of input variables
ID=
    Name of an optional ID variable to identify observations
OUT=CHIPLOT
    Name of the output data set for plotting. The robust squared distances are named DSQ. The corresponding theoretical quantiles are named EXPECTED. The variable _WEIGHT_ has the value 0 for observations identified as possible outliers.
PVALUE=.05
    Probability value of chi² statistic used to trim observations.
PASSES=2
    Number of passes of the iterative trimming procedure.
PRINT=YES
    Print the OUT= data set?
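
A call using the parameters above might look like the following sketch. The %INCLUDE path is an assumption about where outlier.sas has been saved. The data set CLASS is copied from SASHELP.CLASS, a sample data set shipped with SAS, purely for illustration. The PROC SGPLOT step is one possible way to plot the output, not the macro's own plotting code, and it requires a SAS release that provides PROC SGPLOT.

```sas
/* Assumed location of the macro source; adjust the path as needed */
%include 'outlier.sas';

/* Illustrative input: a WORK copy of the SASHELP.CLASS sample data */
data class;
   set sashelp.class;
run;

/* Two trimming passes at the default probability value */
%outlier(data=class,
         var=age height weight,
         id=name,
         out=chiplot,
         pvalue=.05,
         passes=2,
         print=YES);

/* Plot the robust squared distances against their theoretical quantiles,
   with a reference line of unit slope through the origin */
proc sgplot data=chiplot;
   scatter x=expected y=dsq;
   lineparm x=0 y=0 slope=1;
run;
```

To check whether PASSES= is large enough, the call can be repeated with a larger value and the observations having _WEIGHT_ = 0 in the two output data sets compared.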