Exploratory Data Analysis (EDA) refers to a collection of
techniques, often informal and graphical, for examining
data closely. This short course will survey a number of
the more useful techniques, with an emphasis on:
- Data displays which reveal aspects of data not easily
captured in numerical summaries or tabular displays.
- Diagnostic displays which help the viewer decide whether the
assumptions of an analysis are met.
The following topics will be discussed:
- Examining distributions: stem-and-leaf plots, boxplots,
probability plots, data transformations.
- Examining relationships: enhanced scatterplots, smoothing
scatterplots.
- Regression diagnostics: residual plots, detecting influential
observations, partial regression plots.
- Multivariate data displays: stars and profiles, biplots,
detecting multivariate outliers.
Extensive examples, most using the SAS System, are presented in the course
notes.
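
To give a flavor of the first topic, a stem-and-leaf display can be sketched in a few lines. The course notes use the SAS System; the Python below is only an assumed stand-in, not taken from the notes. The function name `stem_and_leaf` and the sample values are hypothetical, and the sketch handles only non-negative integers, with tens as stems and units as leaves.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display.

    Assumes a non-empty collection of non-negative integers.
    Stems are value // 10; leaves are the units digits.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)

    lines = []
    # Walk every stem from smallest to largest, including empty ones,
    # so that gaps in the data remain visible in the display.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}")
    return lines


# Made-up data, for illustration only.
sample = [12, 15, 21, 23, 23, 28, 31, 34, 47, 62]
for line in stem_and_leaf(sample):
    print(line)
```

Unlike a histogram, the display keeps every data value readable, and the empty stem at 5 makes the gap before the largest value easy to spot.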