Data Analysis and Statistical Graphics Using 'S'

The S programming language for statistics and graphics was developed by the Statistics Group at AT&T Bell Labs. Many feel that S is the language of choice for developing new statistical tools and for interactive data analysis. A commercial version, S-Plus (S+), is available on Unix workstations and on PCs under Windows. At York, S+ has been installed on the "nexus" Unix system and on other Unix machines within Mathematics and Statistics.

The purpose of this course is to show how to use S in a Unix environment. The course will have both lecture and hands-on components. Participants will receive access to an account for running S+. The four sessions of the course will cover approximately the following material:

I The Unix Environment: Logging in, mail, editing with vi, directory structure, basic file manipulation, and an introduction to the X Window environment.

II Basic Use of S: Data input, manipulating data, printing, history mechanism, S arrays and data frames, random number generation, arithmetic operators, functions for manipulating data structures (apply() and category()), and help facilities.

III Programming and Graphics in S: Writing S functions, one- and two-dimensional graphs, interactive graphs.

IV Introduction to the Use of Statistical Functions in S: Regression and regression diagnostics based on the linear-model function, lm().

An Introduction to S and S-Plus by Phil Spector (Duxbury Press, 1994, 286 pages, ISBN 0-534-19866-X, $35.95) is a recommended text; it is available from the York University Bookstore (under Math 000).