jitter: Add noise to numeric variables to prevent overplotting

SAS Macro Programs: jitter

Version: 1.1 (12 May 2006)
Michael Friendly
York University

The jitter macro (download jitter.sas)

The JITTER macro adds a small amount of noise to numeric variables, usually to avoid overplotting for discrete data.

Method

Usage

The JITTER macro is defined with keyword parameters. The arguments may be listed within parentheses in any order, separated by commas. For example:

  %jitter(var=X1-X5);
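
Because the parameters are keywords, the order in which they are written does not matter. As a minimal sketch, the two calls below request the same operation; the dataset names survey and survey_j are hypothetical and stand in for your own data.

```sas
*-- Two equivalent calls: keyword parameters may appear in any order;
*-- (survey and survey_j are hypothetical dataset names);
%jitter(data=survey, var=X1-X5, out=survey_j);
%jitter(out=survey_j, var=X1-X5, data=survey);
```

With the default SEED=0, the two calls would still produce different random noise on each run.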

Parameters

DATA=
Input data set. If not specified, the most recently created data set is used. [Default: DATA=_LAST_]
OUT=
Output data set (can be the same as the input). If not specified, the new data set is named according to the DATAn convention. [Default: OUT=_DATA_]
VAR=
Name(s) of the input variable(s) to be jittered. You can use any of the standard SAS abbreviations for variable lists.
NEW=
A list of names for the jittered result variable(s). These can be the same as the VAR= variable(s), but then the original values are lost. [Default: NEW=&VAR]
UNIT=
Unit of each VAR= variable: the smallest distance between successive values. For multiple input variables, you can specify a blank-separated list of numeric values. [Default: UNIT=1]
MULT=
Multiplier for spread of jitter. The random quantity added to the variable is distributed uniformly, U(-.25,.25) * &MULT * &UNIT. [Default: MULT=1]
SEED=
Seed for the random number generator. Setting this to a non-zero value gives reproducible results. [Default: SEED=0]
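
To make the MULT= and UNIT= arithmetic concrete, the following DATA step jitters a single variable by hand. It is only a sketch of the formula stated under MULT=, not the macro's actual source. It assumes a data set test with a numeric variable x1 (like the one built in the Example), and the macro variables unit, mult and seed and the output name jittered are illustrative choices.

```sas
*-- Sketch only: jitter one variable by hand using the MULT= formula;
%let unit = 1;       *-- assumed spacing between successive values;
%let mult = 1;       *-- spread multiplier;
%let seed = 12345;   *-- non-zero seed gives a repeatable stream;

data jittered;
   set test;
   *-- ranuni is U(0,1), so (ranuni - 0.5)/2 is U(-.25,.25);
   y1 = x1 + ((ranuni(&seed) - 0.5) / 2) * &mult * &unit;
run;
```

With MULT=1 and UNIT=1, each jittered value stays within 0.25 of the original, so points that share a value spread out without overlapping the neighboring value. Doubling MULT widens the band to 0.5 on either side. A call along the lines of %jitter(data=test, var=x1 x2, new=y1 y2, unit=1 1, mult=2, seed=12345); would request that wider spread for both variables with a fixed seed.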

Example

%include macros(jitter);        *-- or include in an autocall library;

*-- Generate some discrete variables;
data test;
   do i=1 to 10;
      x1 = int(10*uniform(0));
      x2 = int(10*uniform(0));
      output;
   end;
run;

*-- Jitter x1 and x2 into the new variables y1 and y2;
%jitter(data=test, var=x1 x2, new=y1 y2);

*-- Compare the original and jittered values;
proc print;
   var x1 y1 x2 y2;
run;

*-- Plot the jittered values;
proc gplot;
   plot y2 * y1;
run;

Printed output:
                     Obs    x1       y1       x2       y2

                       1     8     8.19519     2    1.86667
                       2     0    -0.14863     1    0.78784
                       3     9     8.82106     9    8.82783
                       4     3     3.22831     8    8.20384
                       5     0    -0.10426     9    8.92621
                       6     0     0.18905     8    7.97462
                       7     1     0.79719     5    5.21546
                       8     2     2.06479     9    8.79396
                       9     8     8.19856     9    9.08425
                      10     7     6.81149     3    3.02668

See also

ellipses Plot bivariate data ellipses
lowess Locally weighted scatterplot smoother
sunplot Sunflower plot for X-Y data