SAS Macro Programs: jitter
Version: 1.1 (12 May 2006)
Michael Friendly
York University
Add noise to numeric variables to prevent overplotting
The JITTER macro adds a small amount of noise to numeric variables, usually
to avoid overplotting for discrete data.
Method
Usage
The JITTER macro is defined with keyword parameters.
The arguments may be listed within parentheses in any order, separated
by commas. For example:
%jitter(var=X1-X5);
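As an illustration of the keyword convention, the two calls below specify the same request with the parameters in a different order. This is a sketch only: it assumes a data set named test containing x1 and x2 (as in the Example section), and the seed value 12345 is arbitrary.

```sas
*-- Same request twice, keywords listed in a different order;
%jitter(data=test, var=x1 x2, new=y1 y2, seed=12345);
%jitter(seed=12345, new=y1 y2, var=x1 x2, data=test);
```

Because OUT= is not given, each call writes its result to a new data set named by the DATAn convention.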
Parameters
- DATA=
  Input data set. If not specified, the most recently created data set is used.
  [Default: DATA=_LAST_]
- OUT=
  Output data set (can be the same as the input). If not specified, the new
  data set is named according to the DATAn convention.
  [Default: OUT=_DATA_]
- VAR=
  Name(s) of the input variable(s) to be jittered.
  You can use any of the standard SAS abbreviations for variable lists.
- NEW=
  A list of names for the jittered result variable(s). These can be the same
  as the VAR= variable(s), but then the original values are lost.
  [Default: NEW=&VAR]
- UNIT=
  Unit of each VAR= variable: the smallest distance between successive
  values. For multiple input variables, you can specify a list
  of blank-separated numeric values.
  [Default: UNIT=1]
- MULT=
  Multiplier for the spread of the jitter.
  The random quantity added to the variable is uniformly distributed:
  U(-.25,.25) * &MULT * &UNIT.
  [Default: MULT=1]
- SEED=
  Seed for the random number generator. Setting this to a
  non-zero value gives reproducible results.
  [Default: SEED=0]
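The formula given under MULT= can be sketched by hand in a DATA step. This is not the macro's source code, only a minimal illustration of the documented quantity U(-.25,.25) * &MULT * &UNIT for a single variable. The data set names test and jittered, the variables x1 and y1, and the seed value are assumptions made for the example.

```sas
*-- Sketch only: hand-coded version of the documented jitter formula;
*-- Names test, jittered, x1 and y1 are illustrative assumptions;
%let mult = 1;       *-- multiplier for the spread, as in MULT=;
%let unit = 1;       *-- smallest distance between values, as in UNIT=;
%let seed = 12345;   *-- arbitrary non-zero seed for reproducible noise;

data jittered;
   set test;
   *-- ranuni is U(0,1), so (ranuni - 0.5)/2 is U(-.25,.25);
   y1 = x1 + &mult * &unit * (ranuni(&seed) - 0.5) / 2;
run;
```

With MULT=1 and UNIT=1, each value moves by less than 0.25 in either direction, so jittered points stay closer to their own value than to a neighboring one. Larger MULT= values widen the spread proportionally.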
Example
%include macros(jitter);      *-- or include in an autocall library;

*-- Generate some discrete variables;
data test;
   do i=1 to 10;
      x1 = int(10*uniform(0));
      x2 = int(10*uniform(0));
      output;
   end;
run;

*-- Jitter x1 and x2, keeping the originals;
%jitter(data=test, var=x1 x2, new=y1 y2);

proc print;
   var x1 y1 x2 y2;
run;

proc gplot;
   plot y2 * y1;
run; quit;
Printed output (all seeds are 0, so the values differ from run to run):
Obs    x1       y1       x2       y2
  1     8     8.19519     2     1.86667
  2     0    -0.14863     1     0.78784
  3     9     8.82106     9     8.82783
  4     3     3.22831     8     8.20384
  5     0    -0.10426     9     8.92621
  6     0     0.18905     8     7.97462
  7     1     0.79719     5     5.21546
  8     2     2.06479     9     8.79396
  9     8     8.19856     9     9.08425
 10     7     6.81149     3     3.02668
See also
- ellipses: Plot bivariate data ellipses
- lowess: Locally weighted scatterplot smoother
- sunplot: Sunflower plot for X-Y data