What this routine does:
This routine clips outliers from a distribution
of data points. The data are grouped in wavelength bins whose width
can be specified by the user; in each bin the outliers are identified and
discarded according to one of the algorithms described below.
Binning the Data:
The binsize is defined either by the routine or by the user. It may be a variable width, changing with wavelength as defined by a resolution, or a fixed width in lambda. The routine tries to preset the binsize to a good choice based on the data.
SWS: The ``Auto'' binsize (a resolution value) represents 5 times the theoretical point source resolution of the AOT used, i.e. a bin width of one fifth of the resolution element. The actual resolution can be different (e.g. for extended sources), and it is recommended that the user try out different bin sizes, using the default value as a starting point.
LWS: The ``Auto'' binsize (a fixed width value) represents a width smaller than the minimum wavelength sampling possible for the AOT used, but larger than the jitter in wavelength found in any scan. It is chosen so as to preserve the sample spacing of individual scans (see below).
CAM-CVF / PHT-S: The ``Auto'' binsize (a fixed width value) is computed as the median of the separation in wavelength between adjacent data points.
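The two kinds of binsize described above can be illustrated with a short Python sketch. It is not the ISAP code (ISAP is written in IDL); the function names and the sample values are hypothetical. It assumes that a resolution-type binsize R gives a bin width of lambda / R, and it computes the fixed-width ``Auto'' value for CAM-CVF / PHT-S as the median separation between adjacent points.

```python
import numpy as np

def auto_binsize_median_spacing(wavelength):
    """Fixed-width binsize: median separation between adjacent wavelengths."""
    w = np.sort(np.asarray(wavelength, dtype=float))
    # np.median averages the two central values for an even number of points
    return float(np.median(np.diff(w)))

def bin_width_from_resolution(wavelength, resolution):
    """Variable-width binsize: width at a given wavelength, assumed lambda / R."""
    return wavelength / resolution

# Hypothetical sample: separations are 0.01, 0.02, 0.01, 0.03
waves = [2.40, 2.41, 2.43, 2.44, 2.47]
print(auto_binsize_median_spacing(waves))

# Hypothetical SWS-like case: binsize = 5 * R with an assumed R = 1000
print(bin_width_from_resolution(10.0, 5 * 1000.0))
```

With a resolution-type binsize the bin width grows with wavelength, whereas the median-spacing value is a single width applied across the whole spectrum.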
The bins are used to group the data for clipping, but
the bin placement on the x-axis is not necessarily uniform. Rather,
the first (shortest wavelength) data point defines the start of the first
bin. All data within a binwidth of that data point are included in the
first bin. The next bin is defined to start at the wavelength of
the next data point in order of increasing wavelength, and so on. The clipped
points for each bin are plotted in red in the left panel of the Clip widget.
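The bin placement described above can be sketched in Python as follows. This is an illustration only, not the ISAP implementation: the function name is hypothetical, the binwidth is taken as fixed, and a point lying exactly one binwidth from the start of a bin is assumed to belong to that bin.

```python
import numpy as np

def assign_bins(wavelength, binwidth):
    """Return a bin index per point; each bin starts at a data point."""
    w = np.asarray(wavelength, dtype=float)
    bins = np.empty(len(w), dtype=int)
    current = -1   # index of the bin being filled
    start = None   # wavelength of the first point in that bin
    for i in np.argsort(w):
        # Open a new bin when the point lies beyond the current bin
        if start is None or w[i] - start > binwidth:
            current += 1
            start = w[i]
        bins[i] = current
    return bins

# Hypothetical wavelengths with gaps between groups of points
waves = [1.0, 1.2, 1.4, 2.5, 2.6, 4.0]
print(assign_bins(waves, 0.5))
```

In this example the second bin starts at 2.5 rather than at 1.5, so no bin is spent on the empty stretch between the groups of points.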
2 Clipping Techniques available under ISAP:
Common Attributes of the two techniques:
- The clipping threshold is called CLIP below and is expressed in units of RMS. It is preset to 2.5, meaning that the clipping threshold is 2.5*RMS. The user can change this value.
- The median used is the IDL median with the /even keyword, i.e., for even numbers of points the median is the average of the central two points.
- Data with flux = 0 are always discarded.
- The RMS of N data points, when N>1, is defined as: RMS = ( sum_i (flux_i - median)^2 / (N-1) )^0.5
- The smaller the binsize, the fewer points will be clipped out of each bin.
- For the two sigma clipping techniques below, the user may select the value of CLIP.
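The common attributes listed above can be illustrated with a small Python sketch. It is not one of the two ISAP techniques: the function names are hypothetical, and the single-pass rule used here (discard points farther than CLIP*RMS from the bin median) is an assumption chosen only to show how the median, the RMS and CLIP fit together.

```python
import numpy as np

def rms_about_median(flux):
    """RMS of N > 1 points about their median, with an N-1 denominator."""
    f = np.asarray(flux, dtype=float)
    if f.size < 2:
        raise ValueError("RMS is defined only for N > 1")
    med = np.median(f)  # averages the two central points when N is even
    return float(np.sqrt(np.sum((f - med) ** 2) / (f.size - 1)))

def clip_bin(flux, clip=2.5):
    """Return a boolean mask of the points kept in one bin."""
    f = np.asarray(flux, dtype=float)
    keep = f != 0                    # data with flux = 0 are always discarded
    idx = np.flatnonzero(keep)
    if idx.size > 1:
        good = f[idx]
        med = np.median(good)
        rms = rms_about_median(good)
        # Assumed rule: keep points within clip * RMS of the bin median
        keep[idx] = np.abs(good - med) <= clip * rms
    return keep

# Hypothetical bin: nine points at 10, one outlier at 100, one zero
flux = [10.0] * 9 + [100.0, 0.0]
print(clip_bin(flux))
```

For this bin the median is 10 and the RMS is 30, so with CLIP = 2.5 the threshold is 75 and the point at 100 (deviation 90) is discarded along with the zero.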