dataFrame.column.stat — zScore()
Description
The zScore() method of the stat object computes the z-scores of the values in the selected column.
Signature
dataFrame.column(columnName).stat.zScore({ estimator: 'sample', threshold: 1.96 })
Arguments
columnName(string)- The name of the column from which to compute the z-scores.
options(object)- Z-score calculation options.
Options
estimator(string)- The estimator used to compute z-scores.
- population
- sample (default)
- robust
threshold(number)- The z-score threshold used to identify outliers. (Default: 1.96.)
Returns
stat(object)- The statistic details computed for the selected column.
mask( number[] )- An array containing the z-score associated with each value.
outlierMask( boolean[] )- An array indicating whether each value exceeds the specified z-score threshold.
Notes
- The sample estimator computes z-scores using the sample standard deviation.
- The population estimator computes z-scores using the population standard deviation.
- The robust estimator uses the median instead of the mean to reduce the influence of extreme values.
- Values whose absolute z-score exceeds the specified threshold are marked as outliers.
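The estimator behaviour described in these notes can be sketched in plain JavaScript. This is a minimal illustration, not the library's implementation: the function name zScoreSketch and its helpers are hypothetical. The notes only say that the robust estimator uses the median, so scaling by the median absolute deviation (MAD) times 1.4826 is an assumption borrowed from the common "modified z-score" convention.

```javascript
// Illustrative sketch only; zScoreSketch is a hypothetical name, not part of the library.

function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// ddof = 0 gives the population standard deviation, ddof = 1 the sample one.
function standardDeviation(values, center, ddof) {
  const sumOfSquares = values.reduce((sum, v) => sum + (v - center) ** 2, 0);
  return Math.sqrt(sumOfSquares / (values.length - ddof));
}

function zScoreSketch(values, { estimator = 'sample', threshold = 1.96 } = {}) {
  let center;
  let scale;
  if (estimator === 'population') {
    center = mean(values);
    scale = standardDeviation(values, center, 0);
  } else if (estimator === 'sample') {
    center = mean(values);
    scale = standardDeviation(values, center, 1);
  } else if (estimator === 'robust') {
    // Assumption: the scale is the MAD times 1.4826. The documentation
    // only states that the median replaces the mean.
    center = median(values);
    scale = 1.4826 * median(values.map((v) => Math.abs(v - center)));
  } else {
    throw new Error(`Unknown estimator: ${estimator}`);
  }

  // One z-score per input value, in the same order as the input.
  const mask = values.map((v) => (v - center) / scale);
  // A value is an outlier when its absolute z-score exceeds the threshold.
  const outlierMask = mask.map((z) => Math.abs(z) > threshold);
  return { mask, outlierMask };
}
```

The choice of estimator can change which values are flagged. For the values [2, 4, 4, 4, 5, 5, 7, 9], this sketch gives 9 a z-score of 2 under the population estimator, which exceeds the default threshold, and roughly 1.87 under the sample estimator, which does not.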
Example
// compute the z-score statistics for the 'revenue' column of the dataFrame
var stat = dataFrame.column('revenue').stat.zScore();
// add the z-scores to the dataFrame
dataFrame.column('z-score').set(stat.mask);
// add the outlier mask to the dataFrame
dataFrame.column('outlier').set(stat.outlierMask);