dataFrame.column.stat — zScore()

Description

The zScore() method of the stat object computes the z-score of each value in the selected column and flags values whose absolute z-score exceeds a threshold as outliers.

Signature

dataFrame.column(columnName).stat.zScore({ estimator: 'sample', threshold: 1.96 })

Scope: column
Family: stat
Returns: object

Arguments

columnName (string)
The name of the column from which to compute the z-scores.
options (object)
Optional z-score calculation options. If omitted, the defaults listed under Options are used.

Options

estimator (string)
The estimator used to compute z-scores. One of:
  • population
  • sample (default)
  • robust
threshold (number)
The absolute z-score above which a value is marked as an outlier. (Default: 1.96.)

Returns

stat (object)
The statistic details computed for the selected column. It has the following properties.
mask (number[])
An array containing the z-score of each value in the column.
outlierMask (boolean[])
An array indicating, for each value, whether its absolute z-score exceeds the specified threshold.
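
As a rough illustration of how the two arrays line up with the column values, the sketch below uses a hand-built stand-in for the returned object rather than a real zScore() call. The values array and the result literal are assumptions made for this example; the numbers correspond to the population estimator applied to those values with the default threshold.

```javascript
// Hypothetical column values (not taken from any real dataFrame).
const values = [2, 4, 4, 4, 5, 5, 7, 9];

// Hand-built stand-in for the object returned by zScore():
// mean 5, population standard deviation 2, threshold 1.96.
const result = {
  mask: [-1.5, -0.5, -0.5, -0.5, 0, 0, 1, 2],
  outlierMask: [false, false, false, false, false, false, false, true],
};

// Both arrays hold one entry per column value, so their indices line up.
const outliers = values.filter((_, i) => result.outlierMask[i]);

console.log(outliers); // only the value 9 is flagged
```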

Notes

  • The sample estimator computes z-scores using the sample standard deviation.
  • The population estimator computes z-scores using the population standard deviation.
  • The robust estimator uses the median instead of the mean to reduce the influence of extreme values.
  • Values whose absolute z-score exceeds the specified threshold are marked as outliers.
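
The notes above can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name zScoreSketch is hypothetical, only the sample and population estimators are covered, and the robust estimator is left out because its scale estimate is not specified here.

```javascript
// Hypothetical helper, not part of the dataFrame API.
function zScoreSketch(values, { estimator = 'sample', threshold = 1.96 } = {}) {
  if (estimator !== 'sample' && estimator !== 'population') {
    throw new Error('This sketch covers only "sample" and "population".');
  }
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const squaredDeviations = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  // The sample standard deviation divides by n - 1, the population one by n.
  const divisor = estimator === 'sample' ? n - 1 : n;
  const sd = Math.sqrt(squaredDeviations / divisor);

  const mask = values.map((v) => (v - mean) / sd);
  // A value is an outlier when its absolute z-score exceeds the threshold.
  const outlierMask = mask.map((z) => Math.abs(z) > threshold);
  return { mask, outlierMask };
}

const values = [2, 4, 4, 4, 5, 5, 7, 9]; // illustrative data
const population = zScoreSketch(values, { estimator: 'population' });
const sample = zScoreSketch(values);

console.log(population.mask);           // z-scores using the population SD (2)
console.log(population.outlierMask[7]); // true: |2| > 1.96
console.log(sample.outlierMask[7]);     // false: 4 / sqrt(32 / 7) is about 1.87
```

With this data the estimator choice changes the result: the last value is flagged under the population estimator but not under the sample estimator, because the sample standard deviation is slightly larger.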

Example

// compute the z-score statistics for the 'revenue' column
var stat = dataFrame.column('revenue').stat.zScore();

// add z-scores to the dataFrame
dataFrame.column('z-score').set(stat.mask);

// add the outlier mask to the dataFrame
dataFrame.column('outlier').set(stat.outlierMask);