dataFrame.columns.stat — distance()
Description
The distance() method of the stat object computes, for each row, the distance between the values in the selected columns and a reference vector.
Signature
dataFrame.columns(...columnNames).stat.distance(values, { method: 'euclidean' })
Arguments
...columnNames (string[]) - The names of the columns from which to compute distances.
values (array) - Reference values used to compute distances, one per selected column.
options (object) - Distance computation options.
Option
method (string) - The distance metric used for the computation.
euclidean (default)
hamming
Returns
mask (number[]) - An array containing the computed distance for each row.
Notes
- The method computes one distance value per row.
- The number of supplied values must match the number of selected columns.
- Euclidean distance is typically used for numerical data.
- Hamming distance counts the number of positions at which an observation's values differ from the reference values.
- Smaller values indicate observations closer to the reference vector.
- A distance of 0 indicates an exact match.
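To make the notes above concrete, here is a minimal sketch of the two metrics in plain JavaScript, using the standard textbook definitions. rowDistances is a hypothetical helper written for illustration and is not part of the library. The sample rows and the errors thrown on a length mismatch or an unknown method are assumptions, and the library's own hamming implementation may differ (for example, by normalizing the count).

```javascript
// Hypothetical helper, not part of the library: returns one distance per row.
function rowDistances(rows, values, method = 'euclidean') {
  return rows.map((row) => {
    // Assumption: a length mismatch is treated as an error.
    if (row.length !== values.length) {
      throw new Error('values must contain one entry per selected column');
    }
    if (method === 'hamming') {
      // Count the positions where the row differs from the reference.
      return row.reduce((count, cell, i) => count + (cell !== values[i] ? 1 : 0), 0);
    }
    if (method === 'euclidean') {
      // Square root of the sum of squared differences.
      return Math.sqrt(row.reduce((sum, cell, i) => sum + (cell - values[i]) ** 2, 0));
    }
    throw new Error('unknown method: ' + method);
  });
}

// Made-up sample rows standing in for the 'age' and 'income' columns.
const rows = [
  [35, 50000],
  [38, 50004],
  [35, 42000],
];

const euclidean = rowDistances(rows, [35, 50000]); // [0, 5, 8000]
const hamming = rowDistances(rows, [35, 50000], 'hamming'); // [0, 2, 1]
```

The first row matches the reference exactly, so both metrics give 0 for it. The third row differs only in income, which gives a Hamming distance of 1 but a large Euclidean distance.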
Example
// evaluate the distance between the values of 2 columns of the dataFrame and a given set of values
var mask = dataFrame.columns('age', 'income').stat.distance([35, 50000]);
// add the distances into the dataFrame
dataFrame.column('distance').set(mask);
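Since the returned array holds one distance per row, it can also be used directly to rank rows by closeness to the reference. The sketch below assumes the distances are in row order and uses a hard-coded array with made-up values as a stand-in for the result of distance(). The names ranked, closestRow, and exactMatches are illustrative only.

```javascript
// Stand-in for the array returned by distance(); the values are made up.
const distances = [8000, 0, 5, 120.5];

// Row indices ordered from closest to farthest from the reference.
const ranked = distances
  .map((distance, rowIndex) => ({ rowIndex, distance }))
  .sort((a, b) => a.distance - b.distance)
  .map((entry) => entry.rowIndex);

// Index of the single closest row.
const closestRow = ranked[0];

// Indices of rows that match the reference exactly (distance of 0).
const exactMatches = distances.flatMap((d, i) => (d === 0 ? [i] : []));
```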