Module Agg.Float
Float aggregations - work on any numeric column (int or float types).
Values are coerced to float for computation. All functions in this module will accept int32, int64, float32, or float64 columns and return float results.
val sum : t -> string -> float
sum df name returns the sum as a float.
Works on any numeric column type (int32, int64, float32, float64). Null values are excluded from the sum calculation.
Time complexity: O(n) where n is the number of rows.
val mean : t -> string -> float
mean df name returns the arithmetic mean.
Computes sum divided by count of non-null values. Returns NaN if all values are null or the column is empty.
Time complexity: O(n) where n is the number of rows.
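The null-handling rules for sum and mean can be illustrated with a minimal sketch. It is not the module's implementation: it assumes a nullable column can be modelled as a float option array (None standing for null), and the names sum_non_null and mean_non_null are hypothetical.

```ocaml
(* Hypothetical stand-in for a nullable numeric column: None marks a null. *)

(* Sum of the non-null values; nulls contribute nothing. *)
let sum_non_null (col : float option array) : float =
  Array.fold_left
    (fun acc v -> match v with Some x -> acc +. x | None -> acc)
    0.0 col

(* Sum divided by the count of non-null values, in a single pass.
   Returns NaN when there is no non-null value to average. *)
let mean_non_null (col : float option array) : float =
  let total, count =
    Array.fold_left
      (fun (s, c) v ->
        match v with Some x -> (s +. x, c + 1) | None -> (s, c))
      (0.0, 0) col
  in
  if count = 0 then Float.nan else total /. float_of_int count
```

Under these assumptions, a column holding 1.5, null, 2.5 sums to 4.0 and averages to 2.0, because the null is counted in neither the numerator nor the denominator.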
val std : t -> string -> float
std df name returns the population standard deviation.
Computes standard deviation over non-null values, dividing by n. Returns NaN if no non-null values exist.
Time complexity: O(n) - requires two passes over the data.
val var : t -> string -> float
var df name returns the population variance.
Computes variance over non-null values, dividing by n. The standard deviation is the square root of this value.
Time complexity: O(n) - requires two passes over the data.
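The two-pass population formulas described for std and var can be sketched as follows. This is an illustration rather than the library's code: the column is modelled as a float option array, the names population_var and population_std are hypothetical, and returning NaN from the variance when no non-null values exist is an assumption mirrored from the documented behaviour of std.

```ocaml
(* Population variance over non-null values: divide by n, not n - 1.
   Pass 1 computes the mean, pass 2 sums the squared deviations. *)
let population_var (col : float option array) : float =
  let xs = List.filter_map Fun.id (Array.to_list col) in
  let n = List.length xs in
  if n = 0 then Float.nan (* assumption: mirrors the documented std case *)
  else
    let nf = float_of_int n in
    let mean = List.fold_left ( +. ) 0.0 xs /. nf in
    let sum_sq =
      List.fold_left
        (fun acc x ->
          let d = x -. mean in
          acc +. (d *. d))
        0.0 xs
    in
    sum_sq /. nf

(* The standard deviation is the square root of the variance. *)
let population_std (col : float option array) : float =
  sqrt (population_var col)
```

Dividing by n treats the column as the whole population. A sample estimate would divide by n - 1 instead and give a slightly larger value for the same data.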
val min : t -> string -> float option
min df name returns the minimum value, or None if the column is empty or contains only nulls.
Null values are ignored during comparison.
Time complexity: O(n) where n is the number of rows.
val max : t -> string -> float option
max df name returns the maximum value, or None if the column is empty or contains only nulls.
Null values are ignored during comparison.
Time complexity: O(n) where n is the number of rows.
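Because min and max return float option, callers have to handle the None case explicitly. The sketch below shows one way the documented semantics could look on a plain float option array. The names min_non_null, max_non_null and describe are hypothetical, and this is not the library's implementation.

```ocaml
(* Fold over the column, skipping nulls; the accumulator stays None
   until the first non-null value is seen. *)
let extreme (pick : float -> float -> float) (col : float option array)
    : float option =
  Array.fold_left
    (fun acc v ->
      match (acc, v) with
      | _, None -> acc
      | None, Some x -> Some x
      | Some a, Some x -> Some (pick a x))
    None col

let min_non_null col = extreme Float.min col
let max_non_null col = extreme Float.max col

(* Typical call site: pattern-match on the option instead of assuming
   that a value exists. *)
let describe (label : string) (result : float option) : string =
  match result with
  | Some v -> Printf.sprintf "%s = %g" label v
  | None -> Printf.sprintf "%s is undefined (empty or all nulls)" label
```

Returning an option distinguishes "no data" from any real float value, which a sentinel such as 0.0 or infinity could not do.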
val median : t -> string -> float
median df name returns the median (50th percentile).
When the number of non-null values is even, returns the average of the two middle values. Null values are excluded before sorting.
Time complexity: O(n log n) due to sorting requirement.
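A minimal sketch of the sort-based median follows, again on a float option array standing in for a column. The name median_sketch is hypothetical, and returning NaN for a column with no non-null values is an assumption, since the behaviour in that case is not stated here.

```ocaml
(* Drop nulls, sort, then take the middle value or the average of the
   two middle values. *)
let median_sketch (col : float option array) : float =
  let xs = Array.of_list (List.filter_map Fun.id (Array.to_list col)) in
  let n = Array.length xs in
  if n = 0 then Float.nan (* assumption: behaviour is not documented *)
  else begin
    Array.sort Float.compare xs;
    if n mod 2 = 1 then xs.(n / 2)
    else (xs.((n / 2) - 1) +. xs.(n / 2)) /. 2.0
  end
```

The sort dominates the cost, which is where the O(n log n) bound comes from.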
val quantile : t -> string -> q:float -> float
quantile df name ~q returns the q-th quantile, where 0 <= q <= 1.
Uses linear interpolation between data points. q=0.5 gives the median, q=0.25 gives the first quartile, etc.
Time complexity: O(n log n) due to sorting requirement.
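Linear interpolation between data points can be sketched as below. Several conventions exist for placing a quantile between sorted values. This example assumes the common one in which the fractional position is q * (n - 1) over the sorted non-null values, which may not be the exact rule the module uses. The name quantile_sketch, the float option array column model, the NaN result for an all-null column, and the Invalid_argument raised for q outside [0, 1] are likewise assumptions.

```ocaml
(* q-th quantile by linear interpolation between the two sorted values
   that bracket position q * (n - 1). *)
let quantile_sketch (col : float option array) ~(q : float) : float =
  (* Written as a negated conjunction so that a NaN q is rejected too. *)
  if not (q >= 0.0 && q <= 1.0) then
    invalid_arg "quantile_sketch: q must be in [0, 1]";
  let xs = Array.of_list (List.filter_map Fun.id (Array.to_list col)) in
  let n = Array.length xs in
  if n = 0 then Float.nan (* assumption: behaviour is not documented *)
  else begin
    Array.sort Float.compare xs;
    let pos = q *. float_of_int (n - 1) in
    let lo = int_of_float (Float.floor pos) in
    let hi = min (lo + 1) (n - 1) in
    let frac = pos -. float_of_int lo in
    xs.(lo) +. (frac *. (xs.(hi) -. xs.(lo)))
  end
```

With the non-null values 1, 2, 3, 4, q = 0.25 lands at position 0.75, three quarters of the way from 1 to 2, giving 1.75. With q = 0.5 the position is 1.5, giving 2.5, which matches the median of the same data.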