Module Row.Agg
Row-wise aggregations using vectorized operations.
These functions compute aggregations horizontally across columns for each row, similar to pandas' axis=1 operations or Polars' horizontal functions. They are much more efficient than using Row.map_list for common reductions.
Null handling:
- Float columns: NaN values are treated as nulls
- Integer columns: Int32.min_int and Int64.min_int are treated as nulls
- String/Bool columns: None values are treated as nulls
- When skipna=true (default): nulls are excluded from computations
- When skipna=false: any null in a row produces a null result
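The two skipna modes can be illustrated with a small reference sketch in plain OCaml. It uses only the standard library, represents columns as float arrays with NaN standing in for null, and is not the library's implementation; the name row_sum and the choice to return 0. for an all-null row are assumptions of this sketch.

```ocaml
(* Reference sketch only: plain float arrays stand in for columns and
   NaN stands in for null. [row_sum] is a hypothetical helper. *)
let row_sum ?(skipna = true) (cols : float array list) : float array =
  match cols with
  | [] -> [||]
  | first :: _ ->
    Array.init (Array.length first) (fun i ->
      List.fold_left
        (fun acc col ->
          let v = col.(i) in
          if Float.is_nan v then
            (* skipna=true ignores the null; skipna=false poisons the row *)
            (if skipna then acc else Float.nan)
          else acc +. v)
        0. cols)

let a = [| 1.; Float.nan; 3. |]
let b = [| 4.; 5.; Float.nan |]

(* Nulls skipped: [|5.; 5.; 3.|] *)
let skipped = row_sum [a; b]

(* Nulls propagate: row 0 is 5., rows 1 and 2 are NaN *)
let strict = row_sum ~skipna:false [a; b]
```

With skipna=false a single null anywhere in the row makes the whole result null, which is why rows 1 and 2 differ between the two calls.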
Performance: These functions use vectorized Nx operations internally, which are significantly faster than row-by-row iteration for large datasets.
sum ?skipna df ~names computes row-wise sum across specified columns.
Uses vectorized Nx operations for efficiency. All specified columns must be numeric (float or integer types).
Example:
let df = create [("a", Col.float64 [|1.; 2.; 3.|]);
("b", Col.float64 [|4.; 5.; 6.|])] in
let sums = Row.Agg.sum df ~names:["a"; "b"] in
(* Result: Col.float64 [|5.; 7.; 9.|] *)
mean ?skipna df ~names computes row-wise mean across specified columns.
The result is always a float64 column regardless of input types. When skipna=true, the divisor is the count of non-null values per row.
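The per-row divisor rule can be sketched in plain OCaml as follows. This is a standard-library-only illustration, not the library's code: row_mean is a hypothetical name, NaN stands in for null, and returning NaN for a row with no non-null values is an assumption of the sketch.

```ocaml
(* Reference sketch only: [row_mean] is a hypothetical helper that works
   on plain float arrays, with NaN standing in for null. *)
let row_mean ?(skipna = true) (cols : float array list) : float array =
  match cols with
  | [] -> [||]
  | first :: _ ->
    Array.init (Array.length first) (fun i ->
      (* Accumulate the sum and count of non-null values in row i,
         and remember whether any null was seen. *)
      let total, count, saw_null =
        List.fold_left
          (fun (t, c, n) col ->
            let v = col.(i) in
            if Float.is_nan v then (t, c, true) else (t +. v, c + 1, n))
          (0., 0, false) cols
      in
      if (saw_null && not skipna) || count = 0 then Float.nan
      else total /. float_of_int count)

let x = [| 1.; Float.nan; 3. |]
let y = [| 4.; 5.; Float.nan |]

(* Divisor is the non-null count per row: [|2.5; 5.; 3.|] *)
let means = row_mean [x; y]

(* With skipna=false, rows 1 and 2 become NaN *)
let strict_means = row_mean ~skipna:false [x; y]
```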
min ?skipna df ~names computes row-wise minimum across specified columns.
The result column preserves the most precise numeric type among inputs (e.g., if any input is float64, result is float64).
max ?skipna df ~names computes row-wise maximum across specified columns.
The result column preserves the most precise numeric type among inputs.
dot df ~names ~weights computes weighted sum (dot product) across columns.
Computes the dot product of row values with the given weights. Equivalent to pandas' df[cols].dot(weights).
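As a plain-OCaml illustration of the arithmetic, the sketch below pairs the j-th weight with the j-th column and sums the products for each row. It is not the library's implementation: row_dot is a hypothetical name, columns are plain float arrays, and raising Invalid_argument on a length mismatch is an assumption of this sketch.

```ocaml
(* Reference sketch only: [row_dot] is a hypothetical helper on plain
   float arrays. Weight j is applied to column j. *)
let row_dot (cols : float array list) (weights : float array) : float array =
  if List.length cols <> Array.length weights then
    invalid_arg "row_dot: one weight per column is required";
  match cols with
  | [] -> [||]
  | first :: _ ->
    Array.init (Array.length first) (fun i ->
      let acc = ref 0. in
      List.iteri (fun j col -> acc := !acc +. (weights.(j) *. col.(i))) cols;
      !acc)

let c1 = [| 1.; 2. |]
let c2 = [| 3.; 4. |]
let c3 = [| 5.; 6. |]

(* Row 0: 0.5*1 + 0.25*3 + 0.25*5 = 2.5; row 1: 0.5*2 + 0.25*4 + 0.25*6 = 3.5 *)
let weighted = row_dot [c1; c2; c3] [| 0.5; 0.25; 0.25 |]
```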
Example:
let portfolio_weights = [|0.6; 0.3; 0.1|] in
let weighted_returns = Row.Agg.dot df
~names:["stock_a"; "stock_b"; "stock_c"]
~weights:portfolio_weights
all df ~names returns true if all values in the row are true.
All specified columns must be boolean type. Null values are treated as false for the purpose of the "all" operation.
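The null-as-false rule can be sketched in plain OCaml with bool option arrays, where None plays the role of a null. This is an illustration only, not the library's code; row_all is a hypothetical name.

```ocaml
(* Reference sketch only: [row_all] is a hypothetical helper. A row is
   true only when every column holds [Some true]; [None] counts as false. *)
let row_all (cols : bool option array list) : bool array =
  match cols with
  | [] -> [||]
  | first :: _ ->
    Array.init (Array.length first) (fun i ->
      List.for_all (fun col -> col.(i) = Some true) cols)

let p = [| Some true; Some true; None |]
let q = [| Some true; Some false; Some true |]

(* Row 2 is false because of the null in [p]: [|true; false; false|] *)
let flags = row_all [p; q]
```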