# Aggregation functions

The AI & Analytics Engine provides the following aggregation functions for data wrangling.

Title | Description |
---|---|
Count | Count of values. It can be specified as non-null values, null values or simply the row count |
Approximate Count Distinct | Number of distinct values approximated using HyperLogLog++. Null values are ignored |
Minimum (Numeric) | Minimum value |
Maximum (Numeric) | Maximum value |
Sum (Numeric) | Sum of values |
Mean/Average (Numeric) | Average of values |
Standard Deviation (Numeric) | Unbiased sample standard deviation of values |
Variance (Numeric) | Unbiased sample variance of values |
Skewness (Numeric) | Skewness of values |
Kurtosis (Numeric) | Kurtosis of values |
Approximate Median (Numeric) | Approximate median of values |
Approximate Quantile (Numeric) | Approximate k-th quantile of values for a given k |
Mode (Numeric) | Mode of the estimated probability distribution |
First value | Value in the first row, ignoring leading null values. Returns null if all values are null |
Last value | Value in the last row, ignoring trailing null values. Returns null if all values are null |
Most frequent value | Most frequent value approximated using an algorithm that is both fast and efficient for large data |
Top K frequent values | Top k values sorted in descending order based on frequencies for a given k, approximated using an algorithm that is fast and efficient for large data |
Earliest (DateTime) | Earliest datetime (timestamp) value |
Latest (DateTime) | Latest datetime (timestamp) value |
All (Boolean) | Whether all values are true |
Any (Boolean) | Whether any value is true |
Not all (Boolean) | Whether any value is false |
Not any (Boolean) | Whether all values are false |
Top K elements (JSONArray) | Top k array elements sorted in descending order based on frequencies for a given k, approximated using an algorithm that is fast and efficient for large data |

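A few of the definitions in the table can be made concrete with a short plain-Python sketch. This is only an illustration of the stated semantics, not the Engine's implementation: the function names (`sample_variance`, `first_value`, `last_value`, `not_all`, `not_any`) are hypothetical, nulls are represented as `None`, and the numeric and Boolean helpers assume their inputs contain no nulls.

```python
import math
from typing import Optional, Sequence


def sample_variance(values: Sequence[float]) -> float:
    """Unbiased sample variance: squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_std(values: Sequence[float]) -> float:
    """Sample standard deviation: square root of the unbiased sample variance."""
    return math.sqrt(sample_variance(values))


def first_value(values: Sequence[Optional[float]]) -> Optional[float]:
    """First value after skipping leading nulls; None if all values are null."""
    return next((v for v in values if v is not None), None)


def last_value(values: Sequence[Optional[float]]) -> Optional[float]:
    """Last value after skipping trailing nulls; None if all values are null."""
    return next((v for v in reversed(values) if v is not None), None)


def not_all(flags: Sequence[bool]) -> bool:
    """'Not all': true when at least one value is false."""
    return any(not f for f in flags)


def not_any(flags: Sequence[bool]) -> bool:
    """'Not any': true when every value is false."""
    return all(not f for f in flags)
```

For instance, `first_value([None, 3.0, 4.0, None])` skips the leading null and returns `3.0`, while `last_value` on the same input returns `4.0`.
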
These functions can be used in the following actions, which operate over groups or partitions (for details, see the Action Catalogue).

- Aggregate columns within groups
- Look up aggregated columns from another dataset
- Compute window functions
- Reshape dataset into a pivot table
- Resample data into a regular time series
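
As a rough illustration of applying an aggregation function over groups, the sketch below runs the three Count modes from the table (non-null values, null values, row count) on each group of a tiny in-memory dataset. It is plain Python rather than the Engine's API: the helper names (`count`, `aggregate_by_group`), the `mode` strings, and the sample columns `store` and `sales` are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Sequence

# Hypothetical sample data; None stands in for a null value.
rows: List[Dict[str, Any]] = [
    {"store": "A", "sales": 10.0},
    {"store": "A", "sales": None},
    {"store": "B", "sales": 7.0},
    {"store": "A", "sales": 5.0},
    {"store": "B", "sales": 3.0},
]


def count(values: Sequence[Optional[float]], mode: str = "rows") -> int:
    """Count in one of three modes: 'non_null', 'null', or 'rows'."""
    if mode == "non_null":
        return sum(v is not None for v in values)
    if mode == "null":
        return sum(v is None for v in values)
    if mode == "rows":
        return len(values)
    raise ValueError(f"unknown mode: {mode}")


def aggregate_by_group(
    rows: List[Dict[str, Any]],
    key: str,
    column: str,
    func: Callable[[List[Any]], Any],
) -> Dict[Any, Any]:
    """Collect the column's values per group key, then aggregate each group."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {group: func(values) for group, values in groups.items()}


non_null_counts = aggregate_by_group(rows, "store", "sales", lambda v: count(v, "non_null"))
null_counts = aggregate_by_group(rows, "store", "sales", lambda v: count(v, "null"))
row_counts = aggregate_by_group(rows, "store", "sales", lambda v: count(v, "rows"))
```

Here store `A` has three rows, one of which is null, so its non-null count is 2, its null count is 1, and its row count is 3.
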