Skip to main content

Aggregate Functions

Aggregate functions work in the normal way as expected by database experts.

ClickHouse also supports:

NULL Processing

During aggregation, all NULLs are skipped. If the aggregation has several parameters it will ignore any row in which one or more of the parameters are NULL.

Examples:

Consider this table:

┌─x─┬────y─┐
│ 1 │ 2 │
│ 2 │ ᴺᵁᴸᴸ │
│ 3 │ 2 │
│ 3 │ 3 │
│ 3 │ ᴺᵁᴸᴸ │
└───┴──────┘

Let’s say you need to total the values in the y column:

SELECT sum(y) FROM t_null_big
┌─sum(y)─┐
│ 7 │
└────────┘

Now you can use the groupArray function to create an array from the y column:

SELECT groupArray(y) FROM t_null_big
┌─groupArray(y)─┐
│ [2,2,3] │
└───────────────┘

groupArray does not include NULL in the resulting array.

You can use COALESCE to change NULL into a value that makes sense in your use case. For example: avg(COALESCE(column, 0)) with use the column value in the aggregation or zero if NULL:

SELECT
avg(y),
avg(coalesce(y, 0))
FROM t_null_big
┌─────────────avg(y)─┬─avg(coalesce(y, 0))─┐
│ 2.3333333333333335 │ 1.4 │
└────────────────────┴─────────────────────┘

Also you can use Tuple to work around NULL skipping behavior. The a Tuple that contains only a NULL value is not NULL, so the aggregate functions won't skip that row because of that NULL value.

SELECT
groupArray(y),
groupArray(tuple(y)).1
FROM t_null_big;

┌─groupArray(y)─┬─tupleElement(groupArray(tuple(y)), 1)─┐
[2,2,3][2,NULL,2,3,NULL]
└───────────────┴───────────────────────────────────────┘