# How to get average, median, mode value in BigQuery

How can we get average, median, mode value in BigQuery ?

BigQuery(BQ) is very useful for data analysis or processing.

It is good at handling huge data. It returns summary result in short time.

And we want average, median, mode value when we analyse data.

How can we get these values in BigQuery ?

So today I introduce about "How to get average, median, mode value in BigQuery".

Author

Mid-carieer engineer (AI, system). Good at Python and SQL.

You can understand "How to get average, median, mode value in BigQuery".

## Data

First, we have to prepare data.

Load this CSV file into table `avg_median_sample`.

```text_col,num_col1,num_col2,num_col3
a,10,1,100
a,10,2,100
a,10,3,100
a,10,4,100
a,10,5,100
a,10,6,100
a,10,7,100
a,10,8,100
a,10,9,100
a,10,10,100
b,12,11,200
b,12,12,200
b,12,13,200
b,12,14,200
c,15,15,200
c,15,16,200
c,15,17,300
c,15,18,300
c,15,19,10000
```

Then we can see table like below.

 Row text_col num_col1 num_col2 num_col3 1 a 10 1 100 2 a 10 2 100 3 a 10 3 100 4 a 10 4 100 5 a 10 5 100 6 a 10 6 100 7 a 10 7 100 8 a 10 8 100 9 a 10 9 100 10 a 10 10 100 11 b 12 11 200 12 b 12 12 200 13 b 12 13 200 14 b 12 14 200 15 c 15 15 200 16 c 15 16 200 17 c 15 17 300 18 c 15 18 300 19 c 15 19 10000

## How to get average value

Once data is prepared, get average value.

In order to get average value, you should use `AVG()` function.

```SELECT
text_col,
avg(num_col1) as avg_1,
avg(num_col2) as avg_2,
avg(num_col3) as avg_3
FROM test.avg_median_sample
GROUP BY text_col
```

Result is like below.

 Row text_col avg_1 avg_2 avg_3 1 a 10 5.5 100 2 b 12 12.5 200 3 c 15 17 2200

## How to get median value

Next, get median value.

Median value is not a calculated value.

It is middle value in ordered list.

What is difference between average and madian ?
Average value is the center of gravity. If data scale is same in data list, it is good feature.
But if there is big scale value in data list, average value also become large. In that case, it is not a good feature.
Umm, any example ？
For example, imagin that there are 100 people and Optimus Prime. What does average value mean ?
Oh, Optimus Prime is too heavy. So average value is useless.

We can get median value with `PERCENTILE_CONT()` function.

We use it as below.

```SELECT distinct
text_col,
PERCENTILE_CONT(num_col1, 0.5) OVER(PARTITION BY text_col) AS median_1,
PERCENTILE_CONT(num_col2, 0.5) OVER(PARTITION BY text_col) AS median_2,
PERCENTILE_CONT(num_col3, 0.5) OVER(PARTITION BY text_col) AS median_3
FROM test.avg_median_sample
ORDER BY text_col
```

Then we get this result.

 Row text_col median_1 median_2 median_3 1 a 10 5.5 100 2 b 12 12.5 200 3 c 15 17 300

Average value of `text_col="c"` and `num_col3` was 2200.

Median value is 300. So we can understand this difference.

And if you change second parameter of `PERCENTILE_CONT()` from `0.5` to `0.1` , you can get 10% position value. If `0.75`, you can get 75% position value.

## How to get mode value

Then try to get mode value.

The mode is the value that appears most frequently in a data set.

Frequency is important. `100` and `100.1` are different values.

So mode doesn't match for real numbers.

It is good to use for checking group frequency.

In order to get mode value, you vcan use `APPROX_TOP_COUNT()`.

```SELECT
text_col,
APPROX_TOP_COUNT(num_col1, 1) top1,
APPROX_TOP_COUNT(num_col2, 1) top2,
APPROX_TOP_COUNT(num_col3, 1) top3
FROM test.avg_median_sample
GROUP BY text_col
ORDER BY text_col
```

Result is below.

 Row text_col top1.value top1.count top2.value top2.count top3.value top3.count 1 a 10 10 1 1 100 10 2 b 12 4 11 1 200 4 3 c 15 5 15 1 200 2

As you see, `.value` is value that appears frequently. And `.count` means appeared count.

## Conclusion

Today I described about "How to get average, median, mode value in BigQuery".

In order to get these values, we can use functions below.

SQL functions

• Average: `AVG()`
• Median: `PERCENTILE_CONT()`
• Mode: `APPROX_TOP_COUNT()`

• `AVG()` is easy to use. But others are little bit complicated.
You will get used to use them.

There are some IT tips.