
BigQuery(BQ) is very useful for data analysis or processing.
It is good at handling huge data. It returns summary result in short time.
Through data handling, sometimes we would like to do numbering.
For example, extract from first to third record in each category.
In order to get ranking in whole data, we should use order by
clause.
But how can we get rank number ?
And also if 4 records have same score, it is difficult to choose 3 records.
How can we number records without duplication ?
So today I introduce about "How to rank data considering duplication in BigQuery".
Author

Advantage to read
You can understand "How to rank data considering duplication in BigQuery". Then you don't have to concern about numbering.
How to rank in BigQuery
There are some methods to rank in BigQuery.
Numbering in BQ
- RANK
- DENSE_RANK
- ROW_NUMBER
They are different functions.
So introduce each function.
Data
Anyway, we should prepare data.
Import this CSV file as rank_sample
table.
Data
id,val 001,100 002,120 003,130 004,100 005,110 006,160 007,170 008,100 009,120 010,200
With using this table, check each numbering functions.
RANK
As its name, RANK
is ranking function.
We can use it like RANK() OVER(ORDER BY <column name that you want to rank>)
.
SQL
SELECT *, RANK() OVER(ORDER BY val) as rank FROM test.rank_sample order by id
Result
Row | id | val | rank |
1 | 1 | 100 | 1 |
2 | 2 | 120 | 5 |
3 | 3 | 130 | 7 |
4 | 4 | 100 | 1 |
5 | 5 | 110 | 4 |
6 | 6 | 160 | 8 |
7 | 7 | 170 | 9 |
8 | 8 | 100 | 1 |
9 | 9 | 120 | 5 |
10 | 10 | 200 | 10 |
According to result, we can see 1st to 10th in rank column.
The val=100
records are all 1st, and next val=110
record is 4th.
It means that rank
function considers same score as same rank and skip next number.
Reference
DENSE_RANK
DENSE_RANK
is numbering function that doesn't skip duplicated rank.
We can use it like DENSE_RANK() OVER(ORDER BY <column name>)
.
SQL
SELECT *, DENSE_RANK() OVER(ORDER BY val) as dense_rank FROM test.rank_sample order by id
Result
Row | id | val | dense_rank |
1 | 1 | 100 | 1 |
2 | 2 | 120 | 3 |
3 | 3 | 130 | 4 |
4 | 4 | 100 | 1 |
5 | 5 | 110 | 2 |
6 | 6 | 160 | 5 |
7 | 7 | 170 | 6 |
8 | 8 | 100 | 1 |
9 | 9 | 120 | 3 |
10 | 10 | 200 | 7 |
We can see that rank is from ast to 7th.
The difference from RANK()
is that val=110
record is not 4th but 2nd.
So DENSE_RANK
does not skip duplicated rank.
Reference
ROW_NUMBER
ROW_NUMBER
is not for ranking, but for setting row number.
But we can use it like ROW_NUMBER() OVER(ORDER BY <column name>)
as same as RANK()
function.
SQL
SELECT *, ROW_NUMBER() OVER(ORDER BY val) as dense_rank FROM test.rank_sample order by id
Result
Row | id | val | row_number |
1 | 1 | 100 | 1 |
2 | 2 | 120 | 5 |
3 | 3 | 130 | 7 |
4 | 4 | 100 | 2 |
5 | 5 | 110 | 4 |
6 | 6 | 160 | 8 |
7 | 7 | 170 | 9 |
8 | 8 | 100 | 3 |
9 | 9 | 120 | 6 |
10 | 10 | 200 | 10 |
According to the result, we can see from 1st to 10th.
Unique point is that there are 2nd and 3rd records in val=100
data.
ROW_NUMBER
just puts row number. So they have different number even in same score records.
Reference
Comparison
We classify RANK
, DENSE_RANK
, ROW_NUMBER
like below.
Function | Description |
---|---|
RANK | Same rank in same score. If there are three 1st records,next is 4th. |
DENSE_RANK | Same rank in same score. If there are three 1st records,next is 2nd. |
ROW_NUMBER | Even in same score, it numbers different number. |
SQL
SELECT *, RANK() OVER(ORDER BY val) as rank, DENSE_RANK() OVER(ORDER BY val) as dense_rank, ROW_NUMBER() OVER(ORDER BY val) as row_number FROM test.rank_sample order by id
Result
Row | id | val | rank | dense_rank | row_number |
1 | 1 | 100 | 1 | 1 | 1 |
2 | 2 | 120 | 5 | 3 | 5 |
3 | 3 | 130 | 7 | 4 | 7 |
4 | 4 | 100 | 1 | 1 | 2 |
5 | 5 | 110 | 4 | 2 | 4 |
6 | 6 | 160 | 8 | 5 | 8 |
7 | 7 | 170 | 9 | 6 | 9 |
8 | 8 | 100 | 1 | 1 | 3 |
9 | 9 | 120 | 5 | 3 | 6 |
10 | 10 | 200 | 10 | 7 | 10 |
Conclusion
Today I described about "How to rank data considering duplication in BigQuery".
There are some ranking functions below.
Numbering function in BQ
Each feature is following.
Function | Description |
---|---|
RANK | Same rank in same score. If there are three 1st records,next is 4th. |
DENSE_RANK | Same rank in same score. If there are three 1st records,next is 2nd. |
ROW_NUMBER | Even in same score, it numbers different number. |

ROW_NUMBER
for avoiding duplication.There are some other articles about BigQuey.
If you interested in them, please read them.
Read more