Dynamic unpivot SQL for multiple columns in Bigquery

Writing each column into unpivot clause is so bothering for me…

BigQuery(BQ) is very useful for data analysis or processing.
It is good for handling huge data. It returns query result in short time.

During data handling, sometimes we would like to change columns to rows.
But in order to use UNPIVOT, we have to write each column name in unpivot clause.

It is bothering.
It is better to set columns dynamically.

Are there any easier way for unpivoting ?

So today I introduce about "Dynamic unpivot SQL for multiple columns in BigQuery".

目次

UNPIVOT in BigQuery

We can use BigQuery UNPIVOT  clause like following.

First, prepare data with multiple columns.

WITH Produce AS (
  SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
  SELECT 'Apple', 77, 0, 25, 2)
SELECT * FROM Produce

+---------+----+----+----+----+
| product | Q1 | Q2 | Q3 | Q4 |
+---------+----+----+----+----+
| Kale    | 51 | 23 | 45 | 3  |
| Apple   | 77 | 0  | 25 | 2  |
+---------+----+----+----+----+

Then we set columns into UNPIVOT operator like IN (Q1, Q2, Q3, Q4).

SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))

+---------+-------+---------+
| product | sales | quarter |
+---------+-------+---------+
| Kale    | 51    | Q1      |
| Kale    | 23    | Q2      |
| Kale    | 45    | Q3      |
| Kale    | 3     | Q4      |
| Apple   | 77    | Q1      |
| Apple   | 0     | Q2      |
| Apple   | 25    | Q3      |
| Apple   | 2     | Q4      |
+---------+-------+---------+

Query syntax | BigQuery | Google Cloud

But in this way, we have to set each column name in SQL.

It takes time.

I need easier way.

Dynamic unpivot SQL for multiple columns in BigQuery

In order to unpivot for multiple columns, we can use function.

The function is fhoffa.x.unpivot().

We can use it like below.

SELECT product, unpivotted
FROM Produce a 
  , UNNEST(fhoffa.x.unpivot(a, 'Q')) 
unpivotted

Then we get the result below.

productunpivotted.keyunpivotted.value
KaleQ151
KaleQ223
KaleQ345
KaleQ43
AppleQ177
AppleQ20
AppleQ325
AppleQ42
Result of fhoffa.x.unpivot()

The second parameter of fhoffa.x.unpivot(a, 'Q') decides unpivot target columns.

In this case, columns that have Q in their name are target columns.
We get same result with Q[1-9].

This function was introduced in 2020.
It is user defined function (UDF).
The source code is opened.
So you can define it in your own project.

UNPIVOT multiple columns into tidy pairs with BigQuery and a SQL UDF

Now we can use same SQL with a lot of columns.

Conclusion

Today I explained about "Dynamic unpivot SQL for multiple columns in BigQuery".

In order to unpivot multiple columns, we can use the UDF fhoffa.x.unpivot().

With using regular expressions, we can pick a lot of columns.

Now we don't have to set each column name for unpivot.

この記事が気に入ったら
いいね または フォローしてね!

If you like this article, please share !
  • URLをコピーしました!
  • URLをコピーしました!

Author

karasanのアバター karasan System engineer

Mid-career engineer (AI, Data science, Salesforce, etc.).
Good at Python and SQL.

目次