DB

Dynamic unpivot SQL for multiple columns in Bigquery

Share this for your friends.

Dynamic unpivot SQL for multiple columns in Bigquery

Writing each column into unpivot clause is hard work for me...


BigQuery(BQ) is very useful for data analysis or processing.

It is good at handling huge data. It returns summary result in short time.


Through data handling, sometimes we would like to change columns to rows.

But in order to use UNPIVOT, we have to write each column name.

It is troublesome.

It is better to set columns dynamically.


Are there any easier way for unpivotting?


So today I introduce about "Dynamic unpivot SQL for multiple columns in Bigquery".

Author


Mid-carieer engineer (AI, system). Good at Python and SQL.

Advantage to read

You can understand "Dynamic unpivot SQL for multiple columns in Bigquery". Then you don't have to concern about unpivot.


UNPIVOT in Bigquery

We can use Bigquery's UNPIVOT like following.


First, prepare data with multiple columns.

Data

WITH Produce AS (
  SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
  SELECT 'Apple', 77, 0, 25, 2)
SELECT * FROM Produce

+---------+----+----+----+----+
| product | Q1 | Q2 | Q3 | Q4 |
+---------+----+----+----+----+
| Kale    | 51 | 23 | 45 | 3  |
| Apple   | 77 | 0  | 25 | 2  |
+---------+----+----+----+----+


Then we set columns into UNPIVOT operator like IN (Q1, Q2, Q3, Q4).

Result

SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))

+---------+-------+---------+
| product | sales | quarter |
+---------+-------+---------+
| Kale    | 51    | Q1      |
| Kale    | 23    | Q2      |
| Kale    | 45    | Q3      |
| Kale    | 3     | Q4      |
| Apple   | 77    | Q1      |
| Apple   | 0     | Q2      |
| Apple   | 25    | Q3      |
| Apple   | 2     | Q4      |
+---------+-------+---------+



But in this way, we have to set each column name.

It takes time.


I want easier way.



Dynamic unpivot SQL for multiple columns in Bigquery

In order to unpivot for multiple columns, we can use function.


The function is fhoffa.x.unpivot().

We caan use it like below.

SQL

SELECT product, unpivotted
FROM Produce a 
  , UNNEST(fhoffa.x.unpivot(a, 'Q')) 
unpivotted


Then we get following result.

Result

product unpivotted.key unpivotted.value
Kale Q1 51
Kale Q2 23
Kale Q3 45
Kale Q4 3
Apple Q1 77
Apple Q2 0
Apple Q3 25
Apple Q4 2


The second parameter of fhoffa.x.unpivot(a, 'Q') decides columns.

In this case, columns that have Q in their name are target columns.

We get same result with Q[1-9].


This function was introduced in 2020.

It is user difined function (UDF).

The sourse code is opened.

So you can define it in your own project.


Now we can use same SQL for a lt of columns.



Conclusion

Today I explained about "Dynamic unpivot SQL for multiple columns in Bigquery".


In order to unpivot multiple columns, we can use the UDF fhoffa.x.unpivot().

With using regular expressions, we can pick a lot of columns.


Now we don't have to set each column name.



There are some other articles about BigQuey.

If you interested in them, please read them.


Share this for your friends.

If you felt this article is useful, please share.

にほんブログ村 IT技術ブログへ

-DB
-, ,

© 2023 ITips