DB

How to get minus difference in BigQuery

How to get minus difference in BigQuery

How can I get minus difference in BigQuery ?

BigQuery(BQ) is very useful for data analysis or processing.

It is good at handling huge data. It returns summary result in short time.

And also we can use BQ with very cheap price.


Yes. We don't care about price.


Sometimes we want to compare data between 2 tables.

For example, when we change some data procssing program, we must check difference between previous result and modified one.

But How can we get minus difference in BigQuery ?

So today I introduce about "How to get minus difference in BigQuery".

Author


Mid-carieer engineer (AI, system). Good at Python and SQL.

Advantage to read

You can understand "How to get minus difference in BigQuery". Then you don't have to concern about table difference.


Data

First, we have to prepare data.

This case we try to get table difference. So we prepare 2 tables.

Import CSV data below as table minus_sample01.

col1,col2,col3
a,b,c
aa,bb,cc
aaa,bbb,ccc
aaaa,bbbb,cccc

Next, import this data into table minus_sample02.

col1,col2,col3
a,b,c
aa,bb,cc2
aaa2,bbb,ccc
aaaa,bbbb,cccc



Try "MINUS" to get difference

Then how can we get difference between these tables ?


If Oracle, we can use minus ...


So try it.

SELECT * FROM test.minus_sample01
MINUS
SELECT * FROM test.minus_sample02

Result is below.

Syntax error: Expected end of input but got keyword SELECT at [3:1]

Syntax error, it means SQL sentence was wrong.


So we can't use MINUS in BQ.



How to get minus difference in BigQuery


In order to get minus difference in BigQuery, you can use set operator EXCEPT DISTINCT.


You can use it like MINUS clause.

SELECT * FROM test.minus_sample01
EXCEPT DISTINCT
SELECT * FROM test.minus_sample02


Result is like below.

Row col1 col2 col3
1 aa bb cc
2 aaa bbb ccc


You can get rows that only minus_sample01 has.

If you want records that only minus_sample02 has, switch fromt sentence for back sentence.

SELECT * FROM test.minus_sample02
EXCEPT DISTINCT
SELECT * FROM test.minus_sample01

Result is like this.

Row col1 col2 col3
1 aa bb cc2
2 aaa2 bbb ccc

We got unique rows in scond table.



Conclusion

Today I introduced about "How to get minus difference in BigQuery".

In order to get minus difference in BigQuery, we can use method below.

Point

Use set operator EXCEPT DISTINCT.

A + B is UNION ALL, A - B is EXCEPT DISTINCT.



There are some other articles about BigQuey.

If you interested in them, please read them.

If you felt this article is useful, please share.

にほんブログ村 IT技術ブログへ

-DB
-

© 2023 ITips