
pandas.DataFrame
is useful to handle table format data.
With using pandas, we can import CSV or Excel data and join them with common key column.
And sometimes you want to get all combination data without any join condition.
But how can we get such all combination data.
So today I will introduce about "How to get cross join with pandas.DataFrame".
Author

Advantage to read
You can understand about "How to get cross join with pandas.DataFrame". Then you don't have to concern about programming with pandas.DataFrame.
pandas.DataFrame "merge"
In order to join two DataFrame, we can use merge
.
Set two DataFrame and common key as parameters of pd.merge()
.
SAMPLE
import pandas as pd data_list1 = [ ["a",1], ["b",2], ["c",3], ["d",4] ] col_list1 = ["column1-1","key"] df1 = pd.DataFrame(data=data_list1, columns=col_list1) print(df1) # column1-1 key # 0 a 1 # 1 b 2 # 2 c 3 # 3 d 4 data_list2 = [ ["e",1], ["f",2], ["g",3], ["h",4] ] col_list2 = ["column2-1","key"] df2 = pd.DataFrame(data=data_list2, columns=col_list2) print(df2) # column2-1 key # 0 e 1 # 1 f 2 # 2 g 3 # 3 h 4 df_crossjoin = pd.merge( df1, df2, on="key" ) print(df_crossjoin) # column1-1 key column2-1 # 0 a 1 e # 1 b 2 f # 2 c 3 g # 3 d 4 h
Then we can join two DataFrame by common key.
Cross join with merge
Next, in order to get all combination data, can we use merge
without common key ?
SAMPLE
import pandas as pd data_list1 = [ ["a"], ["b"], ["c"], ["d"] ] col_list1 = ["column1-1"] df1 = pd.DataFrame(data=data_list1, columns=col_list1) print(df1) # column1-1 # 0 a # 1 b # 2 c # 3 d data_list2 = [ ["e"], ["f"], ["g"], ["h"] ] col_list2 = ["column2-1"] df2 = pd.DataFrame(data=data_list2, columns=col_list2) print(df2) # column2-1 # 0 e # 1 f # 2 g # 3 h df_crossjoin = pd.merge( df1, df2 ) print(df_crossjoin) # MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False
If we don't set common column in merge
, it shows error like above.
So what can we do ?
How to get cross join with pandas.DataFrame
In order to join DataFrame, we can use merge
.
But merge
needs common key.
So how can we do cross join ?
There is a method that we add dummy common column to both DataFrame.
SAMPLE
import pandas as pd data_list1 = [ ["a"], ["b"], ["c"], ["d"] ] col_list1 = ["column1-1"] df1 = pd.DataFrame(data=data_list1, columns=col_list1) df1["dummy_key"] = 1 print(df1) # column1-1 dummy_key # 0 a 1 # 1 b 1 # 2 c 1 # 3 d 1 data_list2 = [ ["e"], ["f"], ["g"], ["h"] ] col_list2 = ["column2-1"] df2 = pd.DataFrame(data=data_list2, columns=col_list2) df2["dummy_key"] = 1 print(df2) # column2-1 dummy_key # 0 e 1 # 1 f 1 # 2 g 1 # 3 h 1 df_crossjoin = pd.merge( df1, df2, on="dummy_key" ) print(df_crossjoin) # column1-1 dummy_key column2-1 # 0 a 1 e # 1 a 1 f # 2 a 1 g # 3 a 1 h # 4 b 1 e # 5 b 1 f # 6 b 1 g # 7 b 1 h # 8 c 1 e # 9 c 1 f # 10 c 1 g # 11 c 1 h # 12 d 1 e # 13 d 1 f # 14 d 1 g # 15 d 1 h
Add dummy columns with same value.
Then with using dummy columns as common key, we can do cross join.
In addition, if you don't want to show dummy column, you can delete it by drop
.
SAMPLE
df_crossjoin.drop(columns=["dummy_key"], inplace=True) print(df_crossjoin) # column1-1 column2-1 # 0 a e # 1 a f # 2 a g # 3 a h # 4 b e # 5 b f # 6 b g # 7 b h # 8 c e # 9 c f # 10 c g # 11 c h # 12 d e # 13 d f # 14 d g # 15 d h
Conclusion
Today I will described about "How to get cross join with pandas.DataFrame".
Important points are following.
Point
- Use
merge
to join DataFrame merge
requires common key- In order to get cross join, add dummy column as common key

on
parameter.There are some other articles about pandas.Dataframe.
If you interested in them, please read them.
Read more