When we work with array data, we often use NumPy.
pandas is also useful for handling tabular data.
Since we often use both, we sometimes need to convert data from one format to the other.
Today I will explain how to convert between a pandas DataFrame and a NumPy ndarray.
What you will learn
You will understand how to convert between a pandas DataFrame and a NumPy ndarray, which makes it easier to move array data between the two libraries.
How to convert Pandas DataFrame to NumPy ndarray
To convert a pandas DataFrame to a NumPy ndarray, you can use the values attribute.
Given a DataFrame df, df.values returns its data as an ndarray.
import numpy as np
import pandas as pd

# dataframe to numpy ndarray
data_list1 = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
col_list1 = ["c1", "c2", "c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)
print(type(df1))
#    c1  c2  c3
# 0   1   2   3
# 1   2   3   4
# 2   3   4   5
# 3   4   5   6
# <class 'pandas.core.frame.DataFrame'>

np_list = df1.values
print(np_list)
print(type(np_list))
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]
#  [4 5 6]]
# <class 'numpy.ndarray'>
However, the pandas documentation does not recommend values.
It recommends using to_numpy() instead.
np_list = df1.to_numpy()
print(np_list)
print(type(np_list))
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]
#  [4 5 6]]
# <class 'numpy.ndarray'>
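One thing worth checking when using to_numpy() is the dtype of the result, because an ndarray holds a single dtype while DataFrame columns can differ. The following is a minimal sketch, assuming a small made-up DataFrame (df_mixed is not part of the example above) with one integer column and one float column. It also uses the dtype and copy arguments of to_numpy().

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one integer column and one float column.
df_mixed = pd.DataFrame({"c1": [1, 2, 3], "c2": [0.5, 1.5, 2.5]})

# With mixed numeric columns, the result uses a common dtype (float64 here).
arr = df_mixed.to_numpy()
print(arr.dtype)  # float64

# The dtype argument requests a specific dtype for the result.
arr32 = df_mixed.to_numpy(dtype="float32")
print(arr32.dtype)  # float32

# copy=True asks for a copy, so changing the array leaves the DataFrame alone.
arr_copy = df_mixed.to_numpy(copy=True)
arr_copy[0, 0] = 100.0
print(df_mixed.loc[0, "c1"])  # 1
```

If you plan to modify the array and want the DataFrame to stay unchanged, passing copy=True is the safer choice.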
How to convert NumPy ndarray to Pandas DataFrame
Next, how can we convert a NumPy ndarray to a pandas DataFrame?
To turn an ndarray into a DataFrame, pass the ndarray as the source data when creating the DataFrame.
# numpy ndarray to dataframe
np_list = np.array(data_list1)
print(np_list)
print(type(np_list))
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]
#  [4 5 6]]
# <class 'numpy.ndarray'>

col_list2 = ["c11", "c22", "c33"]
df2 = pd.DataFrame(data=np_list, columns=col_list2)
print(df2)
print(type(df2))
#    c11  c22  c33
# 0    1    2    3
# 1    2    3    4
# 2    3    4    5
# 3    4    5    6
# <class 'pandas.core.frame.DataFrame'>
This works the same way as converting list data to a DataFrame.
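As a small supplement, here is a sketch of what happens when the column names are left out, and how row labels can be set at the same time. The array and the labels "r1" and "r2" are made up for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3], [2, 3, 4]])

# Without columns, pandas assigns integer labels 0, 1, 2.
df_default = pd.DataFrame(arr)
print(list(df_default.columns))  # [0, 1, 2]

# Row labels can be set through the index argument ("r1", "r2" are hypothetical).
df_labeled = pd.DataFrame(arr, index=["r1", "r2"], columns=["c11", "c22", "c33"])
print(df_labeled.loc["r2", "c33"])  # 4
```

The number of column names has to match the number of columns in the ndarray, and the same applies to index and the number of rows.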
If you want to add an ndarray as a new column, assign it with df["column name"].
np_list2 = np.array([4, 4, 4, 4])
df2["c44"] = np_list2
print(df2)
print(type(df2))
#    c11  c22  c33  c44
# 0    1    2    3    4
# 1    2    3    4    4
# 2    3    4    5    4
# 3    4    5    6    4
# <class 'pandas.core.frame.DataFrame'>
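Note that the ndarray assigned as a column needs to have as many elements as the DataFrame has rows. The sketch below, which uses a small made-up DataFrame, shows one way to see what happens when the lengths differ.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row DataFrame built from an ndarray.
df = pd.DataFrame(np.array([[1, 2], [3, 4], [5, 6]]), columns=["c1", "c2"])

# Three rows, so a three-element array fits.
df["c3"] = np.array([7, 8, 9])

# A two-element array does not match the three rows, so pandas raises ValueError.
error_message = None
try:
    df["c4"] = np.array([1, 2])
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Only the column with the matching length was added.
print(list(df.columns))  # ['c1', 'c2', 'c3']
```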
Conclusion
Today I explained how to convert between a pandas DataFrame and a NumPy ndarray.
We can use the methods below to convert.
How to convert
DataFrame to ndarray: df.values or df.to_numpy()
ndarray to DataFrame: DataFrame(data=nplist)
ndarray as a new column: df["column name"]=nplist
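To see both directions together, here is a minimal round-trip sketch. The variable names df, arr, and df_back are chosen just for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 3, 4]], columns=["c1", "c2", "c3"])

# DataFrame -> ndarray
arr = df.to_numpy()

# ndarray -> DataFrame, reusing the original column names
df_back = pd.DataFrame(arr, columns=df.columns)

# The values survive the round trip.
print(np.array_equal(arr, df_back.to_numpy()))  # True
print(df.equals(df_back))  # True
```

The ndarray itself carries no column names or index, so they have to be passed again when rebuilding the DataFrame.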