I have the following dataframe (constructed as below):
import pandas as pd
df = pd.DataFrame(data=None,columns=pd.MultiIndex.from_product([['Apple','Banana','Orange'],['Data1','Data2','Data3']]),index=[1])
df.loc[:,:] = [1,2,3,4,5,6,7,8,9]
>>>
  Apple             Banana            Orange
  Data1 Data2 Data3 Data1 Data2 Data3 Data1 Data2 Data3
1     1     2     3     4     5     6     7     8     9
I want to transform this dataframe into the following dataframe (constructed as below):
df = pd.DataFrame(data=[[1,2,3],[4,5,6],[7,8,9]],columns=['Data1','Data2','Data3'],index=['Apple','Banana','Orange'])
>>>
       Data1 Data2 Data3
Apple      1     2     3
Banana     4     5     6
Orange     7     8     9
I am trying to find the most Pythonic way to make this transformation. I have looked into transformations, swapping axes, etc., but I am not sure whether that is the right route. I want to avoid rebuilding the dataframe and instead transform it in one line, or in as few lines of code as possible. Thanks!
As a side note, I could not figure out how to pass the data directly into the first dataframe at construction time (as you can see, I had to add it afterwards). What structure should the data take so that it can be passed in directly? I tried multiple variations of lists, lists of lists, etc. Thanks!
CodePudding user response:
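One possibility is to transpose the dataframe, unstack the inner level, and then drop the leftover column level: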
import pandas as pd
df = pd.DataFrame(data=None,columns=pd.MultiIndex.from_product([['Apple','Banana','Orange'],['Data1','Data2','Data3']]),index=[1])
df.loc[:,:] = [1,2,3,4,5,6,7,8,9]
print(df, '\n\n')
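# Transpose so the (fruit, data) pairs become the row index,
# then move the inner level (Data1..Data3) back into the columns
df = df.T.unstack()
# Drop the leftover outer column level (the original row label 1)
df.columns = df.columns.droplevel()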
print(df, '\n\n')
Output:
  Apple             Banana            Orange
  Data1 Data2 Data3 Data1 Data2 Data3 Data1 Data2 Data3
1     1     2     3     4     5     6     7     8     9

       Data1 Data2 Data3
Apple      1     2     3
Banana     4     5     6
Orange     7     8     9
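On the side question about passing the data in at construction time, here is a minimal sketch. It assumes the nine values are meant to form a single row, so `data` is a list holding one inner list (shape 1 x 9) that lines up with the one-entry index and the nine column pairs. The name `wide` is only illustrative. The reshape is the same transpose/unstack/droplevel sequence as above.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [['Apple', 'Banana', 'Orange'], ['Data1', 'Data2', 'Data3']]
)

# One inner list per row: here a single row with nine values
df = pd.DataFrame(data=[[1, 2, 3, 4, 5, 6, 7, 8, 9]], columns=columns, index=[1])

# Transpose, unstack the inner level, then drop the leftover outer column level
wide = df.T.unstack()
wide.columns = wide.columns.droplevel()
print(wide)
```

Building the frame this way should also let pandas infer an integer dtype up front, rather than starting from an empty frame and assigning into it afterwards.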
CodePudding user response:
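If you want to reorder or slightly change the columns or the rows, you can explicitly build a new DataFrame from the input one:
# Outer column level (fruit names) becomes the new row index
new_index = df.columns.levels[0]
# Inner column level (Data1..Data3) becomes the new columns
new_columns = df.columns.levels[1]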
new_df = pd.DataFrame(
data=[[df.iloc[0][(n, c)] for c in new_columns] for n in new_index],
index=pd.Index(new_index),
columns=new_columns,
)
The result is:
       Data1 Data2 Data3
Apple      1     2     3
Banana     4     5     6
Orange     7     8     9
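To illustrate the reordering this approach allows, here is a small sketch. The particular orderings in `new_index` and `new_columns` are hypothetical, chosen only for the example, and the input frame is rebuilt so the snippet runs on its own. The comprehension simply follows whatever order those two lists are given in.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [['Apple', 'Banana', 'Orange'], ['Data1', 'Data2', 'Data3']]
)
df = pd.DataFrame(data=[[1, 2, 3, 4, 5, 6, 7, 8, 9]], columns=columns, index=[1])

# Hypothetical custom orderings, for illustration only
new_index = ['Orange', 'Apple', 'Banana']
new_columns = ['Data3', 'Data1', 'Data2']

# The single row as a Series keyed by (fruit, data) tuples
row = df.iloc[0]

new_df = pd.DataFrame(
    data=[[row[(n, c)] for c in new_columns] for n in new_index],
    index=new_index,
    columns=new_columns,
)
print(new_df)
```

Pulling the row out once with `df.iloc[0]` avoids repeating that lookup for every cell.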