When transposing a pandas DataFrame that has a named index, the old index name (the label above the first column) shows up as the first entry in the row of column names.
Example:
original DataFrame df_1
 | 1800 | 1801 | 1802 |
---|---|---|---|
country | | | |
Germany | 38.4 | 38.4 | 38.4 |
df_2 = df_1.T
df_2 is
country | Germany |
---|---|
1800 | 38.4 |
1801 | 38.4 |
1802 | 38.4 |
First question. Why is country now shown as the index label for 1800, 1801 etc., and is there a better transpose option to avoid this?
When trying to rename the index with
df_2.index.set_names(["year"],inplace=True)
the following result is shown
country | Germany |
---|---|
year | |
1800 | 38.4 |
1801 | 38.4 |
1802 | 38.4 |
Question two. Why is country still there, and how can it be removed?
CodePudding user response:
country is the name of df_1's index. Naming both axes explicitly and printing the names shows which label belongs to which axis:
df_check = df_1.rename_axis(columns=['year'], index=['country'])
print(df_check.index.name, df_check.columns.name)
So you could drop the index name before transposing:
df_1 = df_1.rename_axis(index=None)
CodePudding user response:
First question. Why is country now shown as the index label for 1800, 1801 etc., and is there a better transpose option to avoid this?
'country' is now the name (label) of the column axis, because it was the name of the index before transposing. Transposing swaps the two axes, and each axis keeps its name.
df_2.index.set_names(["year"],inplace=True)
Question two. Why is country still there, and how can it be removed?
Because you are only changing the index name of the transposed DataFrame. The name of the column axis ('country') is left unchanged.
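A minimal sketch of both points. The constructor call is an assumption, since the question only shows the printed frame, and the index name is set by assigning a renamed Index, which should have the same effect as the in-place set_names call from the question:

```python
import pandas as pd

# Assumed reconstruction of df_1 from the printed table in the question
df_1 = pd.DataFrame(
    [[38.4, 38.4, 38.4]],
    index=pd.Index(["Germany"], name="country"),
    columns=[1800, 1801, 1802],
)

df_2 = df_1.T

# After transposing, the old index name sits on the column axis
before = (df_2.index.name, df_2.columns.name)
print(before)

# Renaming the index touches only the row axis
df_2.index = df_2.index.set_names("year")
after = (df_2.index.name, df_2.columns.name)
print(after)
```

The second tuple still carries 'country' as the column-axis name, which is the extra label shown in the output above.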
How to remove it:
You can use DataFrame.rename_axis to change the name of both axes (index and column) at the same time. If you want to remove the name of a given axis, just pass None.
For instance,
# or df_2 = df_1.T.rename_axis(index='year', columns=None) if you prefer
>>> df_2 = df_1.rename_axis(index=None, columns='year').T
>>> df_2
Germany
year
1800 38.4
1801 38.4
1802 38.4
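For a version that runs on its own, the sketch below rebuilds df_1 by hand (the constructor is an assumption, as the question only shows the printed frame) and checks that both spellings of the rename give the same result:

```python
import pandas as pd

# Assumed reconstruction of df_1; only its printed form appears in the question
df_1 = pd.DataFrame(
    {1800: [38.4], 1801: [38.4], 1802: [38.4]},
    index=pd.Index(["Germany"], name="country"),
)

# Variant 1: rename the axes first, then transpose
df_a = df_1.rename_axis(index=None, columns="year").T

# Variant 2: transpose first, then rename the axes
df_b = df_1.T.rename_axis(index="year", columns=None)

# Both frames have 'year' on the index and no name on the column axis
print(df_a.index.name, df_a.columns.name)
print(df_b.index.name, df_b.columns.name)
print(df_a.equals(df_b))
```

Which variant to use is a matter of taste; the only difference is whether the names are given in terms of the original axes or the transposed ones.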