For example, we have data on cities and the population in them in different years.
City | 2000 | 2005 | 2010 | 2015 |
---|---|---|---|---|
City_1 | 10 | 11 | 12 | 13 |
City_2 | 15 | 14 | 13 | 12 |
City_3 | 20 | 30 | 25 | 35 |
I need to get the DataFrame in this form:
City | Year | Population |
---|---|---|
City_1 | 2000 | 10 |
City_1 | 2005 | 11 |
City_1 | 2010 | 12 |
City_1 | 2015 | 13 |
City_2 | 2000 | 15 |
City_2 | 2005 | 14 |
City_2 | 2010 | 13 |
City_2 | 2015 | 12 |
City_3 | 2000 | 20 |
City_3 | 2005 | 30 |
City_3 | 2010 | 25 |
City_3 | 2015 | 35 |
How can this be done using Python?
CodePudding user response:
Simply use pandas.melt()
and then (how you wanted) put the results in order by city:
This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.
import pandas as pd
df = pd.read_csv('csv.csv')
df = pd.melt(df, id_vars=['City'], var_name='Year', value_name='Population')
df = df.sort_values('City').set_index('City') # Sort `City` values so that they go together and set the index back to `City`
print(df)
Year Population
City
City_1 2000 10
City_1 2005 11
City_1 2010 12
City_1 2015 13
City_2 2000 15
City_2 2005 14
City_2 2010 13
City_2 2015 12
City_3 2000 20
City_3 2005 30
City_3 2010 25
City_3 2015 35
CodePudding user response:
I think you are looking for the melt function. It's basically a reverse pivot that converts your data to a wide long format.
df_unpivoted = df.melt(id_vars=['City'], var_name='Year', value_name='Population')
print(df_unpivoted)
You can read the documentation here if you want to learn more.
https://pandas.pydata.org/docs/reference/api/pandas.melt.html
CodePudding user response:
unstack:
df.set_index('City').unstack().reset_index().sort_values(by='City')
Output:
level_0 City 0
0 2000 City_1 10
3 2005 City_1 11
6 2010 City_1 12
9 2015 City_1 13
1 2000 City_2 15
4 2005 City_2 14
7 2010 City_2 13
10 2015 City_2 12
2 2000 City_3 20
5 2005 City_3 30
8 2010 City_3 25
11 2015 City_3 35