I am working with a large dataset with multiple columns. It looks similar to this:
import pandas as pd
df = pd.DataFrame(
{'Country': ["NL", "NL", "NL", "BE", "BE", "BE"],
"Year": [2010, 2011, 2012, 2010, 2011, 2012],
"Monkeys": [1, 4, 5, 2, 5, 4],
"Giraffes": [40, 50, 60, 20, 30, 35],
"Pigs": [10, 15, 20, 30, 35, 40]
})
Which looks like this:
Country Year Monkeys Giraffes Pigs
0 NL 2010 1 40 10
1 NL 2011 4 50 15
2 NL 2012 5 60 20
3 BE 2010 2 20 30
4 BE 2011 5 30 35
5 BE 2012 4 35 40
I want it to look like this:
Animals Country 2010 2011 2012
0 Monkeys NL 1 4 5
1 Giraffes NL 40 50 60
2 Pigs NL 10 15 20
3 Monkeys BE 2 5 4
4 Giraffes BE 20 30 35
5 Pigs BE 30 35 40
Using pivot_table and pivot methods I have been unable to get this to work. Is there anyone who can help me solve this?
CodePudding user response:
Use DataFrame.set_index
for Multiindex
and reshape by DataFrame.unstack
first and then Series.unstack
:
df = (df.set_index(['Country','Year'])
.rename_axis('Animals', axis=1)
.unstack([0,1])
.unstack()
.reset_index()
.rename_axis(None, axis=1))
print (df)
Animals Country 2010 2011 2012
0 Monkeys BE 2 5 4
1 Monkeys NL 1 4 5
2 Giraffes BE 20 30 35
3 Giraffes NL 40 50 60
4 Pigs BE 30 35 40
5 Pigs NL 10 15 20
CodePudding user response:
A combination of melt
and pivot
should suffice:
(df.melt(id_vars = ['Country', 'Year'], var_name='Animals')
.pivot(index=['Animals', 'Country'], columns='Year', values = 'value')
.reset_index()
.rename_axis(columns=None)
)
Animals Country 2010 2011 2012
0 Giraffes BE 20 30 35
1 Giraffes NL 40 50 60
2 Monkeys BE 2 5 4
3 Monkeys NL 1 4 5
4 Pigs BE 30 35 40
5 Pigs NL 10 15 20