Home > Software engineering >  Pandas groupby two columns, one by row and another by column
Pandas groupby two columns, one by row and another by column

Time:03-07

I have a csv file that contains n rows of sales of houses.

House House_type Sale_year
One Semi 2010
two Flat 2011
three bungalow 2012
four Semi 2013
five Semi 2013

I want to groupby the data by House_type (flat, bungalow, semi) by sale_year (2010,2011,etc) counts as columns. So I'm trying to output the data in the below format.

House_type 2010 2011 2012 2013
Semi 1 0 0 2
Flat 0 1 0 0
bungalow 0 0 1 0

However, when I run the code, it returns both House_type and Sale_year as two columns.

house= housedata.groupby(["House_type", "Sale_year"])["Sale_year"].count()
house

House_type               Sale_year
Flat                     2011.0          1
bungalow                 2012.0          1
Semi                     2010.0          1
                         2013.0          2

How do I get pandas to output the data desired?

Many thanks

CodePudding user response:

You can achieve the same using get_dummies method of pandas. It basically creates multiple columns for a categorical column and fills it with values.

df = pd.DataFrame({'House_type':['Semi','Flat','Bungalow','Semi','Semi'],'sale_year':[2010,2011,2012,2013,2013]})
df_final = pd.get_dummies(df,columns=['sale_year']).groupby('House_type').sum()
df_final

CodePudding user response:

You can use pivot_table here:

result = pd.pivot_table(df, index='House_type', columns='Sale_year',
                        aggfunc='count', fill_value=0)

It gives directly:

           House               
Sale_year   2010 2011 2012 2013
House_type                     
Flat           0    1    0    0
Semi           1    0    0    2
bungalow       0    0    1    0

You can format it a little if you want:

result.columns = result.columns.get_level_values(1).rename(None)
result = result.reset_index()

to finally get:

  House_type  2010  2011  2012  2013
0       Flat     0     1     0     0
1       Semi     1     0     0     2
2   bungalow     0     0     1     0
  • Related