Home > Mobile >  How to group by a set of values from two different dataframes
How to group by a set of values from two different dataframes

Time:12-14

I have two dataframes,

df1                       df2
country                   country
US                        AR
US                        AD
CA                        AO
CN                        AU
AR                        US

How do I group by them by combining the country list to a set the compare the difference between two dataframes?

My expected output will be like,

country code   df1_country_count   df2_country_count
AR                   1                    1
AD                   0                    1
AO                   0                    1
AU                   0                    1
US                   2                    1 
CA                   1                    0
CN                   1                    0

CodePudding user response:

(df1.value_counts().to_frame('df1_country_count')
 .join(df2.value_counts().to_frame('df2_country_count'), how='outer')
 .fillna(0).astype('int').rename_axis('country code'))

result:

               df1_country_count    df2_country_count
country code        
AD             0                    1
AO             0                    1
AR             1                    1
AU             0                    1
CA             1                    0
CN             1                    0
US             2                    1

CodePudding user response:

You can use value_counts and then concat.

out = pd.concat([df1.country.value_counts(), 
           df2.country.value_counts()], axis=1).fillna(0).astype(int)
out.columns = ['df1_country', 'df2_country']
print(out)

    df1_country  df2_country
US            2            1
CA            1            0
CN            1            0
AR            1            1
AD            0            1
AO            0            1
AU            0            1
  • Related