Home > Enterprise >  Running loop for frequency calculation in Python
Running loop for frequency calculation in Python

Time:10-05

I have the data as shown in the table. I want to use Python. For all the fruits that exist in the year 2016 and 2017, I want the frequencies of country in 2015 for those fruits.

Country Fruit Year
Germany Apple 2015
France Apple 2015
France Apple 2015
Spain Apple 2015
Germany Banana 2015
France Banana 2015
France Apple 2016
Spain Apple 2016
Germany Banana 2016
France Banana 2016
France Banana 2017
France Grapes 2017

The final table I want looks like below:

Fruit Germany France Spain
Apple 1 2 1
Banana 1 1 0
Grapes 0 0 0

CodePudding user response:

Filter by Year and then pivot it with pivot_table:

(df[df.Year == 2015]
  .pivot_table('Year', 'Fruit', 'Country', aggfunc='count')
  .reindex(
    index=df.Fruit.unique(), 
    columns=df.Country.unique()
  ).fillna(0)
  .reset_index())

Country   Fruit  Germany  France  Spain
0         Apple      1.0     2.0    1.0
1        Banana      1.0     1.0    0.0
2        Grapes      0.0     0.0    0.0

Another option is to use crosstab and then select 2015 from the result:

(pd.crosstab(df.Fruit, [df.Country, df.Year])
   .loc[:, pd.IndexSlice[:, 2015]]
   .droplevel(1, 1)
   .reset_index())

Country   Fruit  France  Germany  Spain
0         Apple       2        1      1
1        Banana       1        1      0
2        Grapes       0        0      0

CodePudding user response:

Try:

df_2015 = df[df['Year'] == 2015]
pd.crosstab(df_2015['Fruit'], df_2015['Country']).reindex(df['Fruit'].unique(), fill_value=0)

Output:

Country  France  Germany  Spain
Fruit                          
Apple         2        1      1
Banana        1        1      0
Grapes        0        0      0
  • Related