Running loop for frequency calculation in Python-CodePudding

I have the data as shown in the table. I want to use Python. For all the fruits that exist in the year 2016 and 2017, I want the frequencies of country in 2015 for those fruits.

Country	Fruit	Year
Germany	Apple	2015
France	Apple	2015
France	Apple	2015
Spain	Apple	2015
Germany	Banana	2015
France	Banana	2015
France	Apple	2016
Spain	Apple	2016
Germany	Banana	2016
France	Banana	2016
France	Banana	2017
France	Grapes	2017

The final table I want looks like below:

Fruit	Germany	France	Spain
Apple	1	2	1
Banana	1	1	0
Grapes	0	0	0

CodePudding user response：

Filter by Year and then pivot it with pivot_table:

(df[df.Year == 2015]
  .pivot_table('Year', 'Fruit', 'Country', aggfunc='count')
  .reindex(
    index=df.Fruit.unique(), 
    columns=df.Country.unique()
  ).fillna(0)
  .reset_index())

Country   Fruit  Germany  France  Spain
0         Apple      1.0     2.0    1.0
1        Banana      1.0     1.0    0.0
2        Grapes      0.0     0.0    0.0

Another option is to use crosstab and then select 2015 from the result:

(pd.crosstab(df.Fruit, [df.Country, df.Year])
   .loc[:, pd.IndexSlice[:, 2015]]
   .droplevel(1, 1)
   .reset_index())

Country   Fruit  France  Germany  Spain
0         Apple       2        1      1
1        Banana       1        1      0
2        Grapes       0        0      0

CodePudding user response：

Try:

df_2015 = df[df['Year'] == 2015]
pd.crosstab(df_2015['Fruit'], df_2015['Country']).reindex(df['Fruit'].unique(), fill_value=0)

Output:

Country  France  Germany  Spain
Fruit                          
Apple         2        1      1
Banana        1        1      0
Grapes        0        0      0