Home > Back-end >  Extract max and min values of entries in one column based on the combined values of two other column
Extract max and min values of entries in one column based on the combined values of two other column

Time:02-15

I have a csv file that looks something like this:

enter image description here

I'm trying to extract the max and min values from the "Assigned" column based on the combined values of the "Animal ID" and "Farm ID" columns.

Ex. if the "Animal ID" = 10 and "Farm ID" = 10, I need to find its max value (0.23 in this case) and its min value (0.12 in this case) and subtract the two. Then, I want to do the same for when "Animal ID" = 10 and "Farm ID" = 20 and so on. I tried the following, however, I get the following error: KeyError: 'Assigned'. I'm not sure where I'm going wrong. Also, if there's a better way to achieve this than the method I'm using, I'd really appreciate learning about that.

This is what I have done so far:

if df['AnimalID'].all() ==0 and df['Farm ID'].all() ==0:
    print(df.loc['Assigned'].max() - df.loc['Assigned'].min())

Expected Output for Animal ID = 10 and Farm ID = 10: 0.11 Expected Output for Animal ID = 10 and Farm ID = 20: 0.06

CodePudding user response:

Use groupy_apply:

out = df.groupby(['Animal ID', 'Farm ID'], as_index=False)['Assigned'] \
        .apply(lambda x: x.max() - x.min())
print(out)

# Output
   Animal ID  Farm ID  Assigned
0         10       10      0.11
1         10       20      0.06
2         10       30      0.00
  • Related