Home > Blockchain >  Return a list of values that match a minimum condition [Pandas]
Return a list of values that match a minimum condition [Pandas]

Time:11-15

I have a dataframe that I'm hoping to return a list of all the values that match the minimum cost per segment. The dataframe looks like this:

Segment Part ID Cost
1 1 $0.5
- - -
1 2 $0.6
1 3 $0.5
1 4 $0.7
2 5 $0.4
2 6 $0.5
2 7 $0.6

Etc.

What I am hoping to end up with is a new dataframe like this:

Segment Part List Min. Cost
1 [1,3] $0.5
2 [5] $0.4

I'm struggling to get this completed. I've tried a few things:

df['Min Segment Price'] = df.groupby(*['Segment']['Cost'].transform(min)

This line correctly adds a column to the full dataframe for what the minimum price is for the Segment.

        min_part_list= df['Part ID'].loc[df['Cost'].eq(df['Cost'].min())].to_list()

Seems like it's only returning the first minimum value for a segment, not all of them.

I also tried this:

        df['Segment Min Part ID']= df['Part ID'].loc[df['Cost'].eq(df['Cost'].min())]

And it's returning the part ID only on the row with the cheapest price for the data frame, not the cheapest price per the segment. I'm unsure how to add in the extra qualification about the segment minimum price.

Thanks!

CodePudding user response:

You can use a double groupby, one to filter, the other to aggregate:

s = pd.to_numeric(df['Cost'].str.strip('$'))

out = (df[s.eq(s.groupby(df['Segment']).transform('min'))]
       .groupby('Segment', as_index=False)
       .agg({'Part ID': list, 'Cost': 'first'})
       )

Output:

   Segment Part ID  Cost
0        1  [1, 3]  $0.5
1        2     [5]  $0.4
  • Related