Python Pandas: How to select the index of the min from one column and use that to select index for o

Time:07-21

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'a':[1,2,2,3,3,3,4,4,4], 'b':[-0.1,-0.2,-0.1,-0.1,-0.005,-0.3,0,-0.9,-0.6],'name':['fast','slow1','slow2','slow1','fast1','fast1','slow','fast','slow1']})

Output

    a     b      name
0   1   -0.100  fast
1   2   -0.200  slow1
2   2   -0.100  slow2
3   3   -0.100  slow1
4   3   -0.005  fast1
5   3   -0.300  fast1
6   4   0.000   slow
7   4   -0.900  fast
8   4   -0.600  slow1

I am grouping it by column "a" and taking the minimum of column "b":

df.groupby(by=["a"]).agg({"b":"min"})
    b
a   
1   -0.1
2   -0.2
3   -0.3
4   -0.9

How do I also select the "name" value from the row where "b" is smallest in each group? The result I am trying to get is this:

    b     name
a   
1   -0.1  fast
2   -0.2  slow1
3   -0.3  fast1
4   -0.9  fast

I tried using the "apply" method but for large dataframes it was getting really slow. Is there a way to use the "agg" function here?

CodePudding user response:

One approach, using idxmin:

# idxmin returns an index label, so look the name up with .loc rather than .iloc
res = df.groupby(by=["a"]).agg({"b": ["min", pd.NamedAgg(column="name", aggfunc=lambda x: df["name"].loc[x.idxmin()])]})
print(res)

Output

     b       
   min   name
a            
1 -0.1   fast
2 -0.2  slow1
3 -0.3  fast1
4 -0.9   fast
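
One thing worth checking with this approach: idxmin returns index labels, not positions, so a label-based lookup (.loc) is the safer choice when the frame does not have the default 0..n-1 index. A minimal sketch, using a made-up frame called demo with a hypothetical index of 10, 20, 30:

```python
import pandas as pd

# Hypothetical frame with a non-default index (labels 10, 20, 30).
demo = pd.DataFrame(
    {"a": [1, 1, 2], "b": [-0.1, -0.5, -0.3], "name": ["x", "y", "z"]},
    index=[10, 20, 30],
)

# idxmin gives the index LABEL of the smallest "b" in each group.
labels = demo.groupby("a")["b"].idxmin()

# Label-based lookup works regardless of what the index looks like.
names = demo.loc[labels, "name"].tolist()

print(labels.tolist())
print(names)
```

With this index, a positional lookup such as demo["name"].iloc[20] would be out of bounds, which is why .loc is used here.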

CodePudding user response:

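For this particular data, groupby.min() happens to return the expected output. Note, however, that it takes the minimum of each column independently, so "name" is the alphabetically smallest name in each group rather than the name on the row with the smallest "b". Do not rely on it in general.

df.groupby('a').min()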
Out[250]: 
     b   name
a            
1 -0.1   fast
2 -0.2  slow1
3 -0.3  fast1
4 -0.9   fast
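
A quick way to see how groupby.min() treats each column on its own is to try it on data where the smallest "b" and the alphabetically smallest "name" sit on different rows. The frame below (demo) is made up for illustration and is not from the question:

```python
import pandas as pd

# Hypothetical data: in group 1 the smallest "b" is on the row named "zeta",
# but "alpha" sorts first alphabetically.
demo = pd.DataFrame({"a": [1, 1], "b": [-0.5, -0.1], "name": ["zeta", "alpha"]})

# Column-wise minimum: "b" and "name" are minimised independently.
col_min = demo.groupby("a").min()

# Row-wise selection: take the whole row where "b" is smallest.
row_min = demo.loc[demo.groupby("a")["b"].idxmin()].set_index("a")

print(col_min)
print(row_min)
```

Here col_min pairs -0.5 with "alpha", a combination that never appears in the data, while row_min keeps -0.5 together with "zeta".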

CodePudding user response:

You can do:

df.loc[df.groupby('a')['b'].idxmin()].set_index('a')

output:

     b   name
a            
1 -0.1   fast
2 -0.2  slow1
3 -0.3  fast1
4 -0.9   fast
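
For reference, here is the idxmin approach as a self-contained script on the question's data, next to one possible alternative that sorts by "b" and keeps the first row per group. The variable names idx, res and alt are only illustrative, and no claim is made about which variant is faster on large frames:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 2, 3, 3, 3, 4, 4, 4],
    "b": [-0.1, -0.2, -0.1, -0.1, -0.005, -0.3, 0, -0.9, -0.6],
    "name": ["fast", "slow1", "slow2", "slow1", "fast1", "fast1", "slow", "fast", "slow1"],
})

# Index label of the row with the smallest "b" in each group of "a".
idx = df.groupby("a")["b"].idxmin()
res = df.loc[idx].set_index("a")

# Alternative sketch: sort by "b", then keep the first row seen for each "a".
alt = df.sort_values("b").drop_duplicates("a").set_index("a").sort_index()

print(res)
print(alt)
```

If a group has several rows tied for the smallest "b", idxmin picks the first of them.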