Choosing values from pandas column with the lowest value-CodePudding

I'm reading a df from csv that has 2 columns showing the prices of various items. In some cases the price is a single int/float, but other cases it could be a range of spaces seperated int/floats or mixture of int/floats with strings.

example df:

  item            prices            
 ------ --------------------------- 
  a      2                          
  b      3.5                        
  c      5                          
  d      0.04                       
  e      1 8 3 4 2                  
  f      0.04 0.04 0.01             
  g      Normal: 4.56Premium: 4.75

What I'm looking is a nice pythonic way to get the prices column to display the lowest possible int/float value for every item. e.g.

  item   prices  
 ------ -------- 
  a           2  
  b         3.5  
  c           5  
  d        0.04  
  e           1  
  f        0.01  
  g        4.56

The only way I could think of solving this problem for items e and f would be to split the value by using str.split(" ") and mapping the output to int or float, but this seems like it would be messy as not all values are the same types. And I don't even know how I would get the lowest value for item g.

any help would be appreciated

CodePudding user response：

Use Series.str.extractall for get integers or floats, convert to floats and get minimal values:

df['prices'] = (df['prices'].str.extractall('(\d \.\d |\d )')[0]
                            .astype(float)
                            .groupby(level=0)
                            .min())
print (df)


  item  prices
0    a    2.00
1    b    3.50
2    c    5.00
3    d    0.04
4    e    1.00
5    f    0.01
6    g    4.56