Home > Net >  Iterating through columns and get back column name when reaching certain figure
Iterating through columns and get back column name when reaching certain figure

Time:02-25

I have a Python Data Frame with different product classes counted and accumulated (cumsum) I would like to create a new column on my dataframe, that shows the Column Name of the first element of each row (left to right) that gets to an accumulated figure of 10, following the below example:

df2 = pd.DataFrame(np.array([['NY',2, 4, 8,10,15,20,36], ['LA',4, 5, 6,7,11,25,29,], ['BC',7, 8, 9,10,19,29,45]]), columns=['CITY','a', 'b', 'c','d','e','f','g'])

df2

expected_result = pd.DataFrame(np.array([['NY','d'], ['LA','e'], ['BC','d']]),
                   columns=['CITY','min'])

expected_result

Thank you so much for helping out on this!!!

Francisco

CodePudding user response:

For first match greater or equal by 10 compare values by DataFrame.ge and use DataFrame.idxmax:

df = df2.set_index('CITY').astype(int).ge(10).idxmax(axis=1).reset_index(name='min')
print (df)
  CITY min
0   NY   d
1   LA   e
2   BC   d

Solution above failed if no value greater or equal, then need test if exist at least match:

df2 = pd.DataFrame(np.array([['NY',2, 4, 8,10,15,20,36],
                             ['LA',4, 5, 6,7,11,25,29,], 
                             ['BC',1, 2, 3,4,5,6,9]]), 
                   columns=['CITY','a', 'b', 'c','d','e','f','g'])


m = df2.set_index('CITY').astype(int).ge(10)

df = m.idxmax(axis=1).where(m.any(axis=1), 'no match').reset_index(name='min')
print (df)
  CITY       min
0   NY         d
1   LA         e
2   BC  no match
  • Related