Home > Back-end >  find first occurence and get column name in columns pandas dataframe
find first occurence and get column name in columns pandas dataframe

Time:12-11

I am trying to get first occurence in months and get the column name as a new column in a pandas dataframe

here is the df;

df = pd.DataFrame({"Jan":[0,0,5],
               "Feb":[0,0,0],
               "Mar":[1,0,0],
               "Apr":[8,2,0],
               "May":[0,4,10],
               "June":[3,2,3]})

Result should be Mar,Apr,Jan accordingly

I know that I can make it via for loop iterating one by one but seeking for much elegant solution.

CodePudding user response:

If there is always at least one non 0 value per rows use DataFrame.idxmax with boolean mask for compare not 0 by DataFrame.ne:

s = df.ne(0).idxmax(axis=1)
print (s)
0    Mar
1    Apr
2    Jan
dtype: object

General solution:

df = pd.DataFrame({"Jan":[0,0,5],
               "Feb":[0,0,0],
               "Mar":[1,0,0],
               "Apr":[8,0,0],
               "May":[0,0,10],
               "June":[3,0,3]})
print (df)  
   Jan  Feb  Mar  Apr  May  June
0    0    0    1    8    0     3
1    0    0    0    0    0     0
2    5    0    0    0   10     3

m = df.ne(0)
s = m.idxmax(axis=1).where(m.any(axis=1))

print (s)
0    Mar
1    NaN
2    Jan
dtype: object
  • Related