I am trying to get first occurence in months and get the column name as a new column in a pandas dataframe
here is the df;
df = pd.DataFrame({"Jan":[0,0,5],
"Feb":[0,0,0],
"Mar":[1,0,0],
"Apr":[8,2,0],
"May":[0,4,10],
"June":[3,2,3]})
Result should be Mar,Apr,Jan
accordingly
I know that I can make it via for loop iterating one by one but seeking for much elegant solution.
CodePudding user response:
If there is always at least one non 0
value per rows use DataFrame.idxmax
with boolean mask for compare not 0
by DataFrame.ne
:
s = df.ne(0).idxmax(axis=1)
print (s)
0 Mar
1 Apr
2 Jan
dtype: object
General solution:
df = pd.DataFrame({"Jan":[0,0,5],
"Feb":[0,0,0],
"Mar":[1,0,0],
"Apr":[8,0,0],
"May":[0,0,10],
"June":[3,0,3]})
print (df)
Jan Feb Mar Apr May June
0 0 0 1 8 0 3
1 0 0 0 0 0 0
2 5 0 0 0 10 3
m = df.ne(0)
s = m.idxmax(axis=1).where(m.any(axis=1))
print (s)
0 Mar
1 NaN
2 Jan
dtype: object