I have the following dataframe:
import pandas as pd
d = {'Stages': ['Stage 1', 'Stage 2', 'Stage 2', 'Stage 2', 'Stage 3', 'Stage 1'], 'Start(s)': [0, 630, 780, 840, 900, 930], 'End(s)': [630, 780, 840, 900, 930, 960]}
df = pd.DataFrame(data=d)
    Stages  Start(s)  End(s)
0  Stage 1         0     630
1  Stage 2       630     780
2  Stage 2       780     840
3  Stage 2       840     900
4  Stage 3       900     930
5  Stage 1       930     960
I would like to get the index where Stage 2 first appears in the "Stages" column. In this example, it would be 1.
I tried reading discussions on similar problems, but couldn't get any of the suggestions to work.
CodePudding user response:
If at least one Stage 2 row always exists, compare the column with the value and use Series.idxmax, which returns the index of the first True:
print (df['Stages'].eq('Stage 2').idxmax())
1
If the value might not exist, like Stage 8, use the next with iter trick, which lets you supply a default:
print (next(iter(df.index[df['Stages'].eq('Stage 8')]), 'not exist'))
not exist
print (next(iter(df.index[df['Stages'].eq('Stage 2')]), 'not exist'))
1
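To make the trick easier to follow: df.index[mask] holds the labels of all matching rows, iter turns them into an iterator, and next takes the first one or falls back to the default when there is none. A minimal sketch of the same idea wrapped in a function, where the name first_index and its default parameter are only illustrative and not part of pandas:

```python
import pandas as pd

df = pd.DataFrame({'Stages': ['Stage 1', 'Stage 2', 'Stage 2',
                              'Stage 2', 'Stage 3', 'Stage 1']})

def first_index(series, value, default=None):
    # Hypothetical helper: labels of all rows equal to `value`
    matches = series.index[series.eq(value)]
    # First label if any match exists, otherwise the default
    return next(iter(matches), default)

found = first_index(df['Stages'], 'Stage 2')
missing = first_index(df['Stages'], 'Stage 8', default='not exist')
print(found)
print(missing)
```

Note that the result is an index label, so with a non-default index it is the label of the row rather than its position.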
The default is needed because, if no value matches, idxmax returns the index of the first False value:
print (df['Stages'].eq('Stage 8').idxmax())
0
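If you still prefer idxmax, one way around this pitfall is to check the mask with any() first. This is only a sketch, and the function name first_match_or_none is an assumption made for illustration:

```python
import pandas as pd

stages = pd.Series(['Stage 1', 'Stage 2', 'Stage 2',
                    'Stage 2', 'Stage 3', 'Stage 1'])

def first_match_or_none(series, value):
    # Boolean mask, True where the row equals `value`
    mask = series.eq(value)
    # Only trust idxmax when at least one True exists
    return mask.idxmax() if mask.any() else None

hit = first_match_or_none(stages, 'Stage 2')
miss = first_match_or_none(stages, 'Stage 8')
print(hit)
print(miss)
```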
Another idea is to replace the non-matching values with missing values using Series.where, then get the index of the first non-missing value with Series.first_valid_index:
print (df['Stages'].where(df['Stages'].eq('Stage 8')).first_valid_index())
None
print (df['Stages'].where(df['Stages'].eq('Stage 2')).first_valid_index())
1
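To see why this works, it may help to look at the intermediate result: where keeps the values for which the condition is True and puts NaN everywhere else, so first_valid_index lands on the first match, or returns None if everything is NaN. A small sketch with the same data (the variable names are illustrative):

```python
import pandas as pd

stages = pd.Series(['Stage 1', 'Stage 2', 'Stage 2',
                    'Stage 2', 'Stage 3', 'Stage 1'])

# Non-matching rows become NaN, matching rows keep their value
masked = stages.where(stages.eq('Stage 2'))
print(masked)

# Label of the first non-missing value, or None if there is none
first = masked.first_valid_index()
absent = stages.where(stages.eq('Stage 8')).first_valid_index()
print(first)
print(absent)
```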