I have a DataFrame that looks something like this, but longer:
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 2, 7, 10, 15, 16, 77, 98, 999, 1000, 1121, 1245, 1373, 1490, 1555],
    'ID': ['1', '1', '1', '1', '1', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3'],
    'Act': ['1', '2', '4', '1', '2', '0', '2', '4', '3', '1', '4', '0', '3', '1', '2']})
The Act values range from 0-4. I would like to find the first zero following a 4 and then take the "Time" column value for that zero. If there are two 4s in a row and only then a zero, I am only interested in the 4 closest to the zero.
For the above example I would like the values 16 and 1245, appended to a vector.
Thank you!
CodePudding user response:
You can form groups starting with 4 and get the first 0 of each:
# make groups starting with a 4
group = df['Act'].eq('4').cumsum()
# identify rows with a 0
m = df['Act'].eq('0')
# get first 0 of each group
df.loc[m, 'Time'].groupby(group).first()
output:
Act
1      16
3    1245
Name: Time, dtype: int64
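To see why this picks the 4 closest to each zero, it can help to look at the group labels themselves. The snippet below is a small illustration on the question's data (the name `group_labels` is only for this example): every 4 bumps the counter, so two 4s in a row start two separate groups and a later zero lands in the group of the most recent 4.

```python
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 2, 7, 10, 15, 16, 77, 98, 999, 1000, 1121, 1245, 1373, 1490, 1555],
    'Act': ['1', '2', '4', '1', '2', '0', '2', '4', '3', '1', '4', '0', '3', '1', '2'],
})

# Each row gets the number of 4s seen so far (including the row itself).
group = df['Act'].eq('4').cumsum()

# Plain list of labels, one per row, for easy inspection.
group_labels = group.tolist()
print(group_labels)
# [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
```

The zeros at Time 16 and 1245 fall in groups 1 and 3; group 2 contains no zero, so it contributes nothing to the result.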
as a vector:
a = df.loc[m, 'Time'].groupby(group).first().to_numpy()
output: array([ 16, 1245])
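One edge case worth checking: if the longer data can contain a zero before the first 4, that zero sits in group 0 and would also be returned. A minimal sketch that skips it, assuming that is the behaviour you want (`first_zero_after_four` is a hypothetical helper name, not part of pandas):

```python
import pandas as pd

def first_zero_after_four(frame):
    """Return the Time of the first '0' after each '4' as a NumPy array."""
    # Number of 4s seen so far; 0 means no 4 has occurred yet.
    group = frame['Act'].eq('4').cumsum()
    # Keep only zeros that come after at least one 4.
    m = frame['Act'].eq('0') & group.gt(0)
    # First zero within each group that has one.
    return frame.loc[m, 'Time'].groupby(group[m]).first().to_numpy()

df = pd.DataFrame({
    'Time': [1, 2, 7, 10, 15, 16, 77, 98, 999, 1000, 1121, 1245, 1373, 1490, 1555],
    'Act': ['1', '2', '4', '1', '2', '0', '2', '4', '3', '1', '4', '0', '3', '1', '2'],
})
result = first_zero_after_four(df)

# Made-up data: a leading zero, two 4s in a row, then two zeros.
df2 = pd.DataFrame({'Time': [5, 6, 7, 8, 9],
                    'Act': ['0', '4', '4', '0', '0']})
result2 = first_zero_after_four(df2)
```

On the question's data `result` holds 16 and 1245; on `df2` only 8 is kept, since the zero at Time 5 precedes any 4 and the zero at Time 9 is not the first one after the last 4.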