I am working through a time series DataFrame in pandas. When an event happens, I want the column I am calculating to store the date from another column. If the event is not present, I want it to use whatever was in the cell above in the same column. I have tried the following, but I cannot get it to work:
DB['TimeFlag'] = np.where(DB['Event'] == 10,DB['Time'],DB['TimeFlag'].shift())
I understand it is referencing itself, but it's referencing a cell that should have already been calculated. This feels like there should be a very simple solution, but I cannot find one. It is very easy to do in Excel, and I included a visual of what I want the code to do. Any suggestions?
CodePudding user response:
How about using ffill()? Write the time only on the event rows, leave every other row as NaN, then forward-fill the gaps:
import pandas as pd
import numpy as np
DB = pd.DataFrame({'Time': ['12:00', '12:01', '12:02', '12:03', '12:04', '12:05'],
                   'Event': [1, 0, 0, 0, 1, 0]})
# Copy Time on event rows; leave NaN everywhere else
DB['TimeFlag'] = np.where(DB['Event'] == 1, DB['Time'], np.nan)
# Fill each NaN with the last non-missing value above it
DB['TimeFlag'] = DB['TimeFlag'].ffill()
print(DB)
Output:
    Time  Event TimeFlag
0  12:00      1    12:00
1  12:01      0    12:00
2  12:02      0    12:00
3  12:03      0    12:00
4  12:04      1    12:04
5  12:05      0    12:04
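The same idea can also be written without np.where by chaining Series.where and ffill. Below is a minimal sketch, assuming the event is marked by Event == 10 as in the question; the sample data is made up for illustration and is not the asker's real data.

```python
import pandas as pd

# Hypothetical sample data; Event == 10 marks the event, as in the question.
DB = pd.DataFrame({
    'Time': ['12:00', '12:01', '12:02', '12:03', '12:04', '12:05'],
    'Event': [10, 0, 0, 0, 10, 0],
})

# Series.where keeps Time where the condition holds and leaves a missing
# value elsewhere; ffill then copies the last event time down into the gaps.
DB['TimeFlag'] = DB['Time'].where(DB['Event'] == 10).ffill()

print(DB)
```

This avoids the self-reference in the original attempt: the column is built in two whole-column passes instead of row by row. Any rows that come before the first event have nothing above them to fill from, so they stay missing.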