I am working with a DataFrame similar to the one below. I need to compare the timestamps of the rows: if rows fall within 1 hour of each other, take the name from a row that has one and fill it into the rows that don't.
Current data:
timestamp name Maths Science History
0 2021-08-09 10:18:48 Anni
1 2021-08-09 10:18:51 89 34
2 2021-08-09 10:19:26 76
3 2021-08-11 12:39:24 43
4 2021-08-11 12:39:45 Jeff 65
5 2021-08-11 12:45:05 Jerry 65
Expected data:
timestamp name Maths Science History
0 2021-08-09 10:18:48 Anni
1 2021-08-09 10:18:51 Anni 89 34
2 2021-08-09 10:19:26 Anni 76
3 2021-08-11 12:39:24 Jeff 43
4 2021-08-11 12:39:45 Jeff 65
5 2021-08-11 12:45:05 Jerry 65
But I cannot work out the logic for this problem. Any ideas?
CodePudding user response:
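Can you try this? The idea is to put each row into a group, start a new group once a timestamp is more than 1 hour after the first timestamp of the current group, and then fill the missing names within each group.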
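import pandas as pd

# Assumes df["timestamp"] is a datetime column sorted ascending
# and that missing names are NaN (not empty strings).
n = 0
first = df["timestamp"].iloc[0]
helper = []
for time in df["timestamp"]:
    # Start a new group once a row is more than 1 hour after the group's first row.
    if time - first > pd.Timedelta("1h"):
        n += 1
        first = time
    helper.append(n)
df["helper"] = helper
# Fill names forward, then backward, within each group only.
df["name"] = df.groupby("helper")["name"].transform(lambda s: s.ffill().bfill())
del df["helper"]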
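If you want to avoid the Python loop, here is a minimal sketch of a vectorized variant. Note that it uses a slightly different grouping rule from the loop above: a new group starts whenever the gap to the previous row exceeds 1 hour, rather than measuring from the first row of each group. The sample frame is rebuilt from the question with only the timestamp and name columns, and it assumes missing names are NaN; `group_id` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; the score columns are left out because only
# timestamp and name matter here. Missing names are assumed to be NaN.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-08-09 10:18:48", "2021-08-09 10:18:51", "2021-08-09 10:19:26",
        "2021-08-11 12:39:24", "2021-08-11 12:39:45", "2021-08-11 12:45:05",
    ]),
    "name": ["Anni", np.nan, np.nan, np.nan, "Jeff", "Jerry"],
})

# The gap-based rule only makes sense on rows sorted by time.
df = df.sort_values("timestamp").reset_index(drop=True)

# A new group starts wherever the gap to the previous row exceeds 1 hour.
# The first diff is NaT, which compares as False, so the first row is in group 0.
group_id = df["timestamp"].diff().gt(pd.Timedelta("1h")).cumsum()

# Forward-fill, then back-fill, the names inside each group only.
df["name"] = df.groupby(group_id)["name"].transform(lambda s: s.ffill().bfill())

print(df)
```

On the sample data this gives the names Anni, Anni, Anni, Jeff, Jeff, Jerry, which matches the expected output. If your empty cells are empty strings rather than NaN, you would need to convert them first, for example with `df["name"].replace("", np.nan)`.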