I have two dataframe say df1 (primary dataframe) and df2. I want to drop all previous rows from df1 based on a condition from df2. My dataframe are like below:
df2
tradingsymbol Time
0 AAAA 12:54:40
1 BBBB 12:53:33
2 CCCC 12:51:50
df1.head(20)
tradingsymbol Time last_price
0 AAAA 09:20:10 84.40
1 AAAA 09:20:10 85.95
2 AAAA 12:55:60 84.70 <-Valid Row
3 AAAA 13:22:10 86.35 <-Valid Row
4 AAAA 14:55:40 87.10 <-Valid Row
5 BBBB 09:20:13 88.95
6 BBBB 09:20:13 88.80
7 BBBB 09:20:14 88.30
8 BBBB 14:23:11 87.30 <-Valid Row
9 CCCC 09:20:15 90.15
10 CCCC 09:20:16 90.10
11 CCCC 09:20:17 91.05
12 CCCC 09:20:18 90.95
I want to remove all rows from df1 previous to time in Time column of df2 for each tradingsymbol. I want my result as below:
tradingsymbol Time last_price
2 AAAA 12:55:60 84.70
3 AAAA 13:22:10 86.35
4 AAAA 14:55:40 87.10
8 BBBB 14:23:11 87.30
CodePudding user response:
You can use pd.concat
and sort values to put remove flags.
Code
import io
import numpy as np
import pandas as pd
# Sample creation
s1 = '''tradingsymbol,Time,last_price
AAAA,09:20:10,84.40
AAAA,09:20:10,85.95
AAAA,12:55:60,84.70
AAAA,13:22:10,86.35
AAAA,14:55:40,87.10
BBBB,09:20:13,88.95
BBBB,09:20:13,88.80
BBBB,09:20:14,88.30
BBBB,14:23:11,87.30
CCCC,09:20:15,90.15
CCCC,09:20:16,90.10
CCCC,09:20:17,91.05
CCCC,09:20:18,90.95'''
s2 = '''tradingsymbol,Time
AAAA,12:54:40
BBBB,12:53:33
CCCC,12:51:50'''
df1 = pd.read_csv(io.StringIO(s1), dtype={'last_pirce': np.float64})
df1.Time = pd.to_datetime(df1.Time, format='%H:%M:%S').dt.time
df2 = pd.read_csv(io.StringIO(s2))
df2.Time = pd.to_datetime(df2.Time, format='%H:%M:%S').dt.time
# Operations to remove specific rows
df = pd.concat([df1, df2], axis=0).sort_values(['tradingsymbol', 'Time'], ascending=[True, False])
df['flag'] = df.last_price.isnull()
df.flag = df.groupby('tradingsymbol').flag.cumsum()
df = df[df.flag==0].sort_values(['tradingsymbol', 'Time']).drop('flag', axis=1)
Output
tradingsymbol | Time | last_price | |
---|---|---|---|
2 | AAAA | 12:56:00 | 84.7 |
3 | AAAA | 13:22:10 | 86.35 |
4 | AAAA | 14:55:40 | 87.1 |
8 | BBBB | 14:23:11 | 87.3 |