I have the following dataframe:
df = pd.DataFrame({"id": ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C','C','C','C'],
"date": [2015, 2016, 2017, 2018,
2015, 2016, 2017, 2018,
2015, 2016, 2017, 2018],
"col_1": [1,1,1,0,1,0,1,1,0,1,1,1]})
I want to add a new col that should be a dummy 1 for a group if "col_1" has three consecutive 1s. The resulting data-frame should be:
df_new = pd.DataFrame({"id": ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C','C','C','C' ],
"date": [2015, 2016, 2017, 2018,
2015, 2016, 2017, 2018,
2015, 2016, 2017, 2018],
"col_1": [1,1,1,0,1,0,1,1,0,1,1,1],
"col_2": [1,1,1,1,0,0,0,0,1,1,1,1]})
CodePudding user response:
Try this
df = pd.DataFrame({"id": ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C','C','C','C'],
"date": [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
"col_1": [1,1,1,0,1,0,1,1,0,1,1,1]})
# identify 3 consecutive 1s
v = df.groupby('id')['col_1'].rolling(3).sum().eq(3)
# identify which ids contain consecutive 1s
ids = next(zip(*v[v].index))
# assign new column
df['col_2'] = df['id'].isin(ids).astype(int)
print(df)
id date col_1 col_2
0 A 2015 1 1
1 A 2016 1 1
2 A 2017 1 1
3 A 2018 0 1
4 B 2015 1 0
5 B 2016 0 0
6 B 2017 1 0
7 B 2018 1 0
8 C 2015 0 1
9 C 2016 1 1
10 C 2017 1 1
11 C 2018 1 1
CodePudding user response:
You can do it like this also.
df['col_2'] = (df.groupby('id')['col_1']
.transform(lambda x: x.rolling(3).sum().eq(3).any())
.astype(int))
df
Output:
id date col_1 col_2
0 A 2015 1 1
1 A 2016 1 1
2 A 2017 1 1
3 A 2018 0 1
4 B 2015 1 0
5 B 2016 0 0
6 B 2017 1 0
7 B 2018 1 0
8 C 2015 0 1
9 C 2016 1 1
10 C 2017 1 1
11 C 2018 1 1