I am working with dataframes of different sizes. In a boolean column, I need to change every True to False whenever that True lies between two False values in the same column.
Here is an example of one of the dataframes:
df =
index DATE S_N A timestamp delta time \
0 7 2021-01-05 78 4 2021-01-05 NaT NaN
1 8 2021-01-07 78 3 2021-01-07 2 days 48.0
2 9 2021-01-08 78 3 2021-01-08 1 days 24.0
3 10 2021-01-10 78 3 2021-01-10 2 days 48.0
4 11 2021-01-11 78 6 2021-01-11 1 days 24.0
5 12 2021-01-12 78 5 2021-01-12 1 days 24.0
6 13 2021-01-16 78 4 2021-01-16 4 days 96.0
7 14 2021-01-17 78 4 2021-01-17 1 days 24.0
8 15 2021-01-22 78 3 2021-01-22 5 days 120.0
9 16 2021-01-24 78 3 2021-01-24 2 days 48.0
label_number_hours
0 True
1 True
2 False
3 True
4 False
5 False
6 True
7 False
8 True
9 True
This is what I am looking for:
df1 =
index DATE S_N A timestamp delta time \
0 7 2021-01-05 78 4 2021-01-05 NaT NaN
1 8 2021-01-07 78 3 2021-01-07 2 days 48.0
2 9 2021-01-08 78 3 2021-01-08 1 days 24.0
3 10 2021-01-10 78 3 2021-01-10 2 days 48.0
4 11 2021-01-11 78 6 2021-01-11 1 days 24.0
5 12 2021-01-12 78 5 2021-01-12 1 days 24.0
6 13 2021-01-16 78 4 2021-01-16 4 days 96.0
7 14 2021-01-17 78 4 2021-01-17 1 days 24.0
8 15 2021-01-22 78 3 2021-01-22 5 days 120.0
9 16 2021-01-24 78 3 2021-01-24 2 days 48.0
label_number_hours
0 True
1 True
2 False
3 False
4 False
5 False
6 False
7 False
8 True
9 True
This is my code:
df1 = df.duplicated(subset='label_number_hours')
This is the result I got:
0 False
1 True
2 False
3 True
4 True
5 True
6 True
7 True
8 True
9 True
I would like the output to match df1 above.
I would really appreciate your help.
CodePudding user response:
You can find the indices of all False values with np.where and overwrite the boolean values between the first and the last one:
import pandas as pd
import numpy as np

df = pd.DataFrame([True, False, True, False, True, False, True], columns=["label_number_hours"])
df["some_other_column"] = np.random.rand(df.shape[0])

# Positional indices of every False in the column
falses_idx, = np.where(~df["label_number_hours"])
if falses_idx.size > 0:
    # Overwrite everything from the first False up to the last False
    df.iloc[falses_idx[0]:falses_idx[-1], df.columns.get_loc("label_number_hours")] = False
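The same idea can also be written without positional slicing, which may be handy when the frame has a non-default index. The following is only a sketch: the helper name clear_true_between_false is made up for illustration, and it assumes the column holds plain booleans with no missing values. It marks a row as "enclosed" when there is a False at or before it and a False at or after it, then clears those rows. The sample column is the one from the question.

```python
import numpy as np
import pandas as pd


def clear_true_between_false(labels: pd.Series) -> pd.Series:
    """Hypothetical helper: set True to False wherever it sits between two False values."""
    values = labels.to_numpy(dtype=bool)
    is_false = ~values
    # True from the first False onwards (a False exists at or before this row)
    false_before = np.logical_or.accumulate(is_false)
    # True up to the last False (a False exists at or after this row)
    false_after = np.logical_or.accumulate(is_false[::-1])[::-1]
    enclosed = false_before & false_after
    return pd.Series(values & ~enclosed, index=labels.index, name=labels.name)


# The label column from the question
df = pd.DataFrame(
    {"label_number_hours": [True, True, False, True, False, False, True, False, True, True]}
)
df["label_number_hours"] = clear_true_between_false(df["label_number_hours"])
print(df["label_number_hours"].tolist())
```

If the column contains no False at all, nothing is enclosed and the column is returned unchanged, so no explicit size check is needed.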