How to fill in row value based on previous and subsequent rows in pandas

Time:04-22

I have the following data. It contains consecutive daily runs of True, but occasionally there is an odd False for a single day, followed by True again.

Given the nature of this data, such a False appears to be an error, so I need to replace it with True, but only if the 'sandwiched' False lasts a single day.

For Falses that are not between Trues, I want them to remain the same.

pd.DataFrame({
    'start_datetime' : [
        '2019-04-29 00:00:00','2019-04-30 00:00:00','2019-05-01 00:00:00','2019-05-02 00:00:00','2019-05-03 00:00:00','2019-05-07 00:00:00','2019-05-08 00:00:00','2019-05-09 00:00:00','2019-05-10 00:00:00','2019-05-14 00:00:00'],
    'diag_local_code' : [
        True,
        True,
        True,
        True,
        False,
        True,
        True,
        False,
        False,
        False
    ],
    
})

>>>
    start_datetime      diag_local_code
0   2019-04-29 00:00:00 True
1   2019-04-30 00:00:00 True
2   2019-05-01 00:00:00 True
3   2019-05-02 00:00:00 True
4   2019-05-03 00:00:00 False
5   2019-05-07 00:00:00 True
6   2019-05-08 00:00:00 True
7   2019-05-09 00:00:00 False
8   2019-05-10 00:00:00 False
9   2019-05-14 00:00:00 False

CodePudding user response:

Let's say this DataFrame is df:

col = 'diag_local_code'
# Skip the first and last rows: each has a neighbour on only one side
for i in range(1, len(df) - 1):
    if not df[col].iloc[i] and df[col].iloc[i - 1] and df[col].iloc[i + 1]:
        # Write through df.loc to avoid chained assignment
        df.loc[df.index[i], col] = True

This worked for me!

CodePudding user response:

# Setup

import pandas as pd

df = pd.DataFrame({
    'start_datetime' : [
        '2019-04-29 00:00:00','2019-04-30 00:00:00','2019-05-01 00:00:00','2019-05-02 00:00:00','2019-05-03 00:00:00','2019-05-07 00:00:00','2019-05-08 00:00:00','2019-05-09 00:00:00','2019-05-10 00:00:00','2019-05-14 00:00:00'],
    'diag_local_code' : [
        True,
        True,
        True,
        True,
        False,
        True,
        True,
        False,
        False,
        False
    ],
    
})

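# Shift to get the rows above and below

df['next'] = df['diag_local_code'].shift(-1)
df['previous'] = df['diag_local_code'].shift(1)

# Filter: a False whose previous and next rows are both True
condition = (df['next'] == True) & (df['previous'] == True) & (df['diag_local_code'] == False)
# Assign only to the target column; df[condition] = True would overwrite every column in those rows
df.loc[condition, 'diag_local_code'] = True
# Drop the helper columns
df = df.drop(columns=['next', 'previous'])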
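
The same shift-based idea can be sketched without helper columns. This is only a minimal sketch: the function name `fill_sandwiched_false` is made up for illustration, and like the answers above it treats adjacent rows as adjacent days, so it does not check the gap between dates such as 2019-05-03 and 2019-05-07.

```python
import pandas as pd


def fill_sandwiched_false(flags: pd.Series) -> pd.Series:
    """Return a copy in which a False with True directly above and below becomes True.

    Hypothetical helper for illustration; assumes `flags` is a boolean Series
    and that neighbouring rows represent neighbouring days.
    """
    # fill_value=False keeps the dtype boolean and means the first and last
    # rows are never treated as "sandwiched".
    prev_true = flags.shift(1, fill_value=False)
    next_true = flags.shift(-1, fill_value=False)
    # Keep existing Trues; flip a False only when both neighbours are True.
    return flags | (prev_true & next_true)


df = pd.DataFrame({
    'start_datetime': [
        '2019-04-29 00:00:00', '2019-04-30 00:00:00', '2019-05-01 00:00:00',
        '2019-05-02 00:00:00', '2019-05-03 00:00:00', '2019-05-07 00:00:00',
        '2019-05-08 00:00:00', '2019-05-09 00:00:00', '2019-05-10 00:00:00',
        '2019-05-14 00:00:00',
    ],
    'diag_local_code': [True, True, True, True, False, True, True, False, False, False],
})

df['diag_local_code'] = fill_sandwiched_false(df['diag_local_code'])
print(df)
```

With the sample data, only row 4 changes to True; the trailing run of three Falses is left alone because it is not enclosed by Trues.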