Home > Mobile >  Solved : Updating a DataFrame column using logical comparison
Solved : Updating a DataFrame column using logical comparison

Time:12-20

Based on the suggestions below, I have merged the dataframes into a single DF. Hence changed the title as well. Following is a screen capture of my merged dataframe : enter image description here

I now need to compare Adj Close with the rest of the columns to exactly know where it lies - e.g. is it between PP and R1, R1 and R2, R2 and R3 etc. Based on this, I need to create a new Status column, which would have the relevant statuses. I wrote function as below

def check_status(merged):
if merged['Adj Close'] < merged['R3']:
    merged['Status'] = 'Breached R3'
elif merged['R3'] < merged['Adj Close'] < merged['R2']:
    merged['Status'] = 'Breached R2'
elif merged['R2'] < merged['Adj Close'] < merged['R1']:
    merged['Status'] = 'Breached R1'
elif merged['R1'] < merged['Adj Close'] < merged['PP']:
    merged['Status'] = 'Breached PP on lower side'
elif merged['PP'] < merged['Adj Close'] < merged['S1']:
    merged['Status'] = 'Breached PP on higher side'
elif merged['S1'] < merged['Adj Close'] < merged['S2']:
    merged['Status'] = 'Breached S1'
elif merged['S2'] < merged['Adj Close'] < merged['S3']:
    merged['Status'] = 'Breached S2'
else:
    merged['Status'] = 'Breached S3'
return merged

This gives an error as below :

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Next I tried with np.where(). The code snippet is below :

merged['Status'] = np.where(merged['Adj Close'] < merged['R3'], 'Breached R3', merged['Adj Close'])
merged['Status'] = np.where(merged['R2'] < merged['Adj Close'] < merged['R1'], 'Breached R1', merged['Adj Close'])
merged['Status'] = np.where(merged['R1'] < merged['Adj Close'] < merged['PP'], 'Breached PP on lower side', merged['Adj Close'])
merged['Status'] = np.where(merged['PP'] < merged['Adj Close'] < merged['S1'], 'Breached PP on higher side', merged['Adj Close'])
merged['Status'] = np.where(merged['S1'] < merged['Adj Close'] < merged['S2'], 'Breached S1', merged['Adj Close'])
merged['Status'] = np.where(merged['S2'] < merged['Adj Close'] < merged['S3'], 'Breached S2', merged['Adj Close'])
merged['Status'] = np.where(merged['S3'] < merged['Adj Close'], 'Breached S3', merged['Adj Close'])

However, I get the same error.

What options do I have to get the Status column updated as above ?

I have a large DF, so performance could be a consideration.

CodePudding user response:

Use np.select with whole dataframe.

conditions = [
    df['Adj Close'] < df['R3'],
    (df['R2'] < df['Adj Close']) & (df['Adj Close'] < df['R1']),
    (df['R1'] < df['Adj Close']) & (df['Adj Close'] < df['PP']),
    (df['PP'] < df['Adj Close']) & (df['Adj Close'] < df['S1']),
    (df['S1'] < df['Adj Close']) & (df['Adj Close'] < df['S2']),
    (df['S2'] < df['Adj Close']) & (df['Adj Close'] < df['S3']),
    df['S3'] < df['Adj Close']
]

values = [
    'Breached R3',
    'Breached R1',
    'Breached PP on lower side',
    'Breached PP on higher side',
    'Breached S1',
    'Breached S2',
    'Breached S3'
]

df['Status'] = np.select(conditions, values, df['Adj Close'])

This will significantly speed up your program @Sachin

CodePudding user response:

I solved this using np.where() in a for loop:

for i in merged:
    merged['Status'] = np.where(merged['Adj Close'] > merged['R3'], 'Breached R3', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['R3'] > merged['Adj Close'], merged['Adj Close'] > merged['R2']), 'Breached R2', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['R2'] > merged['Adj Close'], merged['Adj Close'] > merged['R1']), 'Breached R1', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['R1'] > merged['Adj Close'], merged['Adj Close'] > merged['PP']), 'Breached PP on higher side', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['PP'] > merged['Adj Close'], merged['Adj Close'] > merged['S1']), 'Breached PP onn lower side', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['S1'] > merged['Adj Close'], merged['Adj Close'] > merged['S2']), 'Breached S1', merged['Status'])
    merged['Status'] = np.where(np.logical_and(merged['S2'] > merged['Adj Close'], merged['Adj Close'] > merged['S3']), 'Breached S2', merged['Status'])
    merged['Status'] = np.where(merged['S3'] > merged['Adj Close'] , 'Breached S3', merged['Status'])

Thanks a lot for all the suggestions.

  • Related