I have a dataset where I would like to map values based on a specific condition and override the values that are in an existing column.
Data
ID Date Location Used Status
AA Q121 NY 20 ok
AA Q221 NY 50 ok
AA Q321 NY 10 ok
BB Q121 CA 1 ok
BB Q221 CA 0 yes
BB Q321 CA 500 yes
BB Q421 CA 700 no
CC Q121 AZ 50 no
Desired
ID Date Location Used Status
AA Q121 NY closed
AA Q221 NY closed
AA Q321 NY closed
BB Q121 CA 1 ok
BB Q221 CA 0 yes
BB Q321 CA 500 yes
BB Q421 CA 700 no
CC Q121 AZ 50 no
Doing
df['Used'] = np.where(((df.ID == 'AA') & (df.Date >= 'Q121')), '', df['Used'])
df['Status'] = np.where(((df.ID == 'AA') & (df.Date >= 'Q121')), 'closed', '')
The script above replaces 'ok' with 'closed' , however, it also deletes the remaining values in the column. Any suggestion is appreciated.
CodePudding user response:
df.loc[(df['ID'].eq('AA')) & (df['Date'] >= 'Q121'), 'Status'] = 'closed'
CodePudding user response:
This should work -
df.loc[(df.ID == 'AA') & (df.Date >= 'Q121'), 'Used'] = ''
df.loc[(df.ID == 'AA') & (df.Date >= 'Q121'), 'Status'] = 'closed'
CodePudding user response:
Please see the answer using np.where as stated in the question:
d = {
'ID': ['AA', 'AA', 'AA', 'BB', 'BB'],
'Date': ['Q121', 'Q221', 'Q321', 'Q121', 'Q221'],
'Location': ['NY', 'NY', 'NY', 'CA', 'CA'],
'Used': [20, 50, 10, 1, 0],
'Status': ['ok', 'ok', 'ok', 'ok', 'yes']
}
df = pd.DataFrame(d)
idx = np.where((df.ID == 'AA') & (df.Date >= 'Q121'))[0].tolist()
df.loc[idx, 'Used'] = np.nan
df.loc[idx, 'Status'] = 'Closed'
CodePudding user response:
You can use apply for this:
df['Used'], \
df['Status'] = zip(*df.apply(lambda x: ('', 'closed')
if (x.ID == 'AA') and (x.Date >= 'Q121')
else
(x.Used, ''),
axis = 1))