I have a dataset where I would like to map values based on a specific condition. I would like to add a new column and then map a label to an ID if it meets the condition of:
**If ID == AA AND Date >= to Q121: Status = 'closed' AND values within the Used column will be null.**
Data
ID Date Location Used
AA Q121 NY 20
AA Q221 NY 50
AA Q321 NY 10
BB Q121 CA 1
BB Q221 CA 0
BB Q321 CA 500
BB Q421 CA 700
CC Q121 AZ 50
Desired
ID Date Location Used Status
AA Q121 NY closed
AA Q221 NY closed
AA Q321 NY closed
BB Q121 CA 1
BB Q221 CA 0
BB Q321 CA 500
BB Q421 CA 700
CC Q121 AZ 50
Doing
df['Status']=df['ID'].map({'AA':'closed' , })
Is it possible to map using two columns, or would a loop be better fit for this? Any suggestion is appreciated.
CodePudding user response:
You can use np.where
for this:
df['Used'] = np.where(((df.ID == 'AA') & (df.Date >= 'Q121')), '', df['Used'])
df['Status'] = np.where(((df.ID == 'AA') & (df.Date >= 'Q121')), 'closed', '')
# or use np.nan instead of '' above, that's actually preferable (generating NaN values)
print(df)
ID Date Location Used Status
0 AA Q121 NY closed
1 AA Q221 NY closed
2 AA Q321 NY closed
3 BB Q121 CA 1.0
4 BB Q221 CA 0.0
5 BB Q321 CA 500.0
6 BB Q421 CA 700.0
7 CC Q121 AZ 50.0