I have two dataframes like below.
d = {'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'],
'Status': ['Compatible', 'Compatible', 'Not compatible', 'Not compatible', 'Not compatible']}
df1 = pd.DataFrame(data=d)
d2 = {'serial#': ['DDD444', 'EEE555'], 'Model': ['printer', 'printer']}
df2 = pd.DataFrame(data=d2)
print(df1)
print(df2)
serial# Status
0 AAA111 Compatible
1 BBB222 Compatible
2 CCC333 Not compatible
3 DDD444 Not compatible
4 EEE555 Not compatible
serial# Model
0 DDD444 printer
1 EEE555 printer
I need to modify the column 'Status' from df1. For this I need to make a filter using the 'serial#' columns of each of the dataframes. The lines under the 'Status on df1' column will be modified if ' there are items that match on df2 'serial#' column.
Suppose I need to replace data in the filtered rows by the string "reporting", the df1 desired is as below.
serial# Status
0 AAA111 Compatible
1 BBB222 Compatible
2 CCC333 Not compatible
3 DDD444 Reporting
4 EEE555 Reporting
How can I do that? Any help is welcomed.
CodePudding user response:
Use isin
to check serial#
from df2
then use it as boolean mask to set Status
to 'Reporting':
df1.loc[df1['serial#'].isin(df2['serial#']), 'Status'] = 'Reporting'
print(df1)
# Output
serial# Status
0 AAA111 Compatible
1 BBB222 Compatible
2 CCC333 Not compatible
3 DDD444 Reporting
4 EEE555 Reporting
CodePudding user response:
You can perform a left join between both and then a conditional term on Status column:
import pandas as pd
import numpy as np
d = {'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'], 'Status': ['Compatible', 'Compatible', 'Not compatible', 'Not compatible', 'Not compatible']}
df1 = pd.DataFrame(data=d)
d2 = {'serial#': ['DDD444', 'EEE555'], 'Model': ['printer', 'printer']}
df2 = pd.DataFrame(data=d2)
df3 = df1.merge(df2, on='serial#', how='left')
df3['Status'] = np.where(df3['Model'].notnull(), 'Reporting', df3['Status'])
df3 = df3.drop('Model', axis=1)
df3