Contact | Old Contact |
---|---|
234255 | 987778 |
343556 | 987877 |
Missing | 984567 |
Missing | |
Missing | 845665 |
343556 | 789998 |
Given the table above, I wish to go through each row under "Contact" and check if Missing. If the row has Missing, use corresponding "Old Contact" values inplace of the text 'Missing'. If old contact is empty, then leave it as 'Missing'
Desired table:
Contact | Old Contact |
---|---|
234255 | 987778 |
343556 | 987877 |
984567 | 984567 |
Missing | |
845665 | 845665 |
343556 | 789998 |
df['Contact'] = df['Contact'].apply(
lambda x: df['Old Contact'] if "Missing" in x else x)
the line above gives me the whole column of 'Old Contact' where there is Missing. I'm not sure how to use index here to get what I want. Thanks in advance!
CodePudding user response:
use mask
df['Contact'].mask(df['Contact'].eq('Missing'), df['Old Contact'].fillna('Missing'))
output:
0 234255
1 343556
2 984567
3 Missing
4 845665
5 343556
Name: Contact, dtype: object
make reult to Contact
column
CodePudding user response:
I haven't used pandas in a while, so I'm sure there's a better solution, but a brute force method could be:
for idx in df.index:
if (df.iloc[idx]['Contact']=='Missing'):
if len(df.iloc[idx]['Old Contact'].strip()):
df.iloc[idx]['Contact']=df.iloc[idx]['Old Contact']
CodePudding user response:
df = df.assign(
Contact=df["Contact"].where(df["Contact"].ne("Missing"), df["Old Contact"]).fillna("Missing")
).fillna("")
print(df)
Contact Old Contact
0 234255 987778.0
1 343556 987877.0
2 984567.0 984567.0
3 Missing
4 845665.0 845665.0
5 343556 789998.0
CodePudding user response:
You have two conditions you are checking. You can check both conditions in a multiple conditions if statement (https://www.geeksforgeeks.org/check-multiple-conditions-in-if-statement-python/) while looping through each entry in df['Contact']
.
for idx in df.index:
if (df.iloc[idx]['Contact']=='Missing') & (df.iloc[idx]['Old Contact']!=''):
df.iloc[idx]['Contact']=df.iloc[idx]['Old Contact']
elif (df.iloc[idx]['Contact']=='Missing') & (df.iloc[idx]['Old Contact']==''):
df.iloc[idx]['Contact']='Missing'
else: pass
Output:
df
Contact Old Contact
0 234255 987778
1 343556 987877
2 984567 984567
3 Missing
4 845665 845665
5 343556 789998