I have a dataframe as follows (example is simplified):
id    prediction1          prediction2
1234  Cocker_spaniel       german_Shepard
5678  rhodesian_ridgeback  australian_shepard
I need to remove the underscores and make sure the strings are lowercase so I can search them more easily later.
I am not quite sure how to loop through this. As a student, my initial thought is something like the following:
for row in image_predictions['p1']:
    image_predictions['p1'] = image_predictions['p1'].replace('_', ' ')
The code above is meant to replace each underscore with a space, and I believe lowercasing would look similar, using the .lower() method.
Any advice to point me in the right direction?
CodePudding user response:
For in-place modification, you can use:
df.update(df[['prediction1', 'prediction2']]
            .apply(lambda c: c.str.lower()
                              .str.replace('_', ' ', regex=False))
          )
Output:
     id          prediction1         prediction2
0  1234       cocker spaniel      german shepard
1  5678  rhodesian ridgeback  australian shepard
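If you would rather assign the result back than call update, something along these lines should behave the same way. This is only a sketch: the frame is rebuilt by hand from the values in the question, and the `cols` list is a name I made up for illustration.

```python
import pandas as pd

# Example frame rebuilt by hand from the values shown in the question
df = pd.DataFrame({
    "id": [1234, 5678],
    "prediction1": ["Cocker_spaniel", "rhodesian_ridgeback"],
    "prediction2": ["german_Shepard", "australian_shepard"],
})

# Hypothetical helper list: the text columns to clean
cols = ["prediction1", "prediction2"]

# Lowercase each column, then swap underscores for spaces (literal match, no regex)
df[cols] = df[cols].apply(
    lambda c: c.str.lower().str.replace("_", " ", regex=False)
)

print(df)
```

The id column is left alone because it is not in `cols`.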
CodePudding user response:
You can use image_predictions['p1'].apply() to apply a function to each cell of the p1 column:
def myFunction(x):
    return x.replace('_', ' ')
image_predictions['p1'] = image_predictions['p1'].apply(myFunction)
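As a side note on why the loop in the question has no visible effect: as far as I know, Series.replace with its default settings matches whole cell values, not substrings, so a lone '_' never matches a breed name. The .str accessor works inside each string and can do the lowercasing in the same chain. A rough sketch, assuming a one-column frame made up from the question's values:

```python
import pandas as pd

# Hypothetical one-column frame using the 'p1' name from the question
image_predictions = pd.DataFrame(
    {"p1": ["Cocker_spaniel", "rhodesian_ridgeback"]}
)

# Series.replace compares whole values, so no cell equals '_' and nothing changes
unchanged = image_predictions["p1"].replace("_", " ")

# The .str accessor operates on substrings of every cell, so no explicit loop is needed
cleaned = (
    image_predictions["p1"]
    .str.lower()
    .str.replace("_", " ", regex=False)
)

print(unchanged.tolist())
print(cleaned.tolist())
```

Assigning `cleaned` back to image_predictions['p1'] would then update the column.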
CodePudding user response:
I wanted to see whether it was possible to avoid specifying the columns for replacement. This approach builds a dict that maps A -> a, B -> b, etc., and _ -> space, then calls replace with regex=True:
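import string
# Map every uppercase ASCII letter to its lowercase counterpart
replace_dict = dict(zip(string.ascii_uppercase, string.ascii_lowercase))
# Also map the underscore to a space
replace_dict['_'] = ' '
df.replace(replace_dict, regex=True, inplace=True)
print(df)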
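Another way to avoid naming the columns might be to detect the string columns first and then use the .str methods on each one. This is a sketch under the assumption that every text column should be cleaned; the frame is rebuilt from the question's values and `text_cols` is just an illustrative name.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Example frame rebuilt by hand from the values shown in the question
df = pd.DataFrame({
    "id": [1234, 5678],
    "prediction1": ["Cocker_spaniel", "rhodesian_ridgeback"],
    "prediction2": ["german_Shepard", "australian_shepard"],
})

# Collect the columns that hold strings; the integer id column is skipped
text_cols = [col for col in df.columns if is_string_dtype(df[col])]

# Lowercase and replace underscores in each detected column
for col in text_cols:
    df[col] = df[col].str.lower().str.replace("_", " ", regex=False)

print(df)
```

Unlike the dict approach, this only handles whatever str.lower() covers and does not treat the keys as regex patterns.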