I am trying to set the remaining blank location values in the apple category to Others, but I don't want the locations for banana and watermelon to be touched by the assignment. This means that I cannot just convert all the blanks to Others. What's the safe way to do this at scale?
import pandas as pd
data = {'fruit_tag': {0: 'apple', 1: 'apple', 2: 'banana', 3: 'apple', 4: 'watermelon'}, 'location': {0: 'Hong Kong', 1: 'Tokyo', 2: '', 3: '', 4: ''}, 'rating': {0: 'bad', 1: 'good', 2: 'good', 3: 'bad', 4: 'good'}, 'measure_score': {0: 0.9529434442520142, 1: 0.952498733997345, 2: 0.9080725312232971, 3: 0.8847543001174927, 4: 0.8679852485656738}}
df = pd.DataFrame.from_dict(data);df
    fruit_tag   location  rating  measure_score
0       apple  Hong Kong     bad       0.952943
1       apple      Tokyo    good       0.952499
2      banana               good       0.908073
3       apple                bad       0.884754
4  watermelon               good       0.867985
Expected output:
    fruit_tag   location  rating  measure_score
0       apple  Hong Kong     bad       0.952943
1       apple      Tokyo    good       0.952499
2      banana               good       0.908073
3       apple     Others     bad       0.884754
4  watermelon               good       0.867985
CodePudding user response:
Use DataFrame.loc with a boolean mask that combines both conditions:
- fruit_tag is apple
- location is empty
df.loc[(df['fruit_tag'] == 'apple') & (df['location'] == ""), 'location'] = 'Others'
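If the real data is messier than the sample, a slightly more defensive variant of the same mask may help. The sketch below assumes that "blank" can also mean NaN or a whitespace-only string, which the question does not state; the names is_apple and is_blank are only illustrative. It still uses a single vectorized DataFrame.loc assignment, so it should scale to large frames without a Python loop.

```python
import pandas as pd

# Same sample data as in the question (measure_score omitted for brevity).
df = pd.DataFrame({
    'fruit_tag': ['apple', 'apple', 'banana', 'apple', 'watermelon'],
    'location': ['Hong Kong', 'Tokyo', '', '', ''],
    'rating': ['bad', 'good', 'good', 'bad', 'good'],
})

# Condition 1: only rows in the apple category.
is_apple = df['fruit_tag'] == 'apple'

# Condition 2: location is blank.
# Assumption: NaN and whitespace-only strings also count as blank.
is_blank = df['location'].isna() | (df['location'].str.strip() == '')

# Assign only where both conditions hold; banana and watermelon rows are untouched.
df.loc[is_apple & is_blank, 'location'] = 'Others'

print(df)
```

Building the two conditions as named masks keeps each one easy to inspect on its own (for example with is_blank.sum()) before the assignment is applied.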