Replace specific values in a pandas column with a value determined by a function

Time:05-25

I have data as follows:

import pandas as pd
import country_converter as coco
cc = coco.CountryConverter()

example = pd.DataFrame.from_dict({'Country': {0: 'Fiji', 1: 'Tanzania', 2: 'W. Sahara', 3: 'Canada', 4: 'United States of America', 5: 'Kazakhstan', 6: 'Uzbekistan', 7: 'Papua New Guinea', 8: 'Indonesia', 9: 'Argentina', 10: 'Chile', 11: 'Dem. Rep. Congo', 12: 'Somalia', 13: 'Kenya', 14: 'Sudan', 15: 'Chad', 16: 'Haiti', 17: 'Dominican Rep.', 18: 'Russia', 19: 'Bahamas', 20: 'Falkland Is.', 21: 'Norway'}, 'iso': {0: 'FJI', 1: 'TZA', 2: 'ESH', 3: 'CAN', 4: 'USA', 5: 'KAZ', 6: 'UZB', 7: 'PNG', 8: 'IDN', 9: 'ARG', 10: 'CHL', 11: 'COD', 12: 'SOM', 13: 'KEN', 14: 'SDN', 15: 'TCD', 16: 'HTI', 17: 'DOM', 18: 'RUS', 19: 'BHS', 20: 'FLK', 21: '-99'}}
)

                     Country  iso
0                       Fiji  FJI
1                   Tanzania  TZA
2                  W. Sahara  ESH
3                     Canada  CAN
4   United States of America  USA
5                 Kazakhstan  KAZ
6                 Uzbekistan  UZB
7           Papua New Guinea  PNG
8                  Indonesia  IDN
9                  Argentina  ARG
10                     Chile  CHL
11           Dem. Rep. Congo  COD
12                   Somalia  SOM
13                     Kenya  KEN
14                     Sudan  SDN
15                      Chad  TCD
16                     Haiti  HTI
17            Dominican Rep.  DOM
18                    Russia  RUS
19                   Bahamas  BHS
20              Falkland Is.  FLK
21                    Norway  -99

I would like Python to attempt:

example['iso'] = cc.convert(names = example['Country'], to = 'ISO3')

but ONLY for rows where the value of iso is '-99'.

I saw this solution, so I attempted:

example = example.assign(col = [(cc.convert(names = example['Country'], to = 'ISO3')) if iso = '-99' else (example['iso']) for iso in example['iso']])

But that is not the right syntax.

Could someone help me out?

CodePudding user response:

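# Rows whose code is the '-99' placeholder
condition = example['iso'].eq('-99')
# Select rows and column in one .loc call to avoid chained assignment;
# pass to= by keyword, since the second positional argument of convert is src
example.loc[condition, 'iso'] = example.loc[condition, 'Country'].apply(lambda x: cc.convert(names=x, to='ISO3'))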
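To try the masked assignment without installing country_converter, here is a minimal sketch. The `to_iso3` function and its `LOOKUP` dict are hypothetical stand-ins for `cc.convert(names=..., to='ISO3')`, and the three rows are a subset of the question's data.

```python
import pandas as pd

# Hypothetical stand-in for cc.convert(names=..., to='ISO3'):
# a small dict lookup replaces the real country_converter call.
LOOKUP = {"Norway": "NOR"}

def to_iso3(name):
    return LOOKUP.get(name, "not found")

df = pd.DataFrame({
    "Country": ["Fiji", "Canada", "Norway"],
    "iso": ["FJI", "CAN", "-99"],
})

# Boolean mask marking the rows whose code is the '-99' placeholder
mask = df["iso"].eq("-99")

# One .loc call selects rows and column together, so the assignment
# writes into df itself rather than into a temporary copy
df.loc[mask, "iso"] = df.loc[mask, "Country"].apply(to_iso3)

print(df["iso"].tolist())  # ['FJI', 'CAN', 'NOR']
```

Only the Norway row is passed to the function; the other codes are left untouched.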

CodePudding user response:

def f(row):
    # Convert only rows flagged with the '-99' placeholder; keep the rest as-is
    if row['iso'] == '-99':
        return cc.convert(names=row['Country'].strip(), to='ISO3')
    return row['iso']

example['iso'] = example.apply(f, axis=1)

If you want to be more computationally efficient, you could apply f only to the flagged rows:

# Compare as strings: int() would raise on codes such as 'FJI'
idx = example['iso'] == '-99'
example.loc[idx, 'iso'] = example[idx].apply(f, axis=1)
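If the same country name is flagged many times, the lookup can also be run once per distinct name and mapped back. This is only a sketch: `to_iso3` is a hypothetical stand-in for `cc.convert(names=name, to='ISO3')`, and the repeated Norway rows are made up for illustration.

```python
import pandas as pd

calls = []  # records every name passed to the stand-in converter

def to_iso3(name):
    # Hypothetical stand-in for cc.convert(names=name, to='ISO3')
    calls.append(name)
    return {"Norway": "NOR", "Kenya": "KEN"}.get(name, "not found")

df = pd.DataFrame({
    "Country": ["Norway", "Kenya", "Norway", "Norway"],
    "iso": ["-99", "KEN", "-99", "-99"],
})

mask = df["iso"] == "-99"

# Convert each distinct flagged name once, then map the results back
codes = {name: to_iso3(name) for name in df.loc[mask, "Country"].unique()}
df.loc[mask, "iso"] = df.loc[mask, "Country"].map(codes)

print(df["iso"].tolist())  # ['NOR', 'KEN', 'NOR', 'NOR']
print(calls)               # ['Norway']
```

Three rows are filled in, but the converter is called a single time.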

CodePudding user response:

I would use np.select for this particular problem

import pandas as pd
import numpy as np

example = pd.DataFrame.from_dict({'Country': {0: 'Fiji', 1: 'Tanzania', 2: 'W. Sahara', 3: 'Canada', 4: 'United States of America', 5: 'Kazakhstan', 6: 'Uzbekistan', 7: 'Papua New Guinea', 8: 'Indonesia', 9: 'Argentina', 10: 'Chile', 11: 'Dem. Rep. Congo', 12: 'Somalia', 13: 'Kenya', 14: 'Sudan', 15: 'Chad', 16: 'Haiti', 17: 'Dominican Rep.', 18: 'Russia', 19: 'Bahamas', 20: 'Falkland Is.', 21: 'Norway'}, 'iso': {0: 'FJI', 1: 'TZA', 2: 'ESH', 3: 'CAN', 4: 'USA', 5: 'KAZ', 6: 'UZB', 7: 'PNG', 8: 'IDN', 9: 'ARG', 10: 'CHL', 11: 'COD', 12: 'SOM', 13: 'KEN', 14: 'SDN', 15: 'TCD', 16: 'HTI', 17: 'DOM', 18: 'RUS', 19: 'BHS', 20: 'FLK', 21: '-99'}}
)
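df = example.copy()
# cc is the CountryConverter instance created in the question.
# Note that this converts every country name, not only the flagged rows.
converted = cc.convert(names=df['Country'].tolist(), to='ISO3')
condition_list = [df['iso'] == '-99']
choice_list = [converted]
# If the condition is met, use the converted code; otherwise keep the existing value
df['iso'] = np.select(condition_list, choice_list, df['iso'])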
df
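For reference, here is a small sketch of how np.select picks values: where a condition is True it takes the matching element from the corresponding choice array, and everywhere else it takes the default. The `converted` array is illustrative data standing in for codes that have already been looked up.

```python
import numpy as np

iso = np.array(["FJI", "-99", "CAN"])
# Illustrative data: pretend these codes came from a conversion step
converted = np.array(["FJI", "NOR", "CAN"])

# Take converted[i] where iso[i] == '-99', otherwise fall back to iso[i]
result = np.select([iso == "-99"], [converted], default=iso)

print(result.tolist())  # ['FJI', 'NOR', 'CAN']
```

The choice has to be an array of per-row values; a single constant string as the choice would be written into every flagged row unchanged.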