Change values in lists that are in pandas column

Time:12-04

I have a dataset in which one column contains lists of already tokenized words. I need to replace a couple of values in these lists.

Initial data set:

df
date          text
2022-06-02    ['municipal', 'districts', 'mikhailovsky', '84', 'kamyshinsky', '56']
...

Required result:

df_res
date          text
2022-06-02    ['municipal', 'districts', 'mikhailovka', '84', 'kamyshin', '56']
...

Is there an easy way to replace these list elements in every row of the column?

Answer:

import pandas as pd

# Three identical sample rows: a date plus a list of tokens
tokens = ['municipal', 'districts', 'mikhailovsky', '84', 'kamyshinsky', '56']
df = pd.DataFrame([['2022-06-02', list(tokens)] for _ in range(3)], columns=['date', 'text'])

# Old value -> new value
mapper = {'mikhailovsky': 'mikhailovka',
          'kamyshinsky': 'kamyshin'}

# For each pair, rebuild every list with the old value replaced.
# Note: str.replace substitutes substrings, so tokens that merely contain a key also change.
for k, v in mapper.items():
    df['text'] = df['text'].apply(lambda x: [element.replace(k, v) for element in x])

The code above changes df from this:

         date                                                       text
0  2022-06-02  [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]
1  2022-06-02  [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]
2  2022-06-02  [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]

into this:

         date                                                   text
0  2022-06-02  [municipal, districts, mikhailovka, 84, kamyshin, 56]
1  2022-06-02  [municipal, districts, mikhailovka, 84, kamyshin, 56]
2  2022-06-02  [municipal, districts, mikhailovka, 84, kamyshin, 56]
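One caveat: str.replace works on substrings, so a token that only contains one of the keys would be altered as well. If only whole tokens should be swapped, a dictionary lookup in a single pass is one possible variant. The sketch below is not part of the original answer; the helper name replace_tokens and the second sample row are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2022-06-02", "2022-06-03"],
        "text": [
            ["municipal", "districts", "mikhailovsky", "84", "kamyshinsky", "56"],
            # Hypothetical row: the first token only *contains* a key
            ["kamyshinsky-north", "mikhailovsky"],
        ],
    }
)

mapper = {"mikhailovsky": "mikhailovka", "kamyshinsky": "kamyshin"}


def replace_tokens(tokens, mapping):
    """Return a new list; tokens that exactly equal a key are swapped, others kept."""
    return [mapping.get(token, token) for token in tokens]


# One pass over the column, regardless of how many pairs the mapping holds
df["text"] = df["text"].apply(replace_tokens, args=(mapper,))

print(df["text"].tolist())
```

With this variant 'kamyshinsky-north' stays unchanged, whereas the str.replace loop would turn it into 'kamyshin-north'. Which behaviour is right depends on the data.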