I have a dataset in which one column contains lists of previously tokenized words. I need to replace a couple of values inside these lists.
Initial data set:
df
date text
2022-06-02 ['municipal', 'districts', 'mikhailovsky', '84', 'kamyshinsky', '56']
...
Required result:
df_res
date text
2022-06-02 ['municipal', 'districts', 'mikhailovka', '84', 'kamyshin', '56']
...
Is there an easy way to replace these list elements in every row of the column?
CodePudding user response:
import pandas as pd

tokens = ['municipal', 'districts', 'mikhailovsky', '84', 'kamyshinsky', '56']
df = pd.DataFrame([['2022-06-02', list(tokens)] for _ in range(3)], columns=['date', 'text'])
mapper = {'mikhailovsky': 'mikhailovka',
          'kamyshinsky': 'kamyshin'}
# Apply each old -> new replacement to every token in every row's list.
for k, v in mapper.items():
    df['text'] = df['text'].apply(lambda x: [element.replace(k, v) for element in x])
The code above changes df from this:
date text
0 2022-06-02 [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]
1 2022-06-02 [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]
2 2022-06-02 [municipal, districts, mikhailovsky, 84, kamyshinsky, 56]
into this:
date text
0 2022-06-02 [municipal, districts, mikhailovka, 84, kamyshin, 56]
1 2022-06-02 [municipal, districts, mikhailovka, 84, kamyshin, 56]
2 2022-06-02 [municipal, districts, mikhailovka, 84, kamyshin, 56]
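Note that str.replace matches substrings, so a token that merely contains one of the keys would also be altered. If only whole tokens should be replaced, a dictionary lookup per token is one possible alternative. The following is a minimal sketch under that assumption; the helper name replace_tokens and the second sample row (including the token 'mikhailovskyi') are made up for illustration and are not from the question:

```python
import pandas as pd

# Small sample frame; the second row is hypothetical and only there to
# show that near-matches are left alone.
df = pd.DataFrame(
    {
        "date": ["2022-06-02", "2022-06-03"],
        "text": [
            ["municipal", "districts", "mikhailovsky", "84", "kamyshinsky", "56"],
            ["kamyshinsky", "district", "mikhailovskyi"],
        ],
    }
)

mapper = {"mikhailovsky": "mikhailovka", "kamyshinsky": "kamyshin"}


def replace_tokens(tokens, mapping):
    """Return a new list where tokens equal to a key are swapped for its value."""
    return [mapping.get(token, token) for token in tokens]


# One pass over the column; tokens not present in the mapping are kept as-is.
df["text"] = df["text"].apply(lambda tokens: replace_tokens(tokens, mapper))
print(df)
```

This also makes a single pass over the column rather than one pass per mapping entry, which may matter when the mapping is large.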