Is it possible to change cell values in a Pandas DataFrame using a dictionary by iterating over the list in each cell?

UPDATED

In a Pandas DataFrame I have a column whose cells contain a list like the one below:

df_lost['Article']

out[6]:

37774    186-2, 185-3, 185-2
37850           358-1, 358-4
37927                       
38266                  111-2
38409                  111-2
38508                       
38519                  185-1
41161           185-4, 357-1
42948                  185-1
Name: Article, dtype: object

For each entry such as '182-2', '111-2', etc., I have a dictionary like

aDict = {'111-2': 'Text-1', '358-1': 'Text-2'.....}

Is it possible to iterate over the list in each cell and replace every entry with the value stored under that key in the dictionary?

Expected result:

 37774    ['Text 1, Text 2, Text -5']
....

I have tried to use the map function

df['Article'] = df['Article'].map(aDict)

but it doesn't work when a cell holds a list of several entries. As a temporary workaround, I created a dictionary keyed by the whole cell content:

aDict = {'186-2, 185-3, 185-2': 'Test - 1, test -2, test -3'.....}

This works, but the number of combinations is extremely large.

CodePudding user response：

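You need to split the string at the comma delimiters, and then look up each element in the dictionary. You also have to index the list to get the string out of its first element, and wrap the resulting string back into a list.

def convert_string(string_list, mapping):
    # The cell is a one-element list: take the string out and split it into codes
    items = string_list[0].split(', ')
    # Look up each code; keep it unchanged if the dictionary has no entry for it
    new_items = [mapping.get(i, i) for i in items]
    # Wrap the joined result back into a list
    return [', '.join(new_items)]

# map() passes only the cell, so supply the dictionary through a lambda
df['Article'] = df['Article'].map(lambda cell: convert_string(cell, aDict))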
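The column printed in the question looks like it may hold plain comma-separated strings rather than one-element lists. If so, the same split, look up and join idea applies without the list indexing. Below is a minimal sketch under that assumption; the sample frame and the helper name translate_cell are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; cells are plain strings.
df = pd.DataFrame(
    {"Article": ["186-2, 185-3, 185-2", "358-1, 358-4", "", "111-2"]},
    index=[37774, 37850, 37927, 38266],
)
aDict = {"111-2": "Text-1", "358-1": "Text-2"}


def translate_cell(cell, mapping):
    """Replace each comma-separated code in a string cell via the mapping."""
    if not isinstance(cell, str) or not cell:
        return cell  # leave empty or missing cells untouched
    codes = [code.strip() for code in cell.split(",")]
    # Codes without a dictionary entry are kept as they are.
    return ", ".join(mapping.get(code, code) for code in codes)


df["Article"] = df["Article"].map(lambda cell: translate_cell(cell, aDict))
print(df)
```

Using mapping.get(code, code) means unknown codes pass through unchanged instead of becoming None, which is usually what you want when the dictionary is incomplete.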

CodePudding user response：

I would use a regex and str.replace here:

aDict = {'111-2': 'Text1', '358-1': 'Text 2'}

import re
pattern = '|'.join(map(re.escape, aDict))

df['Article'] = df['Article'].str.replace(pattern, lambda m: aDict[m.group()], regex=True)

NB. If the dictionary keys can overlap (ab/abc), then they should be sorted by decreasing length to generate the pattern.

Output:


                   Article
37774  186-2, 185-3, 185-2
37850        Text 2, 358-4
37927                     
38266                Text1
38409                Text1
38508                     
38519                185-1
41161         185-4, 357-1
42948                185-1
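The note about overlapping keys can be sketched as follows. The dictionary and the Series here are hypothetical and chosen only so that one key is a prefix of another; sorting the keys by decreasing length makes the regex alternation try the longer key first.

```python
import re

import pandas as pd

# Hypothetical overlapping keys: '111-2' is a prefix of '111-22'.
aDict = {"111-2": "Text1", "111-22": "Text9"}

# Longest keys first, so the alternation tries '111-22' before '111-2'.
keys = sorted(aDict, key=len, reverse=True)
pattern = "|".join(map(re.escape, keys))

s = pd.Series(["111-22, 111-2", "185-1"])
result = s.str.replace(pattern, lambda m: aDict[m.group()], regex=True)
print(result.tolist())
```

Without the sort, the pattern would match '111-2' inside '111-22' and leave a stray '2' behind the replacement text.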