UPDATED
I have a pandas DataFrame with a column whose cells contain a list like the one below:
df_lost['Article']
out[6]:
37774 186-2, 185-3, 185-2
37850 358-1, 358-4
37927
38266 111-2
38409 111-2
38508
38519 185-1
41161 185-4, 357-1
42948 185-1
Name: Article, dtype: object
For each entry like '182-2', '111-2', etc., I have a dictionary like
aDict = {'111-2': 'Text-1', '358-1': 'Text-2', ...}
Is it possible to iterate over the list in each df cell and replace every entry with the value stored under that key in the dictionary?
Expected result:
37774 ['Text 1, Text 2, Text -5']
....
I have tried to use the map function
df['Article'] = df['Article'].map(aDict)
but it doesn't work with a list in a cell. As a temporary solution, I created a dictionary keyed on the whole cell contents:
aDict = {'186-2, 185-3, 185-2': 'Test - 1, test -2, test -3', ...}
This works, but the number of possible combinations is extremely large.
CodePudding user response:
You need to split the string at the comma delimiters, and then look up each element in the dictionary. You also have to index the list to get the string out of the first element, and wrap the result string back into a list.
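def convert_string(string_list, mapping):
    # Each cell is assumed to be a one-element list holding a comma-separated string
    items = string_list[0].split(', ')
    # Keep the original code when it has no entry in the mapping
    new_items = [mapping.get(i, i) for i in items]
    return [', '.join(new_items)]

# map() passes only the cell value, so supply the dictionary through a lambda
df['Article'] = df['Article'].map(lambda cell: convert_string(cell, aDict))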
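If the cells hold plain comma-separated strings rather than one-element lists (the dtype: object output in the question is consistent with either), a variant without the indexing might look like the sketch below. The helper name convert_cell is hypothetical, the mapping only contains the two sample entries from the question, and empty cells are assumed to be empty strings or missing values that should be left alone.

```python
def convert_cell(cell, mapping):
    """Translate each comma-separated code in a plain-string cell."""
    # Assumption: empty or missing cells should pass through unchanged
    if not isinstance(cell, str) or not cell.strip():
        return cell
    items = [item.strip() for item in cell.split(',')]
    # Codes without a dictionary entry are kept as they are
    return ', '.join(mapping.get(item, item) for item in items)

# Illustrative mapping built from the question's sample values
aDict = {'111-2': 'Text-1', '358-1': 'Text-2'}

converted = [convert_cell(c, aDict) for c in ['358-1, 358-4', '', '111-2']]
print(converted)

# With pandas, the same helper would be applied per cell:
# df['Article'] = df['Article'].map(lambda cell: convert_cell(cell, aDict))
```

Splitting on ',' and stripping each item makes the helper tolerant of inconsistent spacing after the commas.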
CodePudding user response:
I would use a regex with str.replace here:
aDict = {'111-2': 'Text1', '358-1': 'Text 2'}
import re
pattern = '|'.join(map(re.escape, aDict))
df['Article'] = df['Article'].str.replace(pattern, lambda m: aDict[m.group()], regex=True)
NB. If the dictionary keys can overlap (ab/abc), then they should be sorted by decreasing length to generate the pattern.
Output:
Article
37774 186-2, 185-3, 185-2
37850 Text 2, 358-4
37927
38266 Text1
38409 Text1
38508
38519 185-1
41161 185-4, 357-1
42948 185-1
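The note about overlapping keys can be illustrated with a small sketch. The dictionary below is hypothetical (it reuses the ab/abc example from the note), and plain re.sub stands in for the pandas call; the resulting pattern could be passed to str.replace in the same way as above.

```python
import re

# Hypothetical mapping with overlapping keys, for illustration only
overlapping = {'ab': 'short', 'abc': 'long'}

# Longest keys first, so 'abc' is tried before its prefix 'ab'
keys = sorted(overlapping, key=len, reverse=True)
pattern = '|'.join(map(re.escape, keys))

result = re.sub(pattern, lambda m: overlapping[m.group()], 'abc, ab')
print(result)
```

Regex alternation picks the first alternative that matches at a given position, so without the sort 'ab' would match inside 'abc' and leave a stray 'c' behind.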