I have two columns, cat_1 and cat_2, that have gibberish for values in them. This is because I will be referencing a python library to derive info from those gibberish values.
If a list has more than 1 element in a row, whether in cat_1 or cat_2, I want to go ahead and fill the row of a new column, results_1 or results_2, with its values separated by a comma. Examples shown below.
How can I get from cat_1 and cat_2... to results_1 and results_2?
CodePudding user response:
You need some kind of mapper dictionary that tell us which gibberish word is which English word. Something like:
mapper = {'klys': 'apple', 'chrs': 'orange', 'pqor': 'bag'}
Given such a mapper, you can:
df[['results_1','results_2']] = df[['cat_1','cat_2']].stack().str.split(', ').explode().map(mapper).fillna('').groupby(level=[0,1]).apply(lambda x: list(x) if len(x)>1 else (x.values[0] if len(x)==1 else '')).unstack()