Column with list values, if list has more than 1 element, decode both values and include in a list

I have two columns, cat_1 and cat_2, whose values are gibberish codes. I will be passing those codes to a Python library to derive information from them.

If a cell in cat_1 or cat_2 holds a list with more than one element, I want the matching row of a new column, results_1 or results_2, to contain the decoded values separated by a comma. Examples shown below.

How can I get from cat_1 and cat_2... to results_1 and results_2?

CodePudding user response:

You need some kind of mapper dictionary that tells us which gibberish word corresponds to which English word. Something like:

mapper = {'klys': 'apple', 'chrs': 'orange', 'pqor': 'bag'}

Given such a mapper, you can:

df[['results_1','results_2']] = (
    df[['cat_1','cat_2']]
    .stack()                 # one entry per (row, column)
    .str.split(', ')
    .explode()               # one code per entry
    .map(mapper)
    .fillna('')              # unknown codes become ''
    .groupby(level=[0,1])
    .apply(lambda x: list(x) if len(x)>1 else (x.values[0] if len(x)==1 else ''))
    .unstack()
)
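
The chained expression above can be hard to follow, so here is a more explicit sketch of the same idea. The sample rows are hypothetical (the original examples are not available), the helper name decode is made up for illustration, and the code assumes every cell is a string of codes separated by ', ' with no missing values.

```python
import pandas as pd

# Hypothetical sample data: these rows are assumptions for illustration only.
mapper = {'klys': 'apple', 'chrs': 'orange', 'pqor': 'bag'}
df = pd.DataFrame({
    'cat_1': ['klys', 'klys, chrs', 'pqor'],
    'cat_2': ['chrs, pqor', 'pqor', 'klys'],
})

def decode(cell):
    """Translate each comma-separated code; unknown codes become ''."""
    words = [mapper.get(code, '') for code in cell.split(', ')]
    # Keep a list when there are several values, a bare string when there is one.
    return words if len(words) > 1 else words[0]

# Apply the helper column by column, writing into the new result columns.
for src, dst in [('cat_1', 'results_1'), ('cat_2', 'results_2')]:
    df[dst] = df[src].map(decode)

print(df)
```

If a single comma-separated string is preferred over a list, returning ', '.join(words) from decode should give that instead. If the cells hold actual Python lists rather than strings, the split step would need to be dropped.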