Mapping a dictionary to a pandas column when the dictionary values are lists

Considering this sample df:

df = pd.DataFrame({'num': [1, 2, 3], 'colors': ['red', 'green', 'blue'], 'letters': ['person1, person2', '', '']})

   num  colors  letters
0   1   red     person1, person2
1   2   green   
2   3   blue

I usually use .map to map dictionary values onto an existing or new column, but this case has a twist. Here is the dictionary I want to map to the 'letters' column, applying it only to rows where the column value is an empty string.

dict = {'red':['person1','person2'], 'green':['person3'], 'blue':['person5','person6']}

The desired result is:

   num  colors  letters
0   1   red     person1, person2
1   2   green   person3
2   3   blue    person5, person6

I tried several variations on .map, ending with the one below, but I still don't get a single string, or a string containing both list values, filled in only where the value is empty.

df.loc[(df.letters== ''),'letters']=df.letters.map(lambda x: dict[x][1] if x in dict else '')

I suspect someone sharp with dictionaries and pandas has run into this before. I just cannot think my way past these .map attempts. Thanks for taking a look.

CodePudding user response：

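Try this example (I've renamed dict to dct, because dict shadows the Python builtin):

dct = {'red': ['person1', 'person2'], 'green': ['person3'], 'blue': ['person5', 'person6']}  # the question's dictionary, renamed
dct = {k: ", ".join(v) for k, v in dct.items()}  # turn each list into one "a, b" string
m = df.letters.eq("")  # rows whose letters value is empty
df.loc[m, "letters"] = df.loc[m, "colors"].map(dct)  # fill only those rows, keyed by color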

print(df)

Prints:

   num colors           letters
0    1    red  person1, person2
1    2  green           person3
2    3   blue  person5, person6
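
For reference, the attempt in the question looks up each letters value in the dictionary, but the keys are colors, so the lookup never matches; it also indexes [1], which would pick a single list element. Below is a minimal, self-contained sketch of the mask-and-map approach wrapped in a function. The name fill_empty_from_map and its parameters are hypothetical, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "num": [1, 2, 3],
    "colors": ["red", "green", "blue"],
    "letters": ["person1, person2", "", ""],
})
dct = {"red": ["person1", "person2"], "green": ["person3"], "blue": ["person5", "person6"]}


def fill_empty_from_map(frame, key_col, target_col, mapping, sep=", "):
    """Return a copy of frame where empty strings in target_col are filled
    from mapping, looked up by key_col. (Hypothetical helper.)"""
    out = frame.copy()
    # Join each list of names into a single string such as "a, b".
    joined = {key: sep.join(values) for key, values in mapping.items()}
    # Boolean mask selecting only the rows that still need a value.
    empty = out[target_col].eq("")
    out.loc[empty, target_col] = out.loc[empty, key_col].map(joined)
    return out


result = fill_empty_from_map(df, "colors", "letters", dct)
print(result)
```

One thing to watch: a color that is missing from the mapping comes back from .map as NaN, so those rows would end up as NaN rather than staying empty.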

CodePudding user response：

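Another option is to replace the empty strings with NaN, map the dictionary to the colors column, then use fillna (dict_ here stands for the question's dictionary, renamed so it doesn't shadow the builtin):

import numpy as np

df = df.replace('', np.nan)
df['letters'] = df.letters.fillna(df.colors.map(dict_))

(You'd still need to transform the lists from the dictionary into strings. Also note that df.replace('', np.nan) touches every column, not just letters.)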
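
A possible complete version of this idea, as a sketch: it joins the lists first and applies replace to the letters column only, so other columns are left alone. The variable names joined and letters are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "num": [1, 2, 3],
    "colors": ["red", "green", "blue"],
    "letters": ["person1, person2", "", ""],
})
dict_ = {"red": ["person1", "person2"], "green": ["person3"], "blue": ["person5", "person6"]}

# Turn each list into one comma-separated string before mapping.
joined = {color: ", ".join(names) for color, names in dict_.items()}

# Mark empty strings as missing, in the letters column only.
letters = df["letters"].replace("", np.nan)

# fillna aligns on the index, so each missing row gets its own color's value.
df["letters"] = letters.fillna(df["colors"].map(joined))
print(df)
```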

CodePudding user response：

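Just map and use str.join to get rid of the square brackets. Note that this overwrites every row of letters, not only the empty ones; with the sample data the result is the same.

df = df.assign(letters=df['colors'].map(dict).str.join(', '))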



   num colors           letters
0    1    red  person1, person2
1    2  green           person3
2    3   blue  person5, person6
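
If existing non-empty values should be kept, the same map plus str.join idea can be combined with Series.mask so that only the empty rows are replaced. This is a sketch under that assumption, with the dictionary renamed to dct to avoid shadowing the builtin.

```python
import pandas as pd

df = pd.DataFrame({
    "num": [1, 2, 3],
    "colors": ["red", "green", "blue"],
    "letters": ["person1, person2", "", ""],
})
dct = {"red": ["person1", "person2"], "green": ["person3"], "blue": ["person5", "person6"]}

# Map each color to its list, then join every list into one string.
mapped = df["colors"].map(dct).str.join(", ")

# mask replaces values where the condition is True and keeps the rest.
df["letters"] = df["letters"].mask(df["letters"].eq(""), mapped)
print(df)
```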