Convert dictionaries in pandas dataframe to list

Time:01-03

I have a dataframe

fruit1                   fruit2
[banana,apple,orange]    [apple,nuts,strawberry]
[apple,mango,grape]      [apple,mango,grape,guava]



My code for adding the two additional columns is


df["fruits_added"] = df.apply(lambda row: set(row.fruit2) - set(row.fruit1), axis=1)
df["fruits_deleted"] = df.apply(lambda row: set(row.fruit1) - set(row.fruit2), axis=1)

My desired output is

fruit1                   fruit2                       fruits_added         fruits_deleted
[banana,apple,orange]    [apple,nuts,strawberry]      [strawberry,nuts]    [banana,orange]
[apple,mango,grape]      [apple,mango,grape,guava]    [guava]              []



but I am getting sets (displayed in curly braces, like dictionaries) instead of lists

fruit1                   fruit2                       fruits_added         fruits_deleted
[banana,apple,orange]    [apple,nuts,strawberry]      {strawberry,nuts}    {banana,orange}
[apple,mango,grape]      [apple,mango,grape,guava]    {guava}              {}



Any input is appreciated

CodePudding user response:

Convert sets to lists:

df["fruits_added"] = df.apply(lambda row: list(set(row.fruit2) - set(row.fruit1)), axis=1)
df["fruits_deleted"] = df.apply(lambda row: list(set(row.fruit1) - set(row.fruit2)), axis=1)
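
For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is rebuilt from the question's data, and swapping list() for sorted() is my own assumption rather than part of the answer: sets carry no guaranteed order, so sorted() returns a list with a predictable element order.

```python
import pandas as pd

# Sample data mirroring the frame in the question.
df = pd.DataFrame({
    "fruit1": [["banana", "apple", "orange"], ["apple", "mango", "grape"]],
    "fruit2": [["apple", "nuts", "strawberry"], ["apple", "mango", "grape", "guava"]],
})

# sorted() turns each set difference into a list with a stable order
# (assumption: alphabetical order is acceptable; use list() if order is irrelevant).
df["fruits_added"] = df.apply(lambda row: sorted(set(row.fruit2) - set(row.fruit1)), axis=1)
df["fruits_deleted"] = df.apply(lambda row: sorted(set(row.fruit1) - set(row.fruit2)), axis=1)

print(df[["fruits_added", "fruits_deleted"]])
```

Each cell in the two new columns is then a plain Python list, with an empty list where nothing was added or deleted.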

Alternative solution:

# zip() returns a one-shot iterator, so materialise it before using it twice
zipped = list(zip(df['fruit1'], df['fruit2']))
df["fruits_added"] = [list(set(y) - set(x)) for x, y in zipped]
df["fruits_deleted"] = [list(set(x) - set(y)) for x, y in zipped]
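
One detail worth checking with the zip-based variant: in Python 3 a zip object can be iterated only once, so the pairs have to be materialised before two comprehensions can share them. The sketch below uses plain lists in place of the DataFrame columns, and the variable names are only illustrative.

```python
fruit1 = [["banana", "apple", "orange"], ["apple", "mango", "grape"]]
fruit2 = [["apple", "nuts", "strawberry"], ["apple", "mango", "grape", "guava"]]

# A zip object is a one-shot iterator: the first pass consumes it.
lazy_pairs = zip(fruit1, fruit2)
first_pass = [sorted(set(y) - set(x)) for x, y in lazy_pairs]
second_pass = [sorted(set(x) - set(y)) for x, y in lazy_pairs]  # nothing left to iterate

# Materialising the pairs with list() lets both comprehensions see every row.
pairs = list(zip(fruit1, fruit2))
fruits_added = [sorted(set(y) - set(x)) for x, y in pairs]
fruits_deleted = [sorted(set(x) - set(y)) for x, y in pairs]

print(second_pass)     # []
print(fruits_added)    # [['nuts', 'strawberry'], ['guava']]
print(fruits_deleted)  # [['banana', 'orange'], []]
```

Assigning an empty list like second_pass to a column of a two-row frame would fail with a length mismatch, which is why the list() call matters.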

CodePudding user response:

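You can use np.setdiff1d. It returns a sorted NumPy array rather than a list, so add .tolist() if you need plain lists:

import numpy as np

df['fruits_deleted'] = df.apply(lambda x: np.setdiff1d(x.fruit1, x.fruit2).tolist(), axis=1)
df['fruits_added'] = df.apply(lambda x: np.setdiff1d(x.fruit2, x.fruit1).tolist(), axis=1)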
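
To see what np.setdiff1d hands back, here is a small sketch on a single pair of lists taken from the question's first row. The result is a sorted NumPy array of the unique values in the first argument that are missing from the second, and .tolist() converts it to a plain list.

```python
import numpy as np

fruit1 = ["banana", "apple", "orange"]
fruit2 = ["apple", "nuts", "strawberry"]

# Values in fruit1 that do not appear in fruit2, sorted and de-duplicated.
deleted = np.setdiff1d(fruit1, fruit2)
print(type(deleted).__name__)  # ndarray
print(deleted.tolist())        # ['banana', 'orange']

# When nothing is left over, the result is an empty array.
empty = np.setdiff1d(["apple"], ["apple", "guava"])
print(empty.tolist())          # []
```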