How to perform operation over each dictionary in column?

Time:11-16

I have a DataFrame column that contains dictionaries:

col1
{10:24, 7:3}
{5:24, 1:2, 7:8}
{1:1}

How can I extract the keys from the dictionary in each row? I need to get:

col1
10, 7
5, 1, 7
1

How can I do that? df["col1"] = df["col1"].keys() doesn't work, and I don't know why.

CodePudding user response:

df['col1'] is a Series, and Series.keys() returns the Series' own index, not the keys of the dictionaries stored in its cells.

But you can use .apply() to run a function on every element in the column separately.
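To see why the original attempt does not do what was intended, here is a minimal sketch (reusing the example data from the question) of what Series.keys() actually returns. The variable name index_labels is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# Series.keys() is an alias for the Series index,
# so it returns the row labels, not the dictionary keys.
index_labels = df['col1'].keys()

print(list(index_labels))             # the row labels: [0, 1, 2]
print(index_labels.equals(df.index))  # True: same labels as the DataFrame index
```

So the assignment in the question simply overwrites col1 with the row labels.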

df['col1'] = df['col1'].apply(lambda item: item.keys())

Minimal working example:

import pandas as pd

df = pd.DataFrame({'col1':[
   {10:24, 7:3},
   {5:24, 1:2, 7:8},
   {1:1},
]})


df['col1'] = df['col1'].apply(lambda item: item.keys())

print(df)

Result (the cells now hold dict_keys objects, which pandas displays like tuples):

        col1
0    (10, 7)
1  (5, 1, 7)
2        (1)
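If plain Python lists or the comma-separated strings from the question are needed, the lambda can convert the keys explicitly. This is a minimal sketch; the column names keys_list and keys_text are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# Convert each dictionary's keys to a regular list.
df['keys_list'] = df['col1'].apply(lambda item: list(item.keys()))

# Or join the keys into one string per row, as in the desired output.
# Iterating over a dictionary yields its keys.
df['keys_text'] = df['col1'].apply(lambda item: ', '.join(str(k) for k in item))

print(df[['keys_list', 'keys_text']])
```

Lists are usually easier to process further, while the joined strings are mainly useful for display.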

BTW:

A Series has a special accessor, .str, for working with strings. Many of its methods also work with lists and tuples, and some even with dictionaries.

df['col1'].str.keys() doesn't exist because strings have no keys, but on the original column of dictionaries df['col1'].str[10] returns, for each row, the value stored under key 10 (NaN where the key is missing):

0    24.0
1     NaN
2     NaN
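The same lookup can also be written without the .str accessor by calling dict.get on each cell, which avoids relying on how .str treats non-string values. This is a minimal sketch, assuming the column still holds the original dictionaries; the name values_for_10 is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# dict.get returns None when the key is absent, so rows without
# key 10 end up as missing values in the resulting Series.
values_for_10 = df['col1'].apply(lambda item: item.get(10))

print(values_for_10)
```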

CodePudding user response:

df["col1"] is not a dictionary; it is a pandas Series. Calling keys() on it returns the index of the Series, not the keys of the dictionaries stored inside it. You need to iterate over each row in the dataframe column and call keys() on it.

df['col1'] = [row.keys() for row in df["col1"]]

CodePudding user response:

DataFrame.apply according to the documentation:

Apply a function along an axis of the DataFrame.

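With the default axis=0, the function is called once per column and receives that column as a Series. You only want to change a single column, so make the applied function check the name of the current column and map over the dictionaries in it.

df.apply(lambda c: c.map(lambda d: d.keys()) if c.name == "col1" else c)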
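A runnable sketch of restricting DataFrame.apply to one column is shown below. Inside the applied function each column arrives as a Series, so the dictionaries are reached by mapping over the column's elements. The second column col2 and the helper name keys_for_col1 are hypothetical, added only to show that other columns pass through unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [{10: 24, 7: 3}, {5: 24, 1: 2, 7: 8}, {1: 1}],
    'col2': ['a', 'b', 'c'],  # hypothetical extra column for illustration
})

def keys_for_col1(column):
    """Return the dictionary keys for col1; leave other columns as they are."""
    if column.name == 'col1':
        # Map over the cells: each cell is one dictionary.
        return column.map(lambda d: list(d.keys()))
    return column

# Default axis=0: the function is called once per column.
result = df.apply(keys_for_col1)

print(result)
```

For a single column, calling .apply() or .map() on df['col1'] directly is simpler; the column-name check is mostly useful when the same function has to run over a whole DataFrame.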