I have a DataFrame column that contains dictionaries:
col1
{10:24, 7:3}
{5:24, 1:2, 7:8}
{1:1}
How can I extract the keys from the dictionary in each row? I need to get:
col1
10, 7
5, 1, 7
1
How can I do that? I tried df["col1"] = df["col1"].keys(), but it doesn't work and I don't know why.
CodePudding user response:
A pandas Series (like a DataFrame) has its own .keys() method, which returns its own index, not the keys of the dictionaries stored in its cells.
But you can use .apply() to run a function on every element in the column separately:
df['col1'] = df['col1'].apply(lambda item: item.keys())
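To see why the original attempt silently gives the wrong result, here is a small sketch (the variable names series_keys and first_cell_keys are just for illustration) comparing what .keys() returns on the column with what it returns on a single cell:

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# Called on the column, .keys() returns the row labels of the Series.
series_keys = df['col1'].keys()
print(list(series_keys))

# Called on one cell, .keys() returns the keys of that dictionary.
first_cell_keys = list(df['col1'].iloc[0].keys())
print(first_cell_keys)
```

So the assignment in the question replaces the dictionaries with row labels instead of raising an error.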
Minimal working example:
import pandas as pd
df = pd.DataFrame({'col1':[
{10:24, 7:3},
{5:24, 1:2, 7:8},
{1:1},
]})
df['col1'] = df['col1'].apply(lambda item: item.keys())
print(df)
Result (each cell now holds a dict_keys view, which pandas displays like a tuple):
col1
0 (10, 7)
1 (5, 1, 7)
2 (1)
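If you would rather store plain lists, or text such as "10, 7" as shown in the question, a possible variation is sketched below. The names as_lists and as_text are made up for this example, and it assumes every cell really is a dictionary:

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# Iterating over a dict yields its keys, so list(d) gives a list of keys.
as_lists = df['col1'].apply(lambda d: list(d))

# Join the keys into a comma-separated string for display.
as_text = df['col1'].apply(lambda d: ', '.join(str(k) for k in d))

print(as_lists)
print(as_text)
```

Lists are usually easier to work with later than dict_keys views, which stay tied to the original dictionaries.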
BTW:
A Series has a special accessor, .str, for working with strings. It can also work with lists and tuples, and some of its methods even work with dictionaries.
You can't use df['col1'].str.keys(), because strings don't have keys, but if you use df['col1'].str[10] then you get the value stored under key 10 from every dictionary that has that key (and NaN for the others):
0 24.0
1 NaN
2 NaN
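A minimal runnable sketch of that lookup, assuming the same sample data; .str.get(10) is the method form of the same operation, and the names via_index and via_get are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'col1': [
    {10: 24, 7: 3},
    {5: 24, 1: 2, 7: 8},
    {1: 1},
]})

# Look up key 10 in every dictionary; rows without that key get a missing value.
via_index = df['col1'].str[10]
via_get = df['col1'].str.get(10)

print(via_index)
print(via_get)
```

This only fetches one key per call, so it does not replace .apply() for listing all the keys.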
CodePudding user response:
df["col1"] is not a dictionary; it is a pandas Series. That explains why calling keys() on it does not give you the dictionary keys. You need to iterate over each row in the dataframe column and call keys() on it.
df['col1'] = [row.keys() for row in df["col1"]]
CodePudding user response:
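DataFrame.apply, according to the documentation:
Apply a function along an axis of the DataFrame.
With the default axis (axis=0), the function receives each column as a Series. You only want a single column, so make the applied function check the current column's name and call keys() on each of its cells (calling c.keys() on the column itself would only return its index).
df = df.apply(lambda c: c.apply(lambda d: d.keys()) if c.name == "col1" else c)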