Yesterday I was wondering how to access dictionary keys in a DataFrame column (link). The solution was to use .str[<key>] on the pandas Series and call the tolist() method afterwards. However, this does not work with object attributes.
My goal is to get a list of a specific attribute for each object in a pandas Series.
Here is some sample code with the solution I am currently using. I cast the whole object Series to a list and then iterate over it to get the specific attribute. Is there a way to access the attribute directly?
import pandas as pd

class User:
    def __init__(self, name):
        self.name = name

df = pd.DataFrame({
    'col1': [User("Juan"), User("Karen"), User("Vince")]
})
myObjects = df['col1'].tolist()
myNames = [u.name for u in myObjects]
# Desired output
['Juan', 'Karen', 'Vince']
And when I try the dictionary solution:
myNames = df["col1"].str['name'].tolist()
# Output
[nan, nan, nan]
CodePudding user response:
You can use attrgetter from the operator module in combination with pandas.Series.map. attrgetter("name") returns a function that, when called on each entry of col1, retrieves the attribute called name. It is equivalent to lambda x: x.name.
from operator import attrgetter
myNames = df["col1"].map(attrgetter("name")).tolist()
Output:
['Juan', 'Karen', 'Vince']
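To see what attrgetter does on its own, here is a minimal sketch outside pandas. It reuses the User class from the question on a plain Python list; the names users, get_name, via_attrgetter and via_lambda are only illustrative.

```python
from operator import attrgetter

class User:
    def __init__(self, name):
        self.name = name

users = [User("Juan"), User("Karen"), User("Vince")]

# attrgetter("name") builds a callable; get_name(u) returns u.name
get_name = attrgetter("name")
via_attrgetter = [get_name(u) for u in users]

# The same lookup written with an explicit lambda
name_of = lambda x: x.name
via_lambda = [name_of(u) for u in users]

print(via_attrgetter)                # ['Juan', 'Karen', 'Vince']
print(via_attrgetter == via_lambda)  # True
```

Passing get_name to Series.map applies the same call to each element of the column.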
CodePudding user response:
I would not recommend the .str method here, as it only works if you change the class (shown at the end of this answer). Alternatively, you can use apply() for this:
myNames = list(df['col1'].apply(lambda x: x.name))
List:
['Juan', 'Karen', 'Vince']
The .str accessor works on dictionaries, but not on arbitrary objects. If you make your object convertible to a dictionary, it will work. For example:
class User:
    def __init__(self, name):
        self.name = name

    def __iter__(self):
        yield 'name', self.name

df = pd.DataFrame({
    'col1': [User("Juan"), User("Karen"), User("Vince")]
})
result = list(df['col1'].map(dict).str['name'])
CodePudding user response:
You can also try:
df['col1'].map(lambda x: x.name).to_list()