Series.Items() Method Returns Zip Instead of Expected Output (Pandas)

Time:09-29

I am attempting to access the index of a series returned by the series.items() method in Pandas.

I was able to successfully do so when the series I generated was the result of a groupby:

import pandas as pd

df = pd.DataFrame({
    'species': ['dog', 'cat', 'horse', 'dog', 'cat', 'horse'],
    'weight': [44.5, 12.3, 600.2, 37.3, 8.5, 405.9]
})
       
df.groupby(['species'])['weight'].sum().items()

species
dog          81.8
cat          20.8
horse        1006.1

This is the expected output, which allows me to access the index (species) of the series.

When I use the same method .items() on the following series generated with the .any() method, I do not get the same output.

df.isna().any().items()
<zip at 0x89529b3100> # The address varies between runs; this is approximately what is shown.

I have verified through the pandas documentation that .any() produces a Series, and I have used type() to confirm that both results are of the Series type.

I am very confused why the .items() method does not return an output such as:

dog        False
cat        False
horse      False

I would greatly appreciate any guidance on what error I might be committing or what I am misunderstanding about Boolean series. Thank you.

Answer:

I think you might be mistaken about what .items() does. On a Series it returns a lazy iterable of (index label, value) tuples, meant to be consumed in a for loop or similar. That iterable is a zip object, which is why you see <zip at 0x...> rather than printed data. I recommend wrapping the result in list() to exhaust the iterable, just so you can get a better idea of what is going on:

$ list(df.groupby(['species'])['weight'].sum().items())

[('cat', 20.8), ('dog', 81.8), ('horse', 1006.1)]

Also, the output shown under your first block of code is not what .items() returns, which may be confusing you. It is the printed output of .sum() on its own, before .items() is called, so be careful there.
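To reach the index labels through .items(), one option is to unpack each pair in a for loop. The following is a minimal sketch that reuses the DataFrame from the question; the names totals and labels are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["dog", "cat", "horse", "dog", "cat", "horse"],
    "weight": [44.5, 12.3, 600.2, 37.3, 8.5, 405.9],
})

# Sum of weights per species; groupby sorts the group keys by default.
totals = df.groupby("species")["weight"].sum()

# .items() is lazy: iterating it yields (index label, value) pairs.
labels = []
for species, total in totals.items():
    labels.append(species)
    print(species, round(total, 1))

# If only the labels are needed, the .index attribute gives them directly.
print(list(totals.index))
```

If the goal is only to read the index, totals.index is usually simpler than going through .items() at all.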

So to hopefully answer your questions, you have the following:

Viewing sum of weights grouped by species

$ df.groupby(['species'])['weight'].sum()

species
cat        20.8
dog        81.8
horse    1006.1
Name: weight, dtype: float64

Checking if any of these are NaN

$ df.groupby(['species'])['weight'].sum().isna()

species
cat      False
dog      False
horse    False
Name: weight, dtype: bool
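As for the original df.isna().any() call: it checks each column, so the resulting Series is indexed by column name (species, weight) and not by the species values. The sketch below shows that, along with one possible way to get a per-species answer like the one expected in the question. The names has_nan, pairs and per_species are illustrative, and the grouped variant is an assumption about what was intended.

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["dog", "cat", "horse", "dog", "cat", "horse"],
    "weight": [44.5, 12.3, 600.2, 37.3, 8.5, 405.9],
})

# One boolean per column: the index holds the column names.
has_nan = df.isna().any()
print(has_nan)

# .items() on that Series pairs each column name with its boolean.
pairs = list(has_nan.items())
print(pairs)

# Assumed intent: does any weight within each species contain NaN?
per_species = df["weight"].isna().groupby(df["species"]).any()
print(per_species)
```

Printing the Series itself gives the labeled table, while .items() is only useful once it is iterated or wrapped in list().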