ValueError when trying to write a for loop in python

Time:12-03

When I run this:

import pandas as pd

data = {'id': ['earn', 'earn','lose', 'earn'],
    'game': ['darts', 'balloons', 'balloons', 'darts']
    }

df = pd.DataFrame(data)
print(df)
print(df.loc[[1],['id']] == 'earn')

The output is:
     id      game
0  earn     darts
1  earn  balloons
2  lose  balloons
3  earn     darts
     id
1  True

But when I try to run this loop:

for i in range(len(df)):  
     if (df.loc[[i],['id']] == 'earn'):  
         print('yes')  
     else:  
         print('no')

I get the error 'ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().' I am not sure what the problem is. Any help or advice is appreciated -- I am just starting.

I expected the output to be 'yes' from the loop. But I just got the 'ValueError' message. But, when I run the condition by itself, the output is 'True' so I'm not sure what is wrong.

CodePudding user response:

for i,row in df.iterrows():
    if row.id == "earn":
        print("yes")

CodePudding user response:

It's complicated. pandas is geared towards operating on entire groups of data, not individual cells. df.loc may return a new DataFrame, a Series, or a single value, depending on how it's indexed, and the == comparison then produces a DataFrame, a Series, or a scalar result accordingly.

If both indexers are lists, you get a new DataFrame, and the comparison result is also a DataFrame:

>>> foo = df.loc[[1], ['id']]
>>> type(foo)
<class 'pandas.core.frame.DataFrame'>
>>> foo
     id
1  earn
>>> foo == "earn"
     id
1  True

If one indexer is a scalar, you get a new Series:

>>> foo = df.loc[[1], 'id']
>>> type(foo)
<class 'pandas.core.series.Series'>
>>> foo
1    earn
Name: id, dtype: object
>>> foo == 'earn'
1    True
Name: id, dtype: bool

If both indexers are scalars, you get a single cell's value:

>>> foo = df.loc[1, 'id']
>>> type(foo)
<class 'str'>
>>> foo
'earn'
>>> foo == 'earn'
True

The last one is what you want. The first two produce containers whose truth value is ambiguous (you have to decide whether any or all of the values need to be True).

for i in range(len(df)):  
     if (df.loc[i,'id'] == 'earn'):  
         print('yes')  
     else:  
         print('no')
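
If you do end up with one of the container results, the ambiguity can be resolved explicitly. The following is only a sketch using the same df as in the question; the variable names are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'id': ['earn', 'earn', 'lose', 'earn'],
                   'game': ['darts', 'balloons', 'balloons', 'darts']})

frame_cmp = df.loc[[1], ['id']] == 'earn'   # 1x1 DataFrame of booleans
series_cmp = df.loc[[1], 'id'] == 'earn'    # length-1 Series of booleans

# A Series reduces to a single bool with .all() (or .any())
series_true = bool(series_cmp.all())

# A DataFrame can be reduced through its underlying NumPy array
frame_true = bool(frame_cmp.to_numpy().all())

# .item() pulls out the only element of a length-1 Series
single = bool(series_cmp.item())

if series_true and frame_true and single:
    print('yes')
else:
    print('no')
```

For a row-by-row check, though, the scalar form in the loop above is simpler.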

Or maybe not. Depending on what you intend to do next, you can skip the loop and create a Series of boolean values for all of the rows at once:

>>> earn = df['id'] == 'earn'
>>> earn
0     True
1     True
2    False
3     True
Name: id, dtype: bool

Now you can continue to make calculations on the DataFrame as a whole.
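
As a rough illustration of what such whole-column calculations might look like, here is a sketch that reuses the boolean Series to get the yes/no labels, count the matches, and select the matching rows without a loop. The names labels, count and earn_rows are just placeholders:

```python
import pandas as pd

df = pd.DataFrame({'id': ['earn', 'earn', 'lose', 'earn'],
                   'game': ['darts', 'balloons', 'balloons', 'darts']})

# One boolean per row, computed in a single vectorized comparison
earn = df['id'] == 'earn'

# Translate each boolean into the 'yes'/'no' text the loop was printing
labels = earn.map({True: 'yes', False: 'no'})

# True counts as 1, so the sum is the number of matching rows
count = int(earn.sum())

# Boolean indexing keeps only the rows where the mask is True
earn_rows = df[earn]

print(labels.tolist())
print(count)
print(earn_rows)
```

Which of these is useful depends on what the yes/no result is needed for afterwards.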
