I have this dataset in pandas. How can I use an 'if' statement on a column? For example:
if x_0.iloc[:, 0] > 2:
    print('good')
47    -0.78690    9.5663
638    2.72130    7.0500
113    4.21880    6.8162
96     2.95430    1.0760
106    2.31360   10.6651
What I want to do is iterate through each value in a column and run the conditional statement on it, but whether I use .iloc or any other method, I get this error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How do I go about solving this? Thanks!
CodePudding user response:
for row in x_0.itertuples():
    # row[0] is the index label, so row[1] is the first column
    if row[1] > 2:
        print('good')
This loop works, but a vectorized approach is preferable because it is faster than a Python for loop.
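One possible vectorized form, as a minimal sketch. It assumes x_0 looks like the data in the question (rebuilt by hand here); the column names 'a' and 'b' and the new 'label' column are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of x_0 from the question; column names are assumed.
x_0 = pd.DataFrame(
    {"a": [-0.78690, 2.72130, 4.21880, 2.95430, 2.31360],
     "b": [9.5663, 7.0500, 6.8162, 1.0760, 10.6651]},
    index=[47, 638, 113, 96, 106],
)

# Element-wise comparison: one True/False per row, no Python loop.
mask = x_0.iloc[:, 0] > 2

# np.where picks 'good' where the mask is True and 'bad' elsewhere.
x_0["label"] = np.where(mask, "good", "bad")

print(x_0)
```

Storing the result in a column, rather than printing inside a loop, keeps the outcome available for later filtering or counting.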
CodePudding user response:
import pandas as pd
df = pd.DataFrame([[47,-0.78690,9.5663],
[638, 2.72130, 7.0500],
[113, 4.21880, 6.8162],
[96, 2.95430, 1.0760],
[106, 2.31360, 10.6651]])
# df[1] > 2 is a boolean mask: True for each row where the 2nd column (label 1) is > 2
# df[df[1] > 2] applies the mask to df and gives you the filtered dataframe
# df[df[1] > 2][1] picks the 2nd column of the filtered dataframe
# .apply(lambda x: print(f'{x} good')) prints each remaining value followed by 'good'
df[df[1] > 2][1].apply(lambda x: print(f'{x} good'))
prints out:
2.7213 good
4.2188 good
2.9543 good
2.3136 good
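The same values can probably be selected without chained indexing, and printed without calling apply() purely for its side effect. This is only a sketch: it rebuilds the df from this answer so that it runs on its own, and the name selected is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([[47, -0.78690, 9.5663],
                   [638, 2.72130, 7.0500],
                   [113, 4.21880, 6.8162],
                   [96, 2.95430, 1.0760],
                   [106, 2.31360, 10.6651]])

# .loc takes the row mask and the column label in a single step.
selected = df.loc[df[1] > 2, 1]

# Iterating over a Series yields its values one by one.
for value in selected:
    print(f"{value} good")
```

Using .loc with both the mask and the column label avoids building an intermediate filtered dataframe.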
CodePudding user response:
x_0.iloc[:, 0] returns a Series (the pandas type for a single column), so you can't use it directly in an if statement: the comparison produces one True/False per row rather than a single boolean. You can use the apply() function or a for loop instead, as in the other answers above.
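To make this concrete, here is a small sketch of why the if fails and what the suggestions in the error message do. The Series s is a stand-in for x_0.iloc[:, 0], with values copied from the question; the variable names are assumptions for illustration.

```python
import pandas as pd

# Stand-in for x_0.iloc[:, 0].
s = pd.Series([-0.78690, 2.72130, 4.21880, 2.95430, 2.31360])

mask = s > 2  # boolean Series, one entry per row

try:
    if mask:  # pandas refuses to collapse many booleans into one
        print("good")
except ValueError as err:
    print("ValueError:", err)

# Collapse the mask explicitly when a single yes/no answer is wanted.
any_above = mask.any()  # True if at least one value is > 2
all_above = mask.all()  # True only if every value is > 2
print(any_above, all_above)
```

Use .any() or .all() when one decision for the whole column is needed, and a mask or a loop when each value should be handled separately.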