I have the following dataframe:
df = pd.DataFrame({'y': [4,3,6,1], 'x': [0,0,2,1]})
I would like to compute the ratio of the two columns. However, since there are some zeros in the denominator, I would like to handle them with an if/else
statement. The condition should be: if the denominator is 0, replace it with 1 and compute the ratio; otherwise compute the ratio as usual.
I tried this (and other variations), but it doesn't work (see the error below):
if df['x'] == 0:
    df['x'] = 1
    df['ratio'] = df['y']/df['x']
else:
    df['ratio'] = df['y']/df['x']
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can anyone help me?
Thanks!
CodePudding user response:
The issue here is that df['x'] is a pandas Series, so df['x'] == 0 is not evaluated once per value inside the if statement. The comparison returns a whole boolean Series, and because that Series contains multiple values, Python cannot reduce it to a single True or False. That ambiguity is what raises the error.
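To make the ambiguity concrete, here is a minimal sketch using the dataframe from the question. The names is_zero, raised, has_zero and all_zero are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'y': [4, 3, 6, 1], 'x': [0, 0, 2, 1]})

# The comparison is element-wise: it yields one boolean per row.
is_zero = df['x'] == 0
print(is_zero.tolist())

# An if statement needs a single boolean, so using the Series directly fails.
try:
    if is_zero:
        pass
    raised = False
except ValueError:
    raised = True

# .any() and .all() collapse the Series to one boolean explicitly.
has_zero = bool(is_zero.any())
all_zero = bool(is_zero.all())
print(raised, has_zero, all_zero)
```

This is why the error message suggests a.any() or a.all(): they state which single answer you actually want from the Series.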
For a fast, vectorized solution that doesn't need apply, you can use np.where() (after import numpy as np):
df['ratio'] = np.where(df['x'] == 0, df['y'], df['y']/df['x'])
Outputs:
y x ratio
0 4 0 4.0
1 3 0 3.0
2 6 2 3.0
3 1 1 1.0
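A variant of the same idea, as a rough sketch rather than the only way to do it: build a temporary denominator with the zeros replaced by 1 and divide by that, so no division by zero is evaluated at all and column x keeps its original values. The name denominator is just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'y': [4, 3, 6, 1], 'x': [0, 0, 2, 1]})

# Temporary denominator with zeros swapped for ones; df['x'] itself is untouched.
denominator = df['x'].replace(0, 1)
df['ratio'] = df['y'] / denominator
print(df)
```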
CodePudding user response:
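If you want to keep it simple and use if/else:
import pandas as pd
df = pd.DataFrame({'y': [4,3,6,1], 'x': [0,0,2,1]})
# .any() reduces the boolean Series to a single True/False
if (df['x'] == 0).any():
    # note: this overwrites the zeros in column x
    df['x'] = df['x'].replace(0, 1)
df['ratio'] = df['y']/df['x']
print(df)
Here is the output: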
y x ratio
0 4 1 4.0
1 3 1 3.0
2 6 2 3.0
3 1 1 1.0
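One thing worth keeping in mind when testing whether a Series contains a value: the in operator on a Series looks at the index labels, not the values. The sketch below uses a hypothetical Series s (not from the question) to show the difference, along with a value-based check using .any():

```python
import pandas as pd

s = pd.Series([5, 6, 7])  # values 5, 6, 7 with default index labels 0, 1, 2

label_hit = 0 in s                  # True: 0 is an index label, not a value
value_miss = 5 in s                 # False: 5 is a value, but not an index label
value_check = bool((s == 5).any())  # True: compares against the values
print(label_hit, value_miss, value_check)
```

With the question's dataframe the default index happens to contain 0, so an in test would appear to work even though it never inspects the values.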
CodePudding user response:
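You can do this (row by row with apply, which is slower than the vectorized options on large dataframes):
def safe_ratio(a, b):
    # treat a zero denominator as 1
    if b == 0:
        b = 1
    return a / b
df['ratio'] = df.apply(lambda row: safe_ratio(row['y'], row['x']), axis=1)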