New columns in Pandas Loop with ZeroDivisionError exception

Time:09-14

I am trying to create some new columns in a dataframe which are ratios of existing columns:

df[e] = df[a]/df[b]
df[f] = df[c]/df[d]
df[g] = df[a]/df[d]
df[h] = df[b]/df[c]
...

Since some values in the columns are zeros, the code above raises a ZeroDivisionError. I tried to handle it manually with:

try:
    df[e] = df[a]/df[b]
except ZeroDivisionError:
    df[e] = np.nan
try:
    df[f] = df[c]/df[d]
except ZeroDivisionError:
    df[f] = np.nan
try:
    df[g] = df[a]/df[d]
except ZeroDivisionError:
    df[g] = np.nan
...

But with this code all the rows in the new columns are then np.nan instead of only those which would raise the ZeroDivisionError.

So, how could I do this correctly? Possibly while also using a for loop over the new columns without having to do it manually for each new column like I tried in the second code block.

Thank you very much!

CodePudding user response:

Pandas should not raise a ZeroDivisionError on division by zero; for numeric columns it sets the result to inf (or NaN for 0/0) instead:

import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(np.random.choice(range(3), size=(5,4)), columns=list('abcd'))
df['e'] = df['a']/df['b']

output:

   a  b  c  d    e
0  2  0  2  2  inf
1  0  0  2  1  NaN
2  2  2  2  2  1.0
3  0  2  1  0  0.0
4  1  1  1  1  1.0
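
If the division really does raise a ZeroDivisionError, one possible cause is that the columns are stored with dtype object rather than a numeric dtype. The sketch below assumes that situation; the frame obj_df is made up for illustration and is not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns hold generic Python objects.
obj_df = pd.DataFrame({'a': [2, 0, 2], 'b': [0, 0, 2]}, dtype=object)

try:
    obj_df['a'] / obj_df['b']
    raised = False
except ZeroDivisionError:
    # Object columns are divided element by element with plain Python
    # arithmetic, so a zero divisor can raise instead of giving inf/NaN.
    raised = True

# Converting each column to a numeric dtype restores the inf/NaN behaviour.
num_df = obj_df.apply(pd.to_numeric)
ratio = num_df['a'] / num_df['b']
```

Checking df.dtypes first is a quick way to see whether this applies to your data.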

Note that you can also perform all computations in one shot:

np.random.seed(42)
df = pd.DataFrame(np.random.choice(range(3), size=(5,4)), columns=list('abcd'))

df.loc[:, ['e', 'f', 'g', 'h']] = df[['a', 'c', 'a', 'b']].div(df[['b', 'd', 'd', 'c']].values, axis=1).values

output:

   a  b  c  d    e    f    g    h
0  2  0  2  2  inf  1.0  1.0  0.0
1  0  0  2  1  NaN  2.0  0.0  0.0
2  2  2  2  2  1.0  1.0  1.0  1.0
3  0  2  1  0  0.0  inf  NaN  2.0
4  1  1  1  1  1.0  1.0  1.0  1.0
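
Since the question also asks for a for loop and for NaN rather than inf, here is a minimal sketch of one way to do that. The small frame and the ratios mapping (new column name to numerator and denominator columns) are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [2, 0, 2], 'b': [0, 0, 2],
                   'c': [2, 2, 2], 'd': [2, 1, 0]})

# new column -> (numerator column, denominator column)
ratios = {'e': ('a', 'b'), 'f': ('c', 'd'), 'g': ('a', 'd'), 'h': ('b', 'c')}

for new_col, (num, den) in ratios.items():
    # Numeric division by zero gives inf (or NaN for 0/0);
    # replace the infinities so every such row ends up as NaN.
    df[new_col] = (df[num] / df[den]).replace([np.inf, -np.inf], np.nan)
```

Only the rows with a zero divisor become NaN; all other rows keep their computed ratio.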

CodePudding user response:

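You can also iterate over the elements one by one. This sets the result to 0 wherever the divisor is zero (use np.nan instead of 0 if you want a missing value):

df['e'] = [x / y if y else 0 for x, y in zip(df['a'], df['b'])]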
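
A vectorized alternative to the element-wise loop is to mask the zero divisors before dividing. This is only a sketch on a made-up frame; safe_b is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [2, 0, 2, 1], 'b': [0, 0, 2, 4]})

# Keep the divisor only where it is non-zero; the other rows become NaN,
# and dividing by NaN gives NaN.
safe_b = df['b'].where(df['b'] != 0)
df['e'] = df['a'] / safe_b
```

This avoids the Python-level loop and yields NaN for both x/0 and 0/0.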