Home > front end >  How to calculate data using if/elif/else and minimum value in columns?
How to calculate data using if/elif/else and minimum value in columns?

Time:01-28

For example, I have the DataFrame:

a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
     {'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
     {'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
     {'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)
print(df)

I'm trying to calculate the values for the df['color_4'] column, and I'm trying to do it in one line of code. But maybe it is impossible.

The calculation logic is as follows, and it works for df['name'] == 'A' and 'B', but not for df['name'] == 'C' and should be supplemented

df['col_4'] = [i if i != 0 else x - (x * 0.75) for i, x, y in zip(df['col_3'], df['col_2'], df['col_1'])]

It is necessary to continue the calculation, if the value is in the df['col_3'] and df['col_2'] == 0, then y - (y * 0.75) -> df['col_1']

In the case of df['name'] == 'D', it is first necessary to compare the values in the columns df['col_1'] and df['col_2'] and choose the minimum value

I need the next result:

enter image description here

CodePudding user response:

You can simplify you logic to:

  • get the min of col_1/col_2
  • if col_3 is 0, use 0.25 * that min value
  • else use col_3
df['col_4'] = np.where(df['col_3'].eq(0),
                       df[['col_1', 'col_2']].min(axis=1).mul(0.25),
                       df['col_3'])

Output:

  name  col_1  col_2  col_3  col_4
0    A      5    3.0    1.5  1.500
1    B      4    2.5    0.0  0.625
2    C      8    0.0    0.0  0.000
3    D      7    9.0    0.0  1.750

CodePudding user response:

You could also use Pandas operations, but I let it in the style you want :

df['col_4'] = [t[0] if t[0]>0 else 
               (min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2]-0.75*t[2]) 
               for t in zip(df['col_3'], df['col_2'], df['col_1'])]

Complete script for checking

import pandas as pd

a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
     {'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
     {'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
     {'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)

df['col_4'] = [t[0] if t[0]>0 else 
               (min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2]-0.75*t[2]) 
               for t in zip(df['col_3'], df['col_2'], df['col_1'])]

print(df)

Result

  name  col_1  col_2  col_3  col_4
0    A      5    3.0    1.5  1.500
1    B      4    2.5    0.0  0.625
2    C      8    0.0    0.0  2.000
3    D      7    9.0    0.0  1.750
  • Related