For example, I have the DataFrame:
a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
{'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
{'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
{'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)
print(df)
I'm trying to calculate the values for the df['color_4']
column, and I'm trying to do it in one line of code. But maybe it is impossible.
The calculation logic is as follows, and it works for df['name'] == 'A' and 'B'
, but not for df['name'] == 'C'
and should be supplemented
df['col_4'] = [i if i != 0 else x - (x * 0.75) for i, x, y in zip(df['col_3'], df['col_2'], df['col_1'])]
It is necessary to continue the calculation, if the value is in the df['col_3'] and df['col_2'] == 0
, then y - (y * 0.75)
-> df['col_1']
In the case of df['name'] == 'D'
, it is first necessary to compare the values in the columns df['col_1'] and df['col_2']
and choose the minimum value
I need the next result:
CodePudding user response:
You can simplify you logic to:
- get the min of col_1/col_2
- if col_3 is 0, use 0.25 * that min value
- else use col_3
df['col_4'] = np.where(df['col_3'].eq(0),
df[['col_1', 'col_2']].min(axis=1).mul(0.25),
df['col_3'])
Output:
name col_1 col_2 col_3 col_4
0 A 5 3.0 1.5 1.500
1 B 4 2.5 0.0 0.625
2 C 8 0.0 0.0 0.000
3 D 7 9.0 0.0 1.750
CodePudding user response:
You could also use Pandas operations, but I let it in the style you want :
df['col_4'] = [t[0] if t[0]>0 else
(min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2]-0.75*t[2])
for t in zip(df['col_3'], df['col_2'], df['col_1'])]
Complete script for checking
import pandas as pd
a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
{'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
{'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
{'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)
df['col_4'] = [t[0] if t[0]>0 else
(min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2]-0.75*t[2])
for t in zip(df['col_3'], df['col_2'], df['col_1'])]
print(df)
Result
name col_1 col_2 col_3 col_4
0 A 5 3.0 1.5 1.500
1 B 4 2.5 0.0 0.625
2 C 8 0.0 0.0 2.000
3 D 7 9.0 0.0 1.750