I'm really new to python codes and having trouble with applying some of the answers for similar questions for my own case. Please help
So I have a dataframe with Column A and B, with numerous rows
Both contains negative and positive numbers, and I'm trying to make a new Column C with following conditions
If "Value of row 1, Column A is less than 0" & "Value of row 1, Column B is less than 0", return -100 in "row 1, Column C"
Elif "Value of row 1, Column A is less than 0" & "Value of row 1, Column B is greater than 0", return 100 in "row 1, Column C"
Elif "Value of row 1, Column A is greater than 0" & "Value of row 1, Column B is less than 0", return 100 in "row 1, Column C"
Else : return (Column A.Value / Column B.Value) in Column C
Thanks a lot
CodePudding user response:
I think you are looking for np.select
:
condlist = [(df['A'] < 0) & (df['B'] < 0),
(df['A'] < 0) & (df['B'] > 0),
(df['A'] > 0) & (df['B'] < 0)]
choicelist = [-100, 100, 100]
default = df['A'] / df['B']
df['C'] = np.select(condlist, choicelist, default)
Output:
>>> df
A B C
0 -0.002639 -1.374507 -100.000000
1 -0.696428 9.923431 100.000000
2 1.410547 3.804043 0.370802
3 1.504908 2.701486 0.557067
4 1.867486 1.889067 0.988576
5 -0.451066 -11.529716 -100.000000
6 5.713800 -7.678271 100.000000
7 -4.318760 5.082725 100.000000
8 5.169819 -4.122461 100.000000
9 0.094524 -1.916718 100.000000
Setup a MRE
import pandas as pd
import numpy as np
np.random.seed(2022)
df = pd.DataFrame(np.random.normal(0, 5, (10, 2)), columns=['A', 'B'])
CodePudding user response:
import pandas as pd
# initialize as list
data = [[1, 10], [3, 45], [56, -6], [-96, -65]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['A', 'B'])
print(df)
print('*****************')
df['C'] = None
if df.iat[0,0] < 0 and df.iat[0,1] < 0:
df.iat[0,2] = -100
elif df.iat[0,0] < 0 and df.iat[0,1] > 0:
df.iat[0,2] = 100
elif df.iat[0,0] > 0 and df.iat[0,1] < 0:
df.iat[0,2] = 100
else:
df['C'] = df['A']/df['B']
print('New dataFrame:')
print(df)
OUTPUT:
A B
0 1 10
1 3 45
2 56 -6
3 -96 -65
*****************
New dataFrame:
A B C
0 1 10 0.100000
1 3 45 0.066667
2 56 -6 -9.333333
3 -96 -65 1.476923
Let me know if any confusion!
CodePudding user response:
One way you could do this is by defining a function with the desired logic and then passing it to the apply
function along axis 1 (row-wise). In this case, that function might look like the following:
def f(x):
if x["A"] < 0 and x["B"] < 0:
return -100
elif x["A"] < 0 and x["B"] > 0:
return 100
elif x["A"] > 0 and x["B"] < 0:
return 100
else:
return x["A"] / x["B"]
We can then generate some sample data for testing purposes:
>>> import numpy as np
>>> import pandas as pd
>>> data = np.random.randint(-50, 50, size = (10, 2))
>>> df = pd.DataFrame(data, columns = ["A", "B"])
>>> df
A B
0 23 4
1 10 25
2 -14 45
3 31 32
4 49 32
5 -23 34
6 -10 -29
7 10 -19
8 -45 -48
9 31 -31
Finally, we can apply our function to the sample data:
>>> df["C"] = df.apply(f, axis = 1)
>>> df
A B C
0 23 4 5.75000
1 10 25 0.40000
2 -14 45 100.00000
3 31 32 0.96875
4 49 32 1.53125
5 -23 34 100.00000
6 -10 -29 -100.00000
7 10 -19 100.00000
8 -45 -48 -100.00000
9 31 -31 100.00000