Trying to add a constant array to a DataFrame using lambda.
This is my DataFrame d:
import pandas as pd
d = pd.DataFrame()
d['x'] = pd.Series([1, 2, 3, 4, 5, 6])
d['y'] = pd.Series([11, 22, 33, 44, 55, 66])
A classical loop approach that works as expected:
transformed = pd.DataFrame(columns=('x', 'y'))
for index, row in d.iterrows():
    transformed.loc[index] = [row[0] + 5, row[1] + 10]
print(transformed)
Produces:
x y
0 6 21
1 7 32
2 8 43
3 9 54
4 10 65
5 11 76
This is the lambda version:
print(d.apply(lambda x: x + [5, 10]))
However, it raises the error: ValueError: operands could not be broadcast together with shapes (6,) (2,)
After reading the pandas documentation, I understood that my lambda approach should work. Why doesn't it work?
CodePudding user response:
If the number of columns equals the length of the list, the simplest way is:
print(d + [5, 10])
x y
0 6 21
1 7 32
2 8 43
3 9 54
4 10 65
5 11 76
If there are more columns, select the ones you need with a list; the list of columns and the list of values must have the same length:
print(d[['x','y']] + [5, 10])
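As a minimal sketch of the column-selection case, assume the frame has a third column `z` (hypothetical, not in the question) that should be left out of the addition. The two-element list is matched against the two selected columns:

```python
import pandas as pd

d = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [11, 22, 33, 44, 55, 66],
    "z": [0, 0, 0, 0, 0, 0],  # hypothetical extra column, for illustration only
})

# Select exactly as many columns as there are values in the list.
# The list is applied across the columns: 5 goes to "x", 10 goes to "y".
shifted = d[["x", "y"]] + [5, 10]
print(shifted)
```

Adding the same list to the full three-column frame would fail, because the list length no longer matches the number of columns.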
CodePudding user response:
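apply works column-wise by default, because its axis argument defaults to 0. Each call of the lambda therefore receives a whole column of 6 values, which cannot be combined with a list of 2 values.
You need to specify axis=1 so that it is calculated row-wise:
>>> d.apply(lambda x: x + [5, 10], axis=1)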
x y
0 6 21
1 7 32
2 8 43
3 9 54
4 10 65
5 11 76
>>>
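The difference between the two axes can be checked directly. The sketch below is only an illustration (the variable names are made up here): it uses `len` to show what `apply` hands to the function in each case, then repeats the failing and the working call from the question.

```python
import pandas as pd

d = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [11, 22, 33, 44, 55, 66],
})

# axis=0 (default): the function receives each column, a Series of 6 values.
col_lengths = d.apply(len)

# axis=1: the function receives each row, a Series of 2 values.
row_lengths = d.apply(len, axis=1)

# Column-wise, a 6-element Series cannot be added to a 2-element list.
try:
    d.apply(lambda col: col + [5, 10])
    column_wise_failed = False
except ValueError:
    column_wise_failed = True

# Row-wise, each 2-element row lines up with the 2-element list.
row_wise = d.apply(lambda row: row + [5, 10], axis=1)

print(col_lengths.tolist(), row_lengths.tolist(), column_wise_failed)
print(row_wise)
```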
But tbh, in this situation you don't need apply anyway:
>>> d + [5, 10]
x y
0 6 21
1 7 32
2 8 43
3 9 54
4 10 65
5 11 76
>>>