I've been experimenting with lambda functions passed to the .apply() method.
I noticed that you can manipulate the values in individual rows, each with a different expression, as with this DataFrame and operation:
import pandas as pd
test = pd.DataFrame({'credit_score': [608, 675, 502, 699, 850], 'age': [42, 41, 42, 39, 43]}, index=['i1','i2','i3','i4','i5'])
result1 = test.apply(lambda x: (x.i1 *2, x.i2 * 3))
result1
which outputs:
But I haven't been able to do something similar across columns instead of rows.
For example, I thought I might be able to do so with:
result2 = test.apply(lambda x: (x.credit_score *2, x.age * 3), axis=1)
result2
But this approach outputs tuples in the non-index column:
Is there a way to preserve the columns and modify the values column by column (vertically), rather than row by row (horizontally)? Basically, I'm trying to understand how to do more comprehensive calculations in a single line of code, though I'd also be interested in other workarounds.
CodePudding user response:
You are doing two different operations in your example...
In the first case, lambda x: (x.i1 *2, x.i2 * 3) operates on the first and second rows only, because i1 and i2 are row index labels.
In the second case, with axis=1, the lambda operates on the first and second columns (NOT rows), so every row returns a tuple of two values.
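To make the difference concrete, here is a minimal sketch that records which labels the lambda sees on each axis. The names labels_axis0 and labels_axis1 are made up for this illustration; the frame is the one from the question.

```python
import pandas as pd

test = pd.DataFrame(
    {'credit_score': [608, 675, 502, 699, 850], 'age': [42, 41, 42, 39, 43]},
    index=['i1', 'i2', 'i3', 'i4', 'i5'],
)

# axis=0 (the default): the lambda is called once per column,
# and x is a Series labelled by the row index (i1 ... i5).
labels_axis0 = test.apply(lambda x: ','.join(x.index))

# axis=1: the lambda is called once per row,
# and x is a Series labelled by the column names.
labels_axis1 = test.apply(lambda x: ','.join(x.index), axis=1)

print(labels_axis0)
print(labels_axis1)
```

This is why x.i1 only works with the default axis, and x.credit_score only works with axis=1.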
I assume you were looking for something like this?
    credit_score  age  new_column
i1           608   42        1216
i2           675   41        1350
i3           502   42        1004
i4           699   39        1398
i5           850   43        1700
This will give you the above output:
test['new_column']=test.apply(lambda x: x.credit_score *2, axis=1)
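As a possible workaround for simple arithmetic like this, the same column can probably be built without apply at all, since operations on a whole column are vectorized. A minimal sketch using the frame from the question:

```python
import pandas as pd

test = pd.DataFrame(
    {'credit_score': [608, 675, 502, 699, 850], 'age': [42, 41, 42, 39, 43]},
    index=['i1', 'i2', 'i3', 'i4', 'i5'],
)

# Multiply the whole column at once; no lambda or axis argument is needed.
test['new_column'] = test['credit_score'] * 2

print(test)
```

Row-wise apply calls the Python function once per row, so the vectorized form is usually the faster choice on larger frames.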
Update: the OP was looking for apply to return results to multiple columns:
test[["ex1","ex2"]] = test.apply(lambda x: [x.credit_score *2, x.age * 3], axis=1, result_type="expand")
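For comparison, here is a small sketch that puts the result_type="expand" approach next to a vectorized alternative based on DataFrame.assign. The names expanded and vectorized are only for illustration, and the assign variant is a suggested workaround, not something the question asked for.

```python
import pandas as pd

test = pd.DataFrame(
    {'credit_score': [608, 675, 502, 699, 850], 'age': [42, 41, 42, 39, 43]},
    index=['i1', 'i2', 'i3', 'i4', 'i5'],
)

# Row-wise apply: result_type="expand" turns each returned list into columns.
expanded = test.apply(
    lambda x: [x.credit_score * 2, x.age * 3], axis=1, result_type="expand"
)
expanded.columns = ["ex1", "ex2"]

# Vectorized alternative: compute each new column from a whole column at once.
# assign returns a new frame and leaves test itself unchanged.
vectorized = test.assign(ex1=test["credit_score"] * 2, ex2=test["age"] * 3)

print(expanded)
print(vectorized)
```

Both variants should produce the same ex1 and ex2 values; the assign version keeps everything in one line and avoids calling a Python function for every row.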