I have a GeoDataFrame. I would like to create a new column from an existing one, subtracting one where the value is strictly greater than 0 and otherwise keeping the same value.
I have tried the following:
df['new_column'] = df.apply(lambda y: (df['old_column'].subtract(1)) if y['old_column'] > 0 else y['old_column'], axis=1)
It correctly distinguishes when old_column is greater than 0, but when it should subtract one it does something strange: instead of subtracting, it just gives a series of numbers, 3-2 2-1 1 1-1, things like that. Why is it doing that?
CodePudding user response:
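The error is that inside the lambda you need to take one cell, y['old_column'], and not the entire column, df['old_column']. With the entire column, each positive row gets a whole Series instead of a single number. In addition, a single cell is a NumPy scalar, which has no subtract method, so use the - operator: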
df['new_column'] = df.apply(lambda y: (y['old_column'] - 1) if y['old_column'] > 0 else y['old_column'], axis=1)
A simpler expression if only one column is used:
df['new_column'] = df['old_column'].apply(lambda y: y - 1 if y > 0 else y)
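To see the difference, here is a minimal sketch on made-up data (the four values are an assumption, not your frame). It compares what the original lambda returned for a positive row with what was intended, then applies the single-column fix:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"old_column": [-1, 0, 2, 3]})

# One row, as apply(axis=1) would pass it to the lambda
row = df.iloc[2]

# What the original lambda returned for a positive row:
# the entire column minus 1, i.e. a Series with one entry per row
whole_column = df["old_column"].subtract(1)

# What was intended: only this row's value minus 1, a single number
single_cell = row["old_column"] - 1

print(type(whole_column).__name__, len(whole_column))
print(single_cell)

# The single-column fix: the lambda now receives one value at a time
df["new_column"] = df["old_column"].apply(lambda y: y - 1 if y > 0 else y)
print(df)
```

Storing whole_column in each positive row is what produces the odd-looking runs of numbers in the output.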
CodePudding user response:
Instead of apply you can use np.where, which is faster for bigger dataframes and easier to read.
import numpy as np
import pandas as pd
df = pd.DataFrame({"old_column": [-3, -2, -1, 0, 1, 2, 3]})
df["new_column"] = np.where(df.old_column > 0, df.old_column-1, df.old_column)
df
   old_column  new_column
0          -3          -3
1          -2          -2
2          -1          -1
3           0           0
4           1           0
5           2           1
6           3           2
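As a quick sanity check, the sketch below compares np.where with the apply version on the same sample data, and with Series.mask, a pandas-only alternative that is not part of the answer above and is shown only as an option. It assumes a plain pandas DataFrame; a non-geometry column of a GeoDataFrame should behave the same way, but that is not tested here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"old_column": [-3, -2, -1, 0, 1, 2, 3]})
positive = df["old_column"] > 0  # boolean mask of the rows to change

# np.where: pick the decremented value where the mask is True
via_where = np.where(positive, df["old_column"] - 1, df["old_column"])

# Series.mask: replace values where the mask is True, keep the rest
via_mask = df["old_column"].mask(positive, df["old_column"] - 1)

# Row-by-row apply, usually the slowest of the three on large frames
via_apply = df["old_column"].apply(lambda y: y - 1 if y > 0 else y)

# All three approaches should give identical values
same = (via_mask == via_where).all() and (via_apply == via_where).all()
print(same)
```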
If this does not work for your df, please include an example.