I'm doing the following conditional fill in PySpark. How would I do this in pandas?
colIsAcceptable = when(col("var") < 0.9, 1).otherwise(0)
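For context, here is how that line would typically be used in PySpark (a minimal sketch, assuming a DataFrame named df with a numeric column 'var'):
from pyspark.sql.functions import col, when

# Add a 0/1 flag column based on the threshold
df = df.withColumn("colIsAcceptable", when(col("var") < 0.9, 1).otherwise(0))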
CodePudding user response:
You can use:
df['new_col'] = df['col'].lt(0.9).astype(int)
or with numpy.where:
import numpy as np
df['new_col'] = np.where(df['col'].lt(0.9), 1, 0)
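Both produce the same 0/1 column; a minimal runnable sketch, assuming hypothetical sample data:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [0.5, 0.9, 1.2]})    # hypothetical sample data
df['new_col'] = df['col'].lt(0.9).astype(int)  # -> 1, 0, 0 (0.9 is not < 0.9)
print(df)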
CodePudding user response:
You can use numpy.where:
import numpy as np
df['colIsAcceptable'] = np.where(df['col'] < 0.9, 1, 0)
CodePudding user response:
colIsAcceptable = df['var'].apply(lambda x: 1 if x < 0.9 else 0)
apply can be slow on very large datasets; vectorized approaches such as the ones above are more efficient, but apply is fine for general purposes.
CodePudding user response:
df['col2'] = 0
df.loc[df['col1'] < 0.9, 'col2'] = 1
This is a simple example of one way to do what you are asking.
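End to end, the pattern looks like this (a sketch with hypothetical sample data and column names):
import pandas as pd

df = pd.DataFrame({'col1': [0.5, 0.95]})  # hypothetical sample data
df['col2'] = 0                            # default value for every row
df.loc[df['col1'] < 0.9, 'col2'] = 1      # overwrite where the condition holds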
CodePudding user response:
Assuming the first column of your dataframe is named 'var' and the new column should be named 'colIsAcceptable', you can use the .map() function:
df['colIsAcceptable']= df['var'].map(lambda x: 1 if x<0.9 else 0)