I'm doing the following conditional fill in PySpark. How would I do this in pandas?
colIsAcceptable = when(col("var") < 0.9, 1).otherwise(0)
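For context, here is how that line would typically be used in PySpark (a minimal sketch, assuming a DataFrame named df with a numeric column 'var'):
from pyspark.sql.functions import col, when

# Add a 0/1 flag column based on the threshold
df = df.withColumn("colIsAcceptable", when(col("var") < 0.9, 1).otherwise(0))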
CodePudding user response:
You can use:
df['new_col'] = df['col'].lt(0.9).astype(int)
or with numpy.where:
import numpy as np
df['new_col'] = np.where(df['col'].lt(0.9), 1, 0)
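Both produce the same 0/1 column; a minimal runnable sketch, assuming hypothetical sample data:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [0.5, 0.9, 1.2]})    # hypothetical sample data
df['new_col'] = df['col'].lt(0.9).astype(int)  # -> 1, 0, 0 (0.9 is not < 0.9)
print(df)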
CodePudding user response:
You can use numpy.where:
import numpy as np
df['colIsAcceptable'] = np.where(df['col'] < 0.9, 1, 0)
CodePudding user response:
colIsAcceptable = df['var'].apply(lambda x: 1 if x < 0.9 else 0)
apply can be slow on very large datasets; vectorized approaches such as the ones above are more efficient, but apply is fine for general purposes.
CodePudding user response:
df['col2'] = 0
df.loc[df['col1'] < 0.9, 'col2'] = 1
This is a simple example of one way to do what you are asking.
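End to end, the pattern looks like this (a sketch with hypothetical sample data and column names):
import pandas as pd

df = pd.DataFrame({'col1': [0.5, 0.95]})  # hypothetical sample data
df['col2'] = 0                            # default value for every row
df.loc[df['col1'] < 0.9, 'col2'] = 1      # overwrite where the condition holds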
CodePudding user response:
Assuming the first column of your dataframe is named 'var' and the new column should be named 'colIsAcceptable', you can use the .map() function:
df['colIsAcceptable']= df['var'].map(lambda x: 1 if x<0.9 else 0)