df2.loc[(df2['feature'] == 0), 'package_loss'] = 1
My code is above. I am trying to set the column 'package_loss' to 1 wherever another column, 'feature', equals 0.
CodePudding user response:
Use dask.dataframe.DataFrame.where with the condition inverted, since where keeps values where the condition is True and replaces the rest:
df2['package_loss'] = df2['package_loss'].where(df2['feature'] != 0, 1)
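To check the semantics on a small example, here is a plain pandas sketch. The toy data is made up for illustration, and it assumes dask's where and mask behave like their pandas counterparts. It also shows the equivalent mask form, which takes the condition as written in the question:

```python
import pandas as pd

# Hypothetical toy data standing in for df2
df2 = pd.DataFrame({"feature": [0, 1, 0, 2], "package_loss": [5, 6, 7, 8]})

# where: keep values where the condition is True, replace the rest with 1
via_where = df2["package_loss"].where(df2["feature"] != 0, 1)

# mask: replace values where the condition is True with 1
via_mask = df2["package_loss"].mask(df2["feature"] == 0, 1)

print(via_where.tolist())
print(via_mask.tolist())
```

Both calls should give the same result, so which one to use is mostly a matter of which condition reads more naturally.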
CodePudding user response:
This is not as terse as @jezrael's answer, but it allows more flexible transformations using pandas syntax:
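from dask.datasets import timeseries

def add_col(df):
    # Each partition is a regular pandas DataFrame, so .loc assignment works here
    df = df.copy()
    mask = df["name"] == "Dan"
    df["new_column"] = 0
    df.loc[mask, "new_column"] = 1
    return df

df = timeseries()
# Apply add_col to every partition lazily
df2 = df.map_partitions(add_col)
df2.head()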