How to change the value of one column based on another column's value in a Dask DataFrame

Time:04-09

I have a huge DataFrame that I'm reading with Dask. In pandas, to change a column based on a condition on another column's value, I use:

df.loc[df['Ref'] != 'ABC', 'Ref2'] = np.nan

Then I forward-fill the changed column:

df['Ref2'] = df['Ref2'].fillna(method='ffill')

How can the same be achieved with a Dask DataFrame?

I'm new to Dask DataFrames.

CodePudding user response:

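Use dask.dataframe.Series.mask to replace the values with NaN where the condition holds, then forward-fill. Series.ffill is equivalent to fillna(method='ffill'), which recent pandas versions deprecate:

df['Ref2'] = df['Ref2'].mask(df['Ref'] != 'ABC').ffill()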

CodePudding user response:

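A different way to write this, split into the same two steps as the pandas version:

mask = df['Ref'] != 'ABC'
df['Ref2'] = df['Ref2'].mask(mask)  # Dask does not support assignment through .loc
df['Ref2'] = df['Ref2'].ffill()

Dask closely follows pandas syntax, so the pandas expression will often work unchanged. Item assignment through .loc is one of the exceptions, which is why the first step uses Series.mask instead.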
