I have a DataFrame that looks like this:
corpus zero_level_name time labels A B C
0 ff f 1 1
1 gg g G
2 hh h H 1 1 1
3 ii i I
4 jj j J 1
I want to put 0 in all the empty cells in columns A to C. Is it possible to do this in one go?
CodePudding user response:
Assuming you have either NaNs or empty strings in your DataFrame, you can use:
df.update(df.loc[:, 'A':'C'].replace('', 0).fillna(0))
NB. This produces no output; the DataFrame is modified in place.
Also note that changing the values does not change the dtypes. If you need integers, run this instead:
cols = df.loc[:, 'A':'C'].columns
df[cols] = df[cols].replace('', 0).fillna(0).astype(int)
Updated df:
corpus zero_level_name time labels A B C
0 ff f 1 1 0
1 gg g G 0 0 0
2 hh h H 1 1 1
3 ii i I 0 0 0
4 jj j J 1 0 0
If you only have empty strings:
df.update(df.loc[:, 'A':'C'].replace('', 0))
Or only NaNs:
df.update(df.loc[:, 'A':'C'].fillna(0))
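For anyone who wants to try the integer variant end to end, here is a minimal sketch. The sample data is hypothetical (a trimmed version of the question's frame, with blanks deliberately mixed between empty strings and NaN), so adjust it to your real columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: blanks are a mix of empty strings and NaN.
df = pd.DataFrame({
    "corpus": ["ff", "gg", "hh", "ii", "jj"],
    "A": [1, "", 1, np.nan, 1],
    "B": [1, np.nan, 1, "", ""],
    "C": ["", "", 1, np.nan, np.nan],
})

# Label-based slice picks every column from A through C inclusive.
cols = df.loc[:, "A":"C"].columns

# Empty strings -> 0, then NaN -> 0, then cast the result to integers.
df[cols] = df[cols].replace("", 0).fillna(0).astype(int)

print(df)
```

Depending on your pandas version, replace may emit a FutureWarning about downcasting; the resulting values are the same either way.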
CodePudding user response:
There is probably a better way to do this, but the first thing that comes to mind is:
pd.concat([df[['corpus', 'zero_level_name', 'time', 'labels']],df[['A','B','C']].fillna(0)], axis=1)
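I think that gets what you're looking for: the other columns stay as they are, the blanks in A to C are filled with 0, and everything comes back as a single new DataFrame. Note that fillna only handles NaN, not empty strings.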
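One possibly shorter route, if the blanks really are NaN: fillna also accepts a dict mapping column names to fill values, so the concat can be skipped. This is a sketch on hypothetical sample data, not a tested replacement for the line above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: missing cells are NaN.
df = pd.DataFrame({
    "corpus": ["ff", "gg"],
    "labels": [np.nan, "G"],
    "A": [1.0, np.nan],
    "B": [1.0, np.nan],
    "C": [np.nan, np.nan],
})

# Only the columns named in the dict are filled; 'labels' keeps its NaN.
# fillna returns a new DataFrame and leaves df untouched.
filled = df.fillna({"A": 0, "B": 0, "C": 0})

print(filled)
```

Like the concat version, this returns a new frame, so assign the result back if you want to keep it.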
CodePudding user response:
import numpy as np
import pandas as pd
df = pd.DataFrame(
data=np.array([
['ff', 'gg', 'hh', 'ii', 'jj'],
[None, 'g', 'h', 'i', 'j'],
['f', 'G', 'H', 'I', 'J'],
[None, None, None, None, None],
[1, None, 1, None, 1],
[1, None, 1, None, None],
[None, None, 1, None, None]
]).T,
columns=['corpus', 'zero_level_name', 'time', 'labels', 'A', 'B', 'C'],
)
# Fill the missing values (None/NaN) in columns A to C with 0
df[['A', 'B', 'C']] = df[['A', 'B', 'C']].fillna(0)
CodePudding user response:
Select the relevant columns and apply a mask (this replaces empty strings only, not NaN):
cols = ['A', 'B', 'C']
df[cols] = df[cols].mask(df[cols] == '', 0)
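The mask above matches empty strings only. If the blanks might also be NaN, the condition can be widened; the following is a sketch of that extension on hypothetical sample data, with `blank` being a helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: blanks are a mix of empty strings and NaN.
df = pd.DataFrame({
    "corpus": ["ff", "gg", "hh"],
    "A": [1, "", np.nan],
    "B": ["", 1, 1],
    "C": [np.nan, np.nan, 1],
})

cols = ["A", "B", "C"]

# True wherever a cell is NaN or an empty string.
blank = df[cols].isna() | (df[cols] == "")

# Replace the flagged cells with 0 and keep everything else.
df[cols] = df[cols].mask(blank, 0)

print(df)
```

The dtypes are not forced to integer here; append .astype(int) if you need that.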