Home > Blockchain >  replace NaNs with 0 for df columns where column name contains specific string (pandas)
replace NaNs with 0 for df columns where column name contains specific string (pandas)

Time:11-14

I have a dataframe as a result of a pivot which has several thousand columns (representing time-boxed attributes). Below is a much shortened version for resemblance.

d = {'incount - 14:00': [1,'NaN', 1,1,'NaN','NaN','NaN','NaN',1],
     'incount - 15:00': [2,1,2,'NaN','NaN','NaN',1,4,'NaN'],
     'outcount - 14:00':[2,'NaN',1,1,1,1,2,2,1]
'outcount - 15:00':[2,2,1,1,'NaN',2,'NaN',1,1]}
df = pd.DataFrame(data=d)

I want to replace the NaNs in columns that contain "incount" with 0 (leaving other columns untouched). I have tried the following but predictably it does not recognise the column name.

df['incount'] = df_all['incount'].fillna(0)

I need the ability to search the column names and only impact those containing a defined string.

CodePudding user response:

you can use:

loop_cols = list(df.columns[df.columns.str.contains('incount',na=False)]) #get columns containing incount as a list

#or
#loop_cols = [col for col in df.columns if 'incount' in col]
print(loop_cols)
'''
['incount - 14:00', 'incount - 15:00']
'''
for i in loop_cols:
    df[i]=df[i].fillna(0)

CodePudding user response:

try this:

m = df.columns[df.columns.str.startswith('incount')]
df.loc[:, m] = df.loc[:, m].fillna(0)
print(df)
  • Related