How can I fill the NaNs in all columns of a DataFrame except a few? For example, I have a DataFrame with 20 columns, and I want to fill the NaNs in all but two of them (in my case, NaNs are replaced by the column mean).
df = df.drop(['col1', 'col2'], axis=1).fillna(df.mean())
I tried this, but I don't think it's the best way to do it (also, I want to avoid the inplace=True argument).
Thanks
CodePudding user response:
You can select which columns to apply fillna to. Assuming you have 20 columns and want to fill all of them except 'col1' and 'col2', you can build a list of the columns to fill:
f = [c for c in df.columns if c not in ['col1','col2']]
df[f] = df[f].fillna(df[f].mean())
print(df)
   col1  col2      col3  col4  ...     col17  col18     col19  col20
0   1.0   1.0  1.000000   1.0  ...  1.000000      1  1.000000      1
1   NaN   NaN  2.666667   2.0  ...  2.000000      2  2.000000      2
2   NaN   3.0  3.000000   1.5  ...  2.333333      3  2.333333      3
3   4.0   4.0  4.000000   1.5  ...  4.000000      4  4.000000      4
In col3, the NaN in row 1 was replaced by 2.666667, which is the mean of that column; col1 and col2 keep their NaNs.
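As a possible alternative (a sketch, not part of the answer above): fillna also accepts a Series, and columns whose labels are absent from that Series are left unfilled. Computing the mean over only the columns you want to fill therefore gives a one-step version without inplace. The small frame below reuses four columns from the example data; the names skip, means and filled are hypothetical.

```python
import numpy as np
import pandas as pd

# Four columns taken from the example data in this thread
df = pd.DataFrame({
    "col1": [1.0, np.nan, np.nan, 4.0],
    "col2": [1.0, np.nan, 3.0, 4.0],
    "col3": [1.0, np.nan, 3.0, 4.0],
    "col4": [1.0, 2.0, np.nan, np.nan],
})

skip = ["col1", "col2"]               # columns to leave untouched (hypothetical name)
means = df.drop(columns=skip).mean()  # Series of means, indexed by the remaining columns
filled = df.fillna(means)             # only columns present in `means` get filled
```

This returns a new DataFrame and leaves df unchanged, so you can assign the result back to df if you prefer.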
# Initial DF:
{'col1': {0: 1.0, 1: nan, 2: nan, 3: 4.0},
'col2': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col3': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col4': {0: 1.0, 1: 2.0, 2: nan, 3: nan},
'col5': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col6': {0: 1, 1: 2, 2: 3, 3: 4},
'col7': {0: nan, 1: 2.0, 2: 3.0, 3: 4.0},
'col8': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col9': {0: 1, 1: 2, 2: 3, 3: 4},
'col10': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col11': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col12': {0: 1, 1: 2, 2: 3, 3: 4},
'col13': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col14': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
'col15': {0: 1, 1: 2, 2: 3, 3: 4},
'col16': {0: 1.0, 1: nan, 2: 3.0, 3: nan},
'col17': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col18': {0: 1, 1: 2, 2: 3, 3: 4},
'col19': {0: 1.0, 1: 2.0, 2: nan, 3: 4.0},
'col20': {0: 1, 1: 2, 2: 3, 3: 4}}
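To reproduce the result, the dictionary above can be passed to pd.DataFrame once nan is defined, for instance by importing it from NumPy. The following is a minimal sketch using only the first four columns of that dictionary; the variable name data is an assumption.

```python
from numpy import nan
import pandas as pd

# First four columns of the initial DF shown above
data = {
    'col1': {0: 1.0, 1: nan, 2: nan, 3: 4.0},
    'col2': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
    'col3': {0: 1.0, 1: nan, 2: 3.0, 3: 4.0},
    'col4': {0: 1.0, 1: 2.0, 2: nan, 3: nan},
}
df = pd.DataFrame(data)

# Same approach as in the answer: fill every column except col1 and col2
f = [c for c in df.columns if c not in ['col1', 'col2']]
df[f] = df[f].fillna(df[f].mean())
print(df)
```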