I have a DataFrame. How can I replace multiple unknown values in a pandas DataFrame column with a default value, while leaving NaN untouched?
df =
S.No.  Columns_A
1      python
2      java
3      NAN
4      C
5      python , java
How can I get the following updated DataFrame?
df_updated =
S.No.  Columns_A
1      Good
2      Good
3      NAN
4      Good
5      Good
CodePudding user response:
How about this:
import pandas as pd
import numpy as np
df = pd.DataFrame(data={
    'S.No.': [1, 2, 3, 4, 5],
    'ColumnA': ['python', 'java', np.nan, 'C ', 'python , java'],
})
df['ColumnA'] = df.apply(lambda row: np.nan if pd.isna(row['ColumnA']) else 'Good', axis=1)
result:
   S.No. ColumnA
0      1    Good
1      2    Good
2      3     NaN
3      4    Good
4      5    Good
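The same result can be reached without iterating over rows. The following is a minimal sketch, assuming the frame from this answer; it uses Series.map with na_action='ignore', which skips missing values and passes every other value to the function. The lambda that always returns 'Good' is only an illustration of the default value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'S.No.': [1, 2, 3, 4, 5],
    'ColumnA': ['python', 'java', np.nan, 'C ', 'python , java'],
})

# na_action='ignore' leaves NaN as it is; every other value becomes 'Good'.
df['ColumnA'] = df['ColumnA'].map(lambda value: 'Good', na_action='ignore')
print(df)
```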
CodePudding user response:
You just need to select the non-NaN values with .loc and set them to 'Good'. Pass the column label as well, so that only that column is overwritten:
df.loc[~df.ColumnA.isna(), 'ColumnA'] = 'Good'
Full example:
import pandas as pd
df = pd.DataFrame(data={
    'ColumnA': ['python', 'java', None, 'C ', 'python , java'],
})
df.loc[~df.ColumnA.isna(), 'ColumnA'] = 'Good'
df
  ColumnA
0    Good
1    Good
2     NaN
3    Good
4    Good
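A minimal sketch of the .loc approach on a frame that also carries the S.No. column from the question (the two-column frame is an assumption here, since the example above uses a single column). Giving .loc both the row mask and the column label keeps the assignment from touching S.No.; with a row mask alone, every column in the selected rows would be set to 'Good'.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'S.No.': [1, 2, 3, 4, 5],
    'ColumnA': ['python', 'java', np.nan, 'C ', 'python , java'],
})

# Row mask selects the non-NaN rows; the column label limits the write to ColumnA.
df.loc[df['ColumnA'].notna(), 'ColumnA'] = 'Good'
print(df)
```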
CodePudding user response:
Use where. It is vectorized, so it should be faster than the row-wise apply solution:
In [1443]: df['Columns_A'] = df['Columns_A'].where(df['Columns_A'].isna(), 'Good')
In [1444]: df
Out[1444]:
   S.No. Columns_A
0      1      Good
1      2      Good
2      3       NaN
3      4      Good
4      5      Good
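For reference, here is a self-contained sketch of the same idea, assuming the question's data. where keeps a value wherever the condition is True and substitutes the second argument elsewhere; mask is its mirror image, so the two calls below should produce the same column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'S.No.': [1, 2, 3, 4, 5],
    'Columns_A': ['python', 'java', np.nan, 'C', 'python , java'],
})

col = df['Columns_A']
via_where = col.where(col.isna(), 'Good')   # keep NaN, replace everything else
via_mask = col.mask(col.notna(), 'Good')    # replace where the value is not NaN

df['Columns_A'] = via_where
print(df)
```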