Update values of non NaN positions in a dataframe column

Time:07-11

I want to update the values of the non-NaN entries in a dataframe column.

import pandas as pd
from pprint import pprint
import numpy as np
d = {
    't': [0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A'],
    'value': [0.1, 0.2, np.nan, np.nan, 2, 3, np.nan],
}
df = pd.DataFrame(d)

The data for updating the value column is in a list:

new_value = [10, 15, 1, 18]

I can get a boolean mask of the non-NaN entries in the value column with:

df["value"].notnull()

I'm not sure how to assign the new values.

Suggestions will be really helpful.

CodePudding user response:

df.loc[df["value"].notna(), 'value'] = new_value

df["value"].notna() builds a boolean mask that selects the rows where value is not NaN; the second argument to .loc names the column to assign to (value in this case). The number of rows selected by the mask must match the number of values in new_value, otherwise the assignment fails.
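For completeness, here is a minimal sketch of the whole thing using the data from the question. The explicit length check and the name mask are my additions for illustration, not something pandas requires:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    't': [0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A'],
    'value': [0.1, 0.2, np.nan, np.nan, 2, 3, np.nan],
})
new_value = [10, 15, 1, 18]

# True for rows whose 'value' is not NaN (rows 0, 1, 4 and 5 here)
mask = df['value'].notna()

# Optional guard (an addition for illustration): fail early with a clear
# message if the list does not line up with the selected rows
if mask.sum() != len(new_value):
    raise ValueError(
        f"expected {mask.sum()} values, got {len(new_value)}"
    )

# Values are assigned in row order to the selected positions only
df.loc[mask, 'value'] = new_value
print(df)
```

The NaN rows (2, 3 and 6) are left untouched; the other four rows receive 10, 15, 1 and 18 in that order.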

CodePudding user response:

If you want to fill the NaN positions instead, you can first identify the row indices that contain NaN values.

import pandas as pd
from pprint import pprint
import numpy as np
d = {
    't': [0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A'],
    'value': [0.1, 0.2, np.nan, np.nan, 2, 3, np.nan],
}
df = pd.DataFrame(d)
print(df)
# Row positions of every NaN cell; only the 'value' column contains NaN here
r, _ = np.where(df.isna())
new_value = [10, 15, 18]  # one value per NaN (there are only 3)
df.loc[r, 'value'] = new_value
print(df)

Output of the second print(df):

   t  input type  value
0  0      2    A    0.1
1  1      2    A    0.2
2  2      2    A   10.0
3  0      2    B   15.0
4  2      2    B    2.0
5  0      2    B    3.0
6  1      4    A   18.0
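
One caveat: np.where(df.isna()) looks at every column, so if another column also contained NaN, its rows would be picked up too. A sketch of a variant that restricts the mask to the value column, reusing the question's data (fill_values and nan_mask are names chosen here for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    't': [0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A'],
    'value': [0.1, 0.2, np.nan, np.nan, 2, 3, np.nan],
})

# Only look at the 'value' column, so NaN in other columns is ignored
nan_mask = df['value'].isna()

fill_values = [10, 15, 18]  # one entry per NaN in 'value'
df.loc[nan_mask, 'value'] = fill_values
print(df['value'].tolist())
# → [0.1, 0.2, 10.0, 15.0, 2.0, 3.0, 18.0]
```

Because the mask is a boolean Series aligned on the index, this also avoids relying on positional row numbers matching the index labels.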