Modify dataframe in place using nan values from passed dataframe

Time:07-23

I have the following sample df:

df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','int','int','int']})

   Id Origin Origin2
0   1    int     pot
1   1    int     int
2   2    pot     int
3   3    pot     int

And I run the following replace command:

df.loc[df['Id'].eq(1)].apply(lambda x: x.replace('int', np.nan))

How can I update the original df in place so that both columns pick up the new values, matched by index? I tried df.update, but after checking the documentation I noticed that it doesn't overwrite non-NA values with NaN values.
To be clear: in the rows with index [0, 1] (where 'Id' == 1), the string 'int' should be replaced by np.nan.

Wanted result:

df = pd.DataFrame({'Id':[1,1,2,3],'Origin':[np.nan,np.nan,'pot','pot'],'Origin2':['pot',np.nan,'int','int']})

   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
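
On the df.update point, here is a minimal sketch of why it leaves the frame untouched in this case, assuming the sample df above. The names `replaced` and `updated` are made up for illustration. `DataFrame.update` only copies non-NA values from the frame it is given, so the NaN cells produced by the replace are skipped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

cols = ['Origin', 'Origin2']

# Rows with Id == 1, with 'int' swapped for NaN (illustrative name)
replaced = df.loc[df['Id'].eq(1), cols].replace('int', np.nan)

# Work on a copy so the original sample stays available for comparison
updated = df.copy()
updated.update(replaced)  # NaN cells in `replaced` are ignored by update

# The 'int' strings are still present because update never writes NaN
print(updated)
```

So any in-place fix has to assign the replaced values back explicitly rather than going through update.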

CodePudding user response:

Use mixed boolean/label indexing:

m = df['Id'].eq(1)

df.loc[m, ['Origin', 'Origin2']] = df.loc[m, ['Origin', 'Origin2']].replace('int', np.nan)

Output:

   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
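
If you would rather build the condition once and apply it in a single step, something along these lines with `DataFrame.mask` should also work. This is only a sketch under the assumption that the sample df is the one from the question; `hit` is an illustrative name, not anything from pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

cols = ['Origin', 'Origin2']
m = df['Id'].eq(1)

# True where a cell holds 'int' ...
hit = df[cols].eq('int')
# ... but only keep those hits in rows where Id == 1
hit.loc[~m] = False

# mask() replaces cells where the condition is True with NaN
df[cols] = df[cols].mask(hit)

print(df)
```

The boolean frame `hit` can be inspected before assigning, which makes it easier to check which cells are about to be blanked.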

CodePudding user response:

mask = df.Id.eq(1)
cols = ['Origin','Origin2']
df.loc[mask, cols] = df.replace('int', np.nan)

Output:

   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
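
The reason the full `df.replace(...)` can sit on the right-hand side is that `.loc` assignment aligns a DataFrame value on index and column labels, so only the selected rows and columns are taken from it. A small sketch to illustrate, assuming the question's sample df; the right-hand side is deliberately reversed (`rhs` is an illustrative name) to show that labels, not positions, decide what lands where.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

mask = df['Id'].eq(1)
cols = ['Origin', 'Origin2']

# Full-frame replace, then reverse the row order on purpose
rhs = df.replace('int', np.nan).iloc[::-1]

# .loc aligns rhs to the selected labels: rows 0 and 1, columns in cols
df.loc[mask, cols] = rhs

print(df)
```

Rows 2 and 3 keep their 'int' values even though `rhs` has NaN there, because those labels are not selected by the mask.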

CodePudding user response:

This might be uglier than you want, but it's all I have:

#  Recreate DF plus an extra column to check the code
df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','int','int','int'], 'Test':['pot','pot','int','int']})

# Relevant Code
index = [0,1]                         # Known index that you want to change 'int' to 'NaN'
columns = ["Origin", "Origin2"]       # Columns you want to change `int` to `NaN`
for column in columns:
    df.loc[index, column] = df.loc[index, column].replace('int', np.nan)  # .loc avoids chained assignment

Output:

   Id Origin Origin2 Test
0   1    NaN     pot  pot
1   1    NaN     NaN  pot
2   2    pot     int  int
3   3    pot     int  int

CodePudding user response:

This is a bit different from the other answers because I'm not using the Id column here, so every 'int' in both columns is replaced, not just the ones in rows where Id is 1.

df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','pot','int','int']})

# Keep the existing value unless it is 'int'
df['Origin'] = df.apply(lambda x: np.nan if x['Origin'] == 'int' else x['Origin'], axis='columns')
df['Origin2'] = df.apply(lambda x: np.nan if x['Origin2'] == 'int' else x['Origin2'], axis='columns')

Output:

   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     pot
2   2    pot     NaN
3   3    pot     NaN