So I have the following sample df:
df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','int','int','int']})
   Id Origin Origin2
0   1    int     pot
1   1    int     int
2   2    pot     int
3   3    pot     int
And I run the following replace command:
df.loc[df['Id'].eq(1)].apply(lambda x: x.replace('int', np.nan))
How can I update the original df with both modified columns, using the index?
I tried df.update,
but after checking the documentation I noticed that it does not overwrite non-NA values with NaN.
To clarify: in the rows at index [0, 1] (where 'Id' == 1), the string 'int' should be replaced by np.nan in both columns.
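A minimal sketch of the df.update behaviour described above, assuming a reasonably recent pandas version. The variable name `changed` is only illustrative. Depending on the pandas version, the replace call may emit a FutureWarning about downcasting, which does not affect the result:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

# The replaced subset: rows where Id == 1, 'int' turned into NaN
changed = df.loc[df['Id'].eq(1), ['Origin', 'Origin2']].replace('int', np.nan)

# update() only copies non-NA values from `changed`,
# so the NaN cells are skipped and df keeps its 'int' strings
df.update(changed)
print(df)
```

So the replacement itself works, but update() skips exactly the cells that became NaN.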
Wanted result:
df = pd.DataFrame({'Id':[1,1,2,3],'Origin':[np.nan,np.nan,'pot','pot'],'Origin2':['pot',np.nan,'int','int']})
   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
CodePudding user response:
Use mixed boolean/label indexing:
m = df['Id'].eq(1)
df.loc[m, ['Origin', 'Origin2']] = df.loc[m, ['Origin', 'Origin2']].replace('int', np.nan)
Output:
   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
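If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function name `blank_out` and its parameters are made up for illustration, and it uses `DataFrame.mask` instead of `replace` to set the matching cells to NaN:

```python
import numpy as np
import pandas as pd

def blank_out(df, row_mask, cols, value):
    """Return a copy of df with `value` set to NaN in `cols` for the rows in row_mask."""
    out = df.copy()
    sub = out.loc[row_mask, cols]
    # mask() puts NaN wherever the condition is True
    out.loc[row_mask, cols] = sub.mask(sub.eq(value))
    return out

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

result = blank_out(df, df['Id'].eq(1), ['Origin', 'Origin2'], 'int')
print(result)
```

Working on a copy leaves the original df untouched; drop the `copy()` call if in-place modification is preferred.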
CodePudding user response:
mask = df.Id.eq(1)
cols = ['Origin','Origin2']
df.loc[mask, cols] = df.replace('int', np.nan)
Output:
   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     NaN
2   2    pot     int
3   3    pot     int
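As far as I understand it, this works even though the right-hand side is the whole replaced frame, because assigning a DataFrame through .loc aligns it on index and column labels, so only the selected cells are written. A small sketch to check that, where the name `replaced` is just illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 3],
                   'Origin': ['int', 'int', 'pot', 'pot'],
                   'Origin2': ['pot', 'int', 'int', 'int']})

mask = df['Id'].eq(1)
cols = ['Origin', 'Origin2']

# 'int' is replaced in every row of this temporary frame
replaced = df.replace('int', np.nan)

# Only the cells selected by mask/cols are taken from `replaced`
df.loc[mask, cols] = replaced
print(df)
```

Rows 2 and 3 keep their 'int' values in Origin2, although `replaced` has NaN there.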
CodePudding user response:
This might be more ugly than you want but it's all I have:
# Recreate DF plus an extra column to check the code
df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','int','int','int'], 'Test':['pot','pot','int','int']})
# Relevant Code
index = [0, 1]  # Known index labels where 'int' should become NaN
columns = ["Origin", "Origin2"]  # Columns in which to replace 'int' with NaN
for column in columns:
    # Use a single .loc call; chained df[column][index] = ... may write to a copy
    df.loc[index, column] = df.loc[index, column].replace('int', np.nan)
Output:
   Id Origin Origin2 Test
0   1    NaN     pot  pot
1   1    NaN     NaN  pot
2   2    pot     int  int
3   3    pot     int  int
CodePudding user response:
This is a bit different from the other answers because I'm not using the Id
column here, so 'int' is replaced in every row, not only where Id is 1.
df = pd.DataFrame({'Id':[1,1,2,3],'Origin':['int','int','pot','pot'],'Origin2':['pot','pot','int','int']})
df['Origin'] = df.apply(lambda x: np.nan if x['Origin'] == 'int' else x['Origin'], axis='columns')
df['Origin2'] = df.apply(lambda x: np.nan if x['Origin2'] == 'int' else x['Origin2'], axis='columns')
Output:
   Id Origin Origin2
0   1    NaN     pot
1   1    NaN     pot
2   2    pot     NaN
3   3    pot     NaN