I want to replace values in several columns: if a value contains a "J", it should become "J", and if it contains an "N", it should become "N".
Since I do this for many columns, the code is very redundant, and I was wondering whether I can move the repeated procedure (only the column name changes) into a function.
Unfortunately, I have no idea how to pass the column name as a parameter to such a function. Is this possible? I am a Python beginner and have never seen this before!
Thanks a lot for your help!
import pandas as pd

df = pd.DataFrame({'A': ['J', 'Ja', 'N', 'N n'],
                   'B': ['Nö', 'Ja', 'N', 'N n'],
                   'C': ['Jup', 'Ja', 'N', 'N n']})
df.loc[df['A'].str.contains('J'), 'A'] = 'J'
df.loc[df['A'].str.contains('N'), 'A'] = 'N'
df.loc[df['B'].str.contains('J'), 'B'] = 'J'
df.loc[df['B'].str.contains('N'), 'B'] = 'N'
df.loc[df['C'].str.contains('J'), 'C'] = 'J'
df.loc[df['C'].str.contains('N'), 'C'] = 'N'
def clear_values(column_namestr):
    column_name = [column_namestr]
    print(type(column_name))
    #df.loc[df[column_name].str.contains('X'), 'spaltennamen'] = 'X'
    #df.loc[df[column_name].str.contains('o'), 'spaltennamen'] = 'o'

column_namestr = 'a'
df2 = clear_values(column_namestr)
But this does not work.
CodePudding user response:
If you want to replicate
df.loc[df['C'].str.contains('J'), 'C'] = 'J'
df.loc[df['C'].str.contains('N'), 'C'] = 'N'
with a function, you can do the following:
def clear_values(name1, name2):
    df.loc[df[name1].str.contains(name2), name1] = name2

clear_values('C', 'J')
clear_values('C', 'N')
If only the column name should vary, you can do this:
def clear_values(column):
    df.loc[df[column].str.contains('J'), column] = 'J'
    df.loc[df[column].str.contains('N'), column] = 'N'

clear_values('B')
clear_values('C')
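
Both functions above modify the global df in place. As a possible variation, here is a minimal sketch that takes the DataFrame as an argument, loops over several columns, and returns a cleaned copy. The parameter names frame, columns, and markers are made up for this illustration and are not part of the question. The sketch also assumes the markers should be matched literally (regex=False) and that missing values should be left untouched (na=False).

```python
import pandas as pd


def clear_values(frame, columns, markers=("J", "N")):
    """Return a copy of `frame` in which every cell of `columns` that
    contains one of `markers` is replaced by that marker."""
    result = frame.copy()
    for column in columns:
        for marker in markers:
            # regex=False matches the marker literally; na=False treats
            # missing values as "no match" so the mask stays purely boolean.
            mask = result[column].str.contains(marker, regex=False, na=False)
            result.loc[mask, column] = marker
    return result


df = pd.DataFrame({'A': ['J', 'Ja', 'N', 'N n'],
                   'B': ['Nö', 'Ja', 'N', 'N n'],
                   'C': ['Jup', 'Ja', 'N', 'N n']})

cleaned = clear_values(df, ['A', 'B', 'C'])
print(cleaned)
```

Because the function works on a copy, df itself stays unchanged, and the same call can be reused for any other DataFrame or list of columns.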