**Using Pandas 1.4.2, Python 3.9.12**
I have a data set where the column values are 0 or 1, which stand for 'No' and 'Yes', respectively:
   Scholarship  Hipertension  Diabetes  Alcoholism  SMS_received
0            0             1         0           0             0
1            0             0         0           0             0
2            0             0         0           0             0
3            0             0         0           0             0
4            0             1         1           0             0
I am attempting to create a custom function to replace the 0's and 1's all at once with 'No' and 'Yes', respectively.
What I have written at this point is as follows:
def replace_values(data_frame, column, being_replaced, replacement_value):
    data_frame[column] = data_frame[column].replace(
        to_replace=being_replaced, value=replacement_value)
    return data_frame
As an example, I would like to pass in all the column names, the values being replaced, and the replacement values, so the function does everything in one fell swoop, such as:
replace_values(df, [*list_of_columns*], [0, 1], ['No', 'Yes'])
Is this even possible? Do I need to put a loop in there as well? I have tried it a couple of times with only one column name as opposed to a list, and it works, but it replaces every 0 and 1 with 'No' and 'Yes' regardless of column, which is great, but not what I am trying to do. Any help is appreciated.
CodePudding user response:
Here are a couple of solutions.
Use replace, which returns a new DataFrame, so assign the result back:
df = df.replace({1: 'Yes', 0: 'No'})
Use where, which keeps each value where the condition in the first argument is true and replaces every other value with the second argument:
df = df.where(df == 1, 'No')
df = df.where(df == 'No', 'Yes')
Use boolean masking:
df[df == 0] = 'No'
df[df == 1] = 'Yes'
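All three snippets above act on the whole DataFrame. To answer the "only these columns" part of the question, here is a minimal sketch that keeps the call signature from the question and limits the replacement to the listed columns. Building the mapping with dict(zip(...)) and working on a copy are my own choices, not something the question requires, and the sample frame simply reproduces the five rows shown in the question.

```python
import pandas as pd


def replace_values(data_frame, columns, being_replaced, replacement_values):
    """Return a copy with the given values replaced in the listed columns only."""
    # Pair each old value with its replacement, e.g. {0: 'No', 1: 'Yes'}
    mapping = dict(zip(being_replaced, replacement_values))
    result = data_frame.copy()  # leave the caller's DataFrame untouched
    # Selecting with a list of columns restricts replace() to those columns,
    # so no explicit loop is needed
    result[columns] = result[columns].replace(mapping)
    return result


# Sample data copied from the question
df = pd.DataFrame({
    "Scholarship": [0, 0, 0, 0, 0],
    "Hipertension": [1, 0, 0, 0, 1],
    "Diabetes": [0, 0, 0, 0, 1],
    "Alcoholism": [0, 0, 0, 0, 0],
    "SMS_received": [0, 0, 0, 0, 0],
})

converted = replace_values(df, ["Hipertension", "Diabetes"], [0, 1], ["No", "Yes"])
print(converted)
```

Columns that are not listed (Scholarship, for instance) keep their original 0/1 values, and values that are not in the mapping are left unchanged.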
CodePudding user response:
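If True/False is acceptable in place of 'Yes'/'No', this should work for you:
def replace_values(data_frame):
    # astype(bool) turns 0 into False and any nonzero value into True
    return data_frame.astype(bool)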
Or, since you want to be able to specify the column names, you can try something like this:
def replace_values(data_frame, list_of_columns):
    # Convert only the listed columns; all other columns are left as they are
    for col in list_of_columns:
        data_frame[col] = data_frame[col].astype(bool)
    return data_frame
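Note that astype(bool) produces True/False rather than the 'Yes'/'No' strings the question asks for. If the string labels are needed, one option is to chain Series.map onto the boolean conversion. This is only a sketch: the function name to_yes_no and the decision to return a copy are my own assumptions, not part of the answer above.

```python
import pandas as pd


def to_yes_no(data_frame, list_of_columns):
    """Return a copy where the listed 0/1 columns hold 'No'/'Yes' strings."""
    result = data_frame.copy()  # do not modify the caller's DataFrame
    for col in list_of_columns:
        # astype(bool): 0 -> False, any nonzero value -> True
        # map(...): False -> 'No', True -> 'Yes'
        result[col] = result[col].astype(bool).map({False: "No", True: "Yes"})
    return result


# Two columns taken from the sample data in the question
sample = pd.DataFrame({
    "Diabetes": [0, 0, 0, 0, 1],
    "Alcoholism": [0, 0, 0, 0, 0],
})

labelled = to_yes_no(sample, ["Diabetes"])
print(labelled)
```

Because the conversion goes through bool first, any nonzero value ends up as 'Yes', so this only makes sense for columns that really contain just 0 and 1.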