Sorry for adding the screenshot. I downloaded the data from https://www.kaggle.com/datasets/rikdifos/credit-card-approval-prediction
Can someone tell me how to fill the NA values in the occupation column? I created a new variable, is_working, that indicates whether an applicant is working. I want to fill an NA with zero when the same observation is zero in the is_working column, and leave the other NA values as they are.
df <- data.frame(
  occupation = c("NA", "NA", "Drivers", "Accountants", "NA", "Drivers",
                 "Laborers", "Cleaning staff", "Drivers", "Drivers"),
  is_working = c("1", "0", "1", "1", "1", "1", "1", "1", "1", "1")
)
In short: if the value in the is_working column is zero, I want to set the NA value in occupation to zero. If the value in is_working is 1, I want to assign "other" to the NA value in occupation.
CodePudding user response:
library(dplyr)

df %>%
  mutate(
    # change string "NA" to missing values NA
    occupation = ifelse(occupation == "NA", NA, occupation),
    # replace NAs where is_working is 0 with 0
    occupation = ifelse(is.na(occupation) & is_working == 0, "0", occupation)
  )
#        occupation is_working
# 1            <NA>          1
# 2               0          0
# 3         Drivers          1
# 4     Accountants          1
# 5            <NA>          1
# 6         Drivers          1
# 7        Laborers          1
# 8  Cleaning staff          1
# 9         Drivers          1
# 10        Drivers          1
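
The code above only handles the is_working == 0 case and leaves the remaining values as NA. If you also want "other" for missing occupations where is_working is 1, as described at the end of the question, one possible sketch is to use dplyr's case_when(). This assumes is_working is stored as the strings "0" and "1", as in the example data frame; the name `result` is only for illustration.

```r
library(dplyr)

# Example data from the question; missing occupations are the string "NA"
df <- data.frame(
  occupation = c("NA", "NA", "Drivers", "Accountants", "NA", "Drivers",
                 "Laborers", "Cleaning staff", "Drivers", "Drivers"),
  is_working = c("1", "0", "1", "1", "1", "1", "1", "1", "1", "1")
)

result <- df %>%
  mutate(
    # turn the literal string "NA" into a real missing value
    occupation = na_if(occupation, "NA"),
    occupation = case_when(
      # not working and occupation missing -> "0"
      is.na(occupation) & is_working == "0" ~ "0",
      # working and occupation missing -> "other"
      is.na(occupation) & is_working == "1" ~ "other",
      # everything else keeps its original value
      TRUE ~ occupation
    )
  )

result
```

With the example data, rows 1 and 5 should become "other" and row 2 should become "0". Note that occupation stays a character column, so the zero is the string "0" rather than a number.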