Fill in a column that has false NAs from multiple choice question

Time:09-21

I have a dataset from a "choose all that apply" question, which is followed by a question asking which of the options is the most difficult for you. The issue I'm running into is that a respondent who chose only one option in the check-all-that-apply section isn't asked the follow-up question. That leaves the follow-up column filled with false NA values for people who chose only one option.

Is there an easy way to scan across each row of these columns, check whether the respondent chose only one option, and if so add that option to the second question's column? An example of the issue is in row 4 of the dataset, where a respondent chose "Managing healthcare activities through different online channels has been time-consuming.", but the Q24a column contains NA.

Current structure and desired structure: the highlighted rows have only one response and so weren't added to Q24a. I'd like to make a column like New Q24a on the right side that has those answers included.

dput output of the first 10 rows:

structure(list(Q24_1 = c("I have been concerned with the security of my money.", 
"I have been concerned with the security of my money.", "0", 
"0", "0", "0", NA, "0", "0", "0"), Q24_2 = c("I have been concerned with the safety of my medical records.", 
"I have been concerned with the safety of my medical records.", 
"0", "0", "I have been concerned with the safety of my medical records.", 
"0", NA, "I have been concerned with the safety of my medical records.", 
"0", "I have been concerned with the safety of my medical records."
), Q24_3 = c("Managing healthcare activities through different online channels has been time-consuming.", 
"0", "0", "Managing healthcare activities through different online channels has been time-consuming.", 
"0", "0", NA, "0", "0", "0"), Q24_4 = c("It has been difficult to contact service providers.", 
"0", "0", "0", "0", "0", NA, "0", "0", "0"), Q24_5 = c("It has been difficult to dispute charges.", 
"0", "0", "0", "It has been difficult to dispute charges.", "0", 
NA, "It has been difficult to dispute charges.", "0", "0"), Q24_6 = c("Using different online channels to manage healthcare activities has been a complicated task.", 
"Using different online channels to manage healthcare activities has been a complicated task.", 
"0", "0", "0", "0", NA, "Using different online channels to manage healthcare activities has been a complicated task.", 
"Using different online channels to manage healthcare activities has been a complicated task.", 
"0"), Q24_7 = c("It has been challenging to keep track of multiple bills.", 
"0", "0", "0", "0", "0", NA, "0", "0", "0"), Q24_8 = c("It has been difficult to follow appointments through different online channels.", 
"0", "0", "0", "It has been difficult to follow appointments through different online channels.", 
"0", NA, "0", "0", "0"), Q24_9 = c("I have been unwilling to share my medical information through different online channels.", 
"0", "0", "0", "0", "0", NA, "0", "0", "0"), Q24_10 = c("Other, please specify:", 
"0", "0", "0", "0", "0", NA, "0", "0", "0"), Q24_11 = c("I have not experienced any of these difficulties or problems when carrying out any action related to healthcare in the last 12 months.", 
"0", "I have not experienced any of these difficulties or problems when carrying out any action related to healthcare in the last 12 months.", 
"0", "0", "I have not experienced any of these difficulties or problems when carrying out any action related to healthcare in the last 12 months.", 
NA, "0", "0", "0"), Q24a = c("Q24a", "Using different online channels to manage healthcare activities has been a complicated task.", 
NA, NA, "It has been difficult to dispute charges.", NA, NA, 
"Using different online channels to manage healthcare activities has been a complicated task.", 
NA, NA)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", 
"data.frame"))

CodePudding user response:

Try these:

dplyr

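library(dplyr)
survey %>%
  rowwise() %>%
  mutate(
    # Where Q24a is missing, use the first selected option in the row
    # (NA and "0" entries are dropped before taking the first element).
    Q24a = if_else(is.na(Q24a),
                   setdiff(na.omit(c_across(Q24_1:Q24_11)), "0")[1],
                   Q24a)
  ) %>%
  ungroup() %>%
  select(Q24_1:Q24_3, Q24a)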
# # A tibble: 10 x 4
#    Q24_1                                                Q24_2                                                        Q24_3                                                                                     Q24a                                                       
#    <chr>                                                <chr>                                                        <chr>                                                                                     <chr>                                                      
#  1 I have been concerned with the security of my money. I have been concerned with the safety of my medical records. Managing healthcare activities through different online channels has been time-consuming. Q24a                                                       
#  2 I have been concerned with the security of my money. I have been concerned with the safety of my medical records. 0                                                                                         Using different online channels to manage healthcare activ~
#  3 0                                                    0                                                            0                                                                                         I have not experienced any of these difficulties or proble~
#  4 0                                                    0                                                            Managing healthcare activities through different online channels has been time-consuming. Managing healthcare activities through different online ch~
#  5 0                                                    I have been concerned with the safety of my medical records. 0                                                                                         It has been difficult to dispute charges.                  
#  6 0                                                    0                                                            0                                                                                         I have not experienced any of these difficulties or proble~
#  7 NA                                                   NA                                                           NA                                                                                        NA                                                         
#  8 0                                                    I have been concerned with the safety of my medical records. 0                                                                                         Using different online channels to manage healthcare activ~
#  9 0                                                    0                                                            0                                                                                         Using different online channels to manage healthcare activ~
# 10 0                                                    I have been concerned with the safety of my medical records. 0                                                                                         I have been concerned with the safety of my medical record~

(The last select is purely for visualization here to reduce the width.)

base R

# First selected option per row across the Q24_* columns (NA if none)
newQ24a <- apply(subset(survey, select = grep("Q24_", names(survey), value = TRUE)), 
                 1, function(z) setdiff(na.omit(z), "0")[1])
newQ24a
#  [1] "I have been concerned with the security of my money."                                                                                  
#  [2] "I have been concerned with the security of my money."                                                                                  
#  [3] "I have not experienced any of these difficulties or problems when carrying out any action related to healthcare in the last 12 months."
#  [4] "Managing healthcare activities through different online channels has been time-consuming."                                             
#  [5] "I have been concerned with the safety of my medical records."                                                                          
#  [6] "I have not experienced any of these difficulties or problems when carrying out any action related to healthcare in the last 12 months."
#  [7] NA                                                                                                                                      
#  [8] "I have been concerned with the safety of my medical records."                                                                          
#  [9] "Using different online channels to manage healthcare activities has been a complicated task."                                          
# [10] "I have been concerned with the safety of my medical records."                                                                          

# Keep existing Q24a answers; fill only the missing ones
survey$Q24a <- ifelse(is.na(survey$Q24a), newQ24a, survey$Q24a)
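
Both approaches above take the first selected option whenever Q24a is missing, which matches the sample data because every row with a missing Q24a has at most one option selected. If that might not hold in the full data, a stricter variant that fills Q24a only when exactly one option was chosen could look like the sketch below. The `toy` data frame and the names `opt_cols`, `picked`, `single` and `fill` are made up for illustration; it assumes, as in the dput output, that unselected options are coded as "0".

```r
# Hypothetical toy data standing in for the survey (not the asker's data).
toy <- data.frame(
  Q24_1 = c("A", "0", "0", NA),
  Q24_2 = c("B", "B", "0", NA),
  Q24_3 = c("0", "0", "0", NA),
  Q24a  = c(NA, NA, NA, NA_character_),
  stringsAsFactors = FALSE
)

# Option columns only (Q24_1, Q24_2, ...), excluding Q24a itself.
opt_cols <- grep("^Q24_[0-9]+$", names(toy), value = TRUE)

# For each row, collect the options that were actually selected.
picked <- lapply(seq_len(nrow(toy)), function(i) {
  z <- unlist(toy[i, opt_cols], use.names = FALSE)
  z[!is.na(z) & z != "0"]
})

# The single selected option per row, or NA when zero or several were selected.
single <- vapply(picked,
                 function(p) if (length(p) == 1) p else NA_character_,
                 character(1))

# Fill only rows where Q24a is missing AND exactly one option was chosen.
fill <- is.na(toy$Q24a) & lengths(picked) == 1
toy$Q24a[fill] <- single[fill]
```

Here row 1 has two selections and is left as NA, row 2 has exactly one and gets it copied into Q24a, and rows 3 and 4 have nothing to copy. Running `table(lengths(picked)[is.na(survey$Q24a)])` on the real data (with `picked` built from `survey` instead of `toy`) would show whether any multi-selection rows are affected at all.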