Home > Back-end >  Returning Unique Responses from different data frames in R
Returning Unique Responses from different data frames in R

Time:07-15

I have three different data frames containing questions from a survey given over three years. Over the years some of the questions have been edited slightly. I would like to create a new data frame which tells me which questions have been changed and gives me the wording of the question from all three years.

The data frames look something like:

2019 <- data.frame(V1 = c("Q1","Q2), V2= c("How many times have you done this?", "Is this your first time?"))
2020 <- data.frame(V1 = c("Q1","Q2), V2= c("How many times have you done this? (before this time)", "Is this your first time?"))``
2021 <- data.frame(V1 = c("Q1","Q2), V2= c("How many times have you done this( before this time)?", "Is this your first time?"))

I would like it to return:

data.frame(V1 = c("Q1"),2019 = c("How many times have you done this?"), 2020 = c("How many times have you done this? (before this time)"), 2021 = c("How many times have you done this( before this time)?"))

My first thought was to use something like anti_join() but I don't know how that works for three data frames.

CodePudding user response:

Row bind the data frames into a single frame (adding a year identifier (y)), group by Question (V1), and filter the groups where there are more than one unique wording of (V2):

dt %>% 
  group_by(V1) %>% 
  filter(length(unique(V2))>1)

Output:

      y V1    V2                                                   
  <dbl> <chr> <chr>                                                
1  2019 Q1    How many times have you done this?                   
2  2020 Q1    How many times have you done this? (before this time)
3  2021 Q1    How many times have you done this( before this time)?

Input:

dt = rbind(
  data.frame(y=2019,V1 = c("Q1","Q2"), V2= c("How many times have you done this?", "Is this your first time?")),
  data.frame(y=2020,V1 = c("Q1","Q2"), V2= c("How many times have you done this? (before this time)", "Is this your first time?")),
  data.frame(y=2021,V1 = c("Q1","Q2"), V2= c("How many times have you done this( before this time)?", "Is this your first time?"))
)
  • Related