Home > front end >  Reshape the data from three columns to two column
Reshape the data from three columns to two column

Time:09-23

I have a data like this :

   Response Prompt Prompt.Label Comments
1    Friend    who          si3      3_a
2         1    was       si4_en      3_a
3         1    was      si4_nag      3_a
4    Family    who          si3      3_b
5         1    was      si4_nag      3_b
6         1    was       si4_en      3_b
7     other    who          si3      4_a
8         1    was       si4_sp      4_a
9     other    who          si3      4_b
10        1    was       si4_en      4_b
11   Friend    who          si3      5_a
12        1    was       si4_en      5_a
13   Family    who          si3      5_b
14        1    was       si4_en      5_b
data<-structure(list(Response = c("Friend", "1", "1", "Family", "1", 
"1", "other", "1", "other", "1", "Friend", "1", "Family", "1"
), Prompt = c("who", "was", "was", "who", "was", "was", "who", 
"was", "who", "was", "who", "was", "who", "was"), Prompt.Label = c("si3", 
"si4_en", "si4_nag", "si3", "si4_nag", "si4_en", "si3", "si4_sp", 
"si3", "si4_en", "si3", "si4_en", "si3", "si4_en"), Comments = c("3_a", 
"3_a", "3_a", "3_b", "3_b", "3_b", "4_a", "4_a", "4_b", "4_b", 
"5_a", "5_a", "5_b", "5_b")), class = "data.frame", row.names = c(NA, 
-14L))

I am looking to reshape the data to have two columns, including Source, and Target. "Source" should have the values from "Response" when the "Prompt"=="who", and "Target" should be the values of "Prompt.Label" if "Prompt"= "was" for each value of the "Comments" columns. For instance, for the first row, Friend is going for "Source" , and si4_en is for "Target".

The desire data should be :

  Source  Target
1 Friend  si4_en
2 Friend si4_nag
3 Family si4_nag
4 Family  si4_en
5  other  si4_sp
6  other  si4_en
7 Friend  si4_en
8 Family  si4_en

CodePudding user response:

Here is one way to get the desired output:

library(dplyr)
data %>% 
  group_by(cumsum(Response!=1)) %>% 
  mutate(Response = ifelse(Response==1, first(Response), Response)) %>% 
  filter(row_number() !=1) %>%
  ungroup() %>% 
  select(Response, Target=Prompt.Label)
  Response Target 
  <chr>    <chr>  
1 Friend   si4_en 
2 Friend   si4_nag
3 Family   si4_nag
4 Family   si4_en 
5 other    si4_sp 
6 other    si4_en 
7 Friend   si4_en 
8 Family   si4_en

CodePudding user response:

I feel we can do it with some very simple tidyverse functions, as long as we do some basic data wrangling beforehand. This data.frame is not tidy. The Response variable is not self-explicable, and needs some contextual clues for understanding. I feel we may be better off by first tidying it up with tidyr::fill. Then we can do some simple select/filter.

data %>%
    mutate(Response = na_if(Response, 1)) %>%
    fill(Response) %>%
    filter(Prompt == 'was') %>%
    select(Source = Response, Target = Prompt.Label)

  Source  Target
1 Friend  si4_en
2 Friend si4_nag
3 Family si4_nag
4 Family  si4_en
5  other  si4_sp
6  other  si4_en
7 Friend  si4_en
8 Family  si4_en
  • Related