Home > other >  Moving observations from different groups into the same rows in R?
Moving observations from different groups into the same rows in R?

Time:11-20

I am currently struggling with moving data from a function calculation in R. I have 8 columns that have calculated a certain y value at the same point under different conditions (type), and my output looks like this:

ydata <- output %>% select(x, type, y_a,y_b,y_c,y_d,y_e,y_f,y_g,y_h)
ydata
#> x  type   y_a  y_b   y_c    ...
#> 1  1     1.3  <NA>  <NA>   
#> 2  1     2.7  <NA>  <NA>   
#> 3  1     4.4  <NA>  <NA>  
#> 1  2     <NA>  2.2   <NA>    
#> 2  2     <NA>  3.3   <NA>    
#> 3  2     <NA>  4.4   <NA>    
#> 1  3     <NA>  <NA>  3.3    
#> 2  3     <NA>  <NA>  7.6
#> 3  3     <NA>  <NA>  11.3
...

However, my desired output would look like this, which I am unsure how to produce:

#> x  y_a  y_b   y_c    ...
#> 1  1.3  2.2  3.3   
#> 2  2.7  3.3  7.6   
#> 3  4.4  4.4  11.3  
...

I've tried using summarize() to group the variables together by x, but doing so would not shift the column's data in the desired way so that the <NA> values would not appear in the data frame.

CodePudding user response:

Based on the example provided this may work

library(dplyr)

df %>% 
  group_by(x) %>% 
  summarize(across(contains("_"), ~ .x[.x != "<NA>"]))
# A tibble: 3 × 4
      x y_a   y_b   y_c
  <int> <chr> <chr> <chr>
1     1 1.3   2.2   3.3
2     2 2.7   3.3   7.6
3     3 4.4   4.4   11.3

Data

df <- structure(list(x = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), type = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), y_a = c("1.3", "2.7", "4.4",
"<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>"), y_b = c("<NA>",
"<NA>", "<NA>", "2.2", "3.3", "4.4", "<NA>", "<NA>", "<NA>"),
    y_c = c("<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "3.3",
    "7.6", "11.3")), class = "data.frame", row.names = c(NA,
-9L))

CodePudding user response:

This is a result of not constructing the y_a, y_b ... columns in the most efficient manner.

You can fix it doing this:

pivot_longer(df,cols = -(x:type)) %>% 
  filter(value!="<NA>") %>% 
  pivot_wider(x)

Output:

      x y_a   y_b   y_c  
  <int> <chr> <chr> <chr>
1     1 1.3   2.2   3.3  
2     2 2.7   3.3   7.6  
3     3 4.4   4.4   11.3 

or in a slightly less verbose way:

pivot_wider(filter(pivot_longer(df,cols = -(x:type)), value!="<NA>"),x)
  • Related