I am currently struggling with moving data from a function calculation in R. I have 8 columns that have calculated a certain y value at the same point under different conditions (type), and my output looks like this:
ydata <- output %>% select(x, type, y_a,y_b,y_c,y_d,y_e,y_f,y_g,y_h)
ydata
#> x type y_a y_b y_c ...
#> 1 1 1.3 <NA> <NA>
#> 2 1 2.7 <NA> <NA>
#> 3 1 4.4 <NA> <NA>
#> 1 2 <NA> 2.2 <NA>
#> 2 2 <NA> 3.3 <NA>
#> 3 2 <NA> 4.4 <NA>
#> 1 3 <NA> <NA> 3.3
#> 2 3 <NA> <NA> 7.6
#> 3 3 <NA> <NA> 11.3
...
However, my desired output would look like this, which I am unsure how to produce:
#> x y_a y_b y_c ...
#> 1 1.3 2.2 3.3
#> 2 2.7 3.3 7.6
#> 3 4.4 4.4 11.3
...
I've tried using summarize() to group the variables together by x, but doing so would not shift the column's data in the desired way so that the <NA> values would not appear in the data frame.
CodePudding user response:
Based on the example provided this may work
library(dplyr)
df %>%
group_by(x) %>%
summarize(across(contains("_"), ~ .x[.x != "<NA>"]))
# A tibble: 3 × 4
x y_a y_b y_c
<int> <chr> <chr> <chr>
1 1 1.3 2.2 3.3
2 2 2.7 3.3 7.6
3 3 4.4 4.4 11.3
Data
df <- structure(list(x = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), type = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), y_a = c("1.3", "2.7", "4.4",
"<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>"), y_b = c("<NA>",
"<NA>", "<NA>", "2.2", "3.3", "4.4", "<NA>", "<NA>", "<NA>"),
y_c = c("<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "3.3",
"7.6", "11.3")), class = "data.frame", row.names = c(NA,
-9L))
CodePudding user response:
This is a result of not constructing the y_a
, y_b
... columns in the most efficient manner.
You can fix it doing this:
pivot_longer(df,cols = -(x:type)) %>%
filter(value!="<NA>") %>%
pivot_wider(x)
Output:
x y_a y_b y_c
<int> <chr> <chr> <chr>
1 1 1.3 2.2 3.3
2 2 2.7 3.3 7.6
3 3 4.4 4.4 11.3
or in a slightly less verbose way:
pivot_wider(filter(pivot_longer(df,cols = -(x:type)), value!="<NA>"),x)