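I am trying to bind rows in R where the second data frame's column names carry a subset_ prefix (subset_a, subset_b) but should line up with columns a and b of the first.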
Code
df1 <- data.frame(a = c(1:5), b = c(6:10))
df2 <- data.frame(subset_a = c(11:15), subset_b = c(16:20), f = LETTERS[1:5])
result <- dplyr::bind_rows(df1, df2)
result
The code above produces the results below. However, I am looking for the merged rows shown in the expected results.
Current results
a b subset_a subset_b f
1 6 NA NA <NA>
2 7 NA NA <NA>
3 8 NA NA <NA>
4 9 NA NA <NA>
5 10 NA NA <NA>
NA NA 11 16 A
NA NA 12 17 B
NA NA 13 18 C
NA NA 14 19 D
NA NA 15 20 E
Expected results
a b c
1 6 NA
2 7 NA
3 8 NA
4 9 NA
5 10 NA
11 16 A
12 17 B
13 18 C
14 19 D
15 20 E
I believe it should be possible to do so using regular expressions in column names, but I'm not sure how.
Could someone please assist me in resolving this?
Answer:
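Something like this would work, starting from the result data frame built in the question:
library(dplyr)
# Fill the NAs in a and b from the subset_ columns,
# then keep only the merged columns and rename f to c.
expected_results <- result %>%
  mutate(a_new = if_else(is.na(a), subset_a, a)) %>%
  mutate(b_new = if_else(is.na(b), subset_b, b)) %>%
  select(a_new, b_new, f) %>%
  rename(a = a_new, b = b_new, c = f)
expected_results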
#> a b c
#> 1 1 6 <NA>
#> 2 2 7 <NA>
#> 3 3 8 <NA>
#> 4 4 9 <NA>
#> 5 5 10 <NA>
#> 6 11 16 A
#> 7 12 17 B
#> 8 13 18 C
#> 9 14 19 D
#> 10 15 20 E
Created on 2022-01-10 by the reprex package (v2.0.1)
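Since the question asks about using regular expressions on the column names, here is a minimal sketch of that alternative: strip the subset_ prefix from df2 before binding, so the columns already line up. It assumes dplyr 1.0.0 or later (for rename_with) and that every column to be matched differs only by the subset_ prefix; df2_renamed is just an illustrative name.

```r
library(dplyr)

df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(subset_a = 11:15, subset_b = 16:20, f = LETTERS[1:5])

# Drop the "subset_" prefix with a regular expression so that
# subset_a/subset_b become a/b, and rename f to c as in the expected output.
df2_renamed <- df2 %>%
  rename_with(~ sub("^subset_", "", .x), starts_with("subset_")) %>%
  rename(c = f)

# The column names now match, so bind_rows stacks them directly;
# c is filled with NA for the rows that come from df1.
result <- bind_rows(df1, df2_renamed)
result
```

This avoids creating the intermediate subset_a and subset_b columns in the first place, so no NA filling is needed afterwards.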