I have these data frames and I want to merge them with left_join, based on the peak column. However, every time I try, I get NA values.
Can you help me understand why?
library(tidyverse)
df1 <- tibble(peak = c("peak1", "peak2", "peak3"),
              coord1 = c(100, 500, 1000),
              coord2 = c(250, 700, 1250))
df1
#> # A tibble: 3 × 3
#>   peak  coord1 coord2
#>   <chr>  <dbl>  <dbl>
#> 1 peak1    100    250
#> 2 peak2    500    700
#> 3 peak3   1000   1250
df2 <- tibble(peak = c("peak5", "peak6", "peak7"),
              coord1 = c(120, 280, 600),
              coord2 = c(300, 400, 850))
df2
#> # A tibble: 3 × 3
#>   peak  coord1 coord2
#>   <chr>  <dbl>  <dbl>
#> 1 peak5    120    300
#> 2 peak6    280    400
#> 3 peak7    600    850
dplyr::left_join(df1, df2, by="peak")
#> # A tibble: 3 × 5
#>   peak  coord1.x coord2.x coord1.y coord2.y
#>   <chr>    <dbl>    <dbl>    <dbl>    <dbl>
#> 1 peak1      100      250       NA       NA
#> 2 peak2      500      700       NA       NA
#> 3 peak3     1000     1250       NA       NA
Created on 2022-12-04 with reprex v2.0.2
Answer:
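Assuming that your data is the same as in your previous question: left_join() matches rows on the values in peak, and df1 ("peak1" to "peak3") and df2 ("peak5" to "peak7") have no peak value in common, so every row of df1 gets NA for the df2 columns. If the rows should be paired by position instead, relabel peak by row number in each data frame before joining.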
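As a quick check of why the original join returns only NA, the sketch below compares the key values of the two data frames from the question. The names shared_keys and unmatched are just illustrative; intersect() and anti_join() are used here as one possible way to inspect the keys, not the only one.

```r
library(dplyr)

# df1 and df2 as defined in the question
df1 <- tibble(peak = c("peak1", "peak2", "peak3"),
              coord1 = c(100, 500, 1000),
              coord2 = c(250, 700, 1250))
df2 <- tibble(peak = c("peak5", "peak6", "peak7"),
              coord1 = c(120, 280, 600),
              coord2 = c(300, 400, 850))

# Key values that occur in both data frames (none here)
shared_keys <- intersect(df1$peak, df2$peak)
shared_keys
#> character(0)

# Rows of df1 that have no matching peak in df2 (all three here)
unmatched <- anti_join(df1, df2, by = "peak")
nrow(unmatched)
#> [1] 3
```

Since no key is shared, left_join() keeps every row of df1 and fills the df2 columns with NA.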
data <- list(df1, df2, df3)
data
[[1]]
# A tibble: 3 × 3
  peak  coord1 coord2
  <chr>  <dbl>  <dbl>
1 peak1    100    250
2 peak2    500    700
3 peak3   1000   1250

[[2]]
# A tibble: 3 × 3
  peak  coord1 coord2
  <chr>  <dbl>  <dbl>
1 peak5    120    300
2 peak6    280    400
3 peak7    900   1850

[[3]]
# A tibble: 3 × 3
  peak   coord1 coord2
  <chr>   <dbl>  <dbl>
1 peak8     900   2000
2 peak9    3000   3400
3 peak10   5600   5850
# Relabel peak by row position so the keys match, then join all data frames
map(data, ~ mutate(.x, peak = str_c("peak", row_number()))) %>%
  reduce(left_join, by = "peak")
# A tibble: 3 × 7
  peak  coord1.x coord2.x coord1.y coord2.y coord1 coord2
  <chr>    <dbl>    <dbl>    <dbl>    <dbl>  <dbl>  <dbl>
1 peak1      100      250      120      300    900   2000
2 peak2      500      700      280      400   3000   3400
3 peak3     1000     1250      900     1850   5600   5850
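The .x/.y suffixes in the result above do not say which data frame a column came from. One possible variation, sketched here under the assumption that tagging columns by list position is acceptable, renames the coordinate columns before joining. The "_df1", "_df2", "_df3" suffixes and the name joined are made up for illustration, and the df2/df3 values are copied from the printed output in this answer.

```r
library(dplyr)
library(purrr)

df1 <- tibble(peak = c("peak1", "peak2", "peak3"),
              coord1 = c(100, 500, 1000),
              coord2 = c(250, 700, 1250))
df2 <- tibble(peak = c("peak5", "peak6", "peak7"),
              coord1 = c(120, 280, 900),
              coord2 = c(300, 400, 1850))
df3 <- tibble(peak = c("peak8", "peak9", "peak10"),
              coord1 = c(900, 3000, 5600),
              coord2 = c(2000, 3400, 5850))

data <- list(df1, df2, df3)

# For an unnamed list, imap() passes the element's position as the second argument
joined <- imap(data, function(df, i) {
  df %>%
    # Relabel peak by row position so the keys match across data frames
    mutate(peak = paste0("peak", row_number())) %>%
    # Tag every non-key column with the position of its data frame
    rename_with(~ paste0(.x, "_df", i), .cols = -peak)
}) %>%
  reduce(left_join, by = "peak")

names(joined)
```

Because every non-key column name is unique after renaming, left_join() has no name clashes to resolve and adds no suffixes of its own.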