I have a dataframe with a column of ids, but for some rows there are multiple ids concatenated together. I want to merge this onto another dataframe using the id, and when the ids are concatenated it handles that and reflects it by having the values in the new columns added also concatenated.
For example I have dataframes
data <- data.frame(
id = c(1, 4, 3, "2,3", "1,4"),
value = c(1:5)
)
> data
id value
1 1 1
2 4 2
3 3 3
4 2,3 4
5 1,4 5
mapping <- data.frame(
id = 1:4,
name = c("one", "two", "three", "four")
)
> mapping
id name
1 1 one
2 2 two
3 3 three
4 4 four
I would like to end up with
id value name
1 1 1 one
2 4 2 four
3 3 3 three
4 2,3 4 two,three
5 1,4 5 one,four
CodePudding user response:
I don't think there's a good way to do this other than to separate, join, and re-concatenate:
library(dplyr)
library(tidyr)
data %>%
mutate(true_id = row_number()) %>%
separate_rows(id, convert = TRUE) %>%
left_join(mapping, by = "id") %>%
group_by(true_id, value) %>%
summarize(id = toString(id), name = toString(name), .groups = "drop")
# # A tibble: 5 × 4
# true_id value id name
# <int> <int> <chr> <chr>
# 1 1 1 1 one
# 2 2 2 4 four
# 3 3 3 3 three
# 4 4 4 2, 3 two, three
# 5 5 5 1, 4 one, four
I wasn't sure if your value
column would actually be unique, so I added a true_id
just in case.
CodePudding user response:
What about something like this. I could think of a few ways. One is longer, but much easier to follow and the other is short, but kind of a mess.
library(tidyverse)
#long and readable
data |>
mutate(tmp = row_number()) |>
mutate(id = str_split(id, ",")) |>
unnest_longer(id) |>
left_join(mapping |>
mutate(id = as.character(id)), by = "id") |>
group_by(tmp) |>
summarise(id = paste(id, collapse = ","),
value = value[1],
name = paste(name, collapse = ","))
#> # A tibble: 5 x 4
#> tmp id value name
#> <int> <chr> <int> <chr>
#> 1 1 1 1 one
#> 2 2 4 2 four
#> 3 3 3 3 three
#> 4 4 2,3 4 two,three
#> 5 5 1,4 5 one,four
#short and ugly
data |>
mutate(name = map_chr(id, \(x)paste(
mapping$name[which(as.character(mapping$id) %in% str_split(x, ",")[[1]])],
collapse = ",") ))
#> id value name
#> 1 1 1 one
#> 2 4 2 four
#> 3 3 3 three
#> 4 2,3 4 two,three
#> 5 1,4 5 one,four
CodePudding user response:
grep
ing the data$id
s out of the mapping$id
s.
mapply(\(x, y) toString(mapping$name[grep(sprintf('[%s]', gsub('\\D', '', x)), y)]),
data$id, list(mapping$id))
# 1 4 3 2,3 1,4
# "one" "four" "three" "two, three" "one, four"
In order not to have a space after the comma, use paste(., collapse=',')
instead of toString
.