Similar to this question, I want to concatenate a subset of column values. However, I also want to prefix each value with the name of the column, separate name and value by ": ", and control the character used to separate name/value pairs.
For example, this gives me concatenation of the values:
library(data.table)
d <- data.table(x = c(1, 2), y = c(3, 4))
labs_to_get <- c("x", "y")
d[
,
.(x, y, labs = do.call(paste, c(.SD, sep = ", "))),
.SDcols = labs_to_get
]
#> x y labs
#> 1: 1 3 1, 3
#> 2: 2 4 2, 4
But I'm looking for this:
d[
,
.(x, y, labs = c("x: 1, y: 3", "x: 2, y: 4"))
]
#> x y labs
#> 1: 1 3 x: 1, y: 3
#> 2: 2 4 x: 2, y: 4
Created on 2022-04-01 by the reprex package (v2.0.1)
CodePudding user response:
We could paste
the corresponding names with Map
d[,
.(x, y, labs = do.call(paste, c(Map(function(u, v)
paste0(v, ": ", u), .SD, labs_to_get), sep = ", "))),
.SDcols = labs_to_get
]
-output
x y labs
<num> <num> <char>
1: 1 3 x: 1, y: 3
2: 2 4 x: 2, y: 4
or another option is write.dcf
d[, labs := do.call(paste,
c(as.list(setdiff(capture.output(write.dcf(.SD)), "")),
sep = ", ")), 1:nrow(d)]
> d
x y labs
<num> <num> <char>
1: 1 3 x: 1, y: 3
2: 2 4 x: 2, y: 4
Or use apply
to loop over the rows
d[, labs := apply(.SD, 1, \(x) paste(names(x), x, sep = ": ",
collapse = ", ")), .SDcols = labs_to_get]
Or using tidyverse
library(dplyr)
library(purrr)
library(stringr)
d %>%
mutate(labs = invoke(str_c, c(across(all_of(labs_to_get),
~str_c(cur_column(), ": ", .x)), sep = ", ")))
x y labs
<num> <num> <char>
1: 1 3 x: 1, y: 3
2: 2 4 x: 2, y: 4
CodePudding user response:
Here is an alternative tidyverse
approach:
library(dplyr)
library(tidyr)
labs_to_get <- c("x", "y")
d %>%
mutate(across(everything(), ~case_when(cur_column() %in% labs_to_get ~ paste(cur_column(), ., sep = ": "))
, .names = 'new_{col}')) %>%
unite(labs, starts_with('new'), na.rm = TRUE, sep = ', ')
x y labs
1: 1 3 x: 1, y: 3
2: 2 4 x: 2, y: 4
CodePudding user response:
How about just:
d[,labs:=paste0(labs_to_get,":",c(.SD), collapse=", "), by=1:nrow(d), .SDcols=labs_to_get]
Output:
x y labs
<num> <num> <char>
1: 1 3 x:1, y:3
2: 2 4 x:2, y:4