For example, I have tried:
df <- tibble(x = c(1,0,0,1), y = c(1,1,0,0), z = c(0,1,1,0))
df <- df %>% mutate(pos_seq = c(x,y,z))
#or
df <- df %>% rowwise() %>% mutate(pos_seq = c(x,y,z))
both of which give errors due to different sizes.
And I have tried:
df <- tibble(x = c(1,0,0,1), y = c(1,1,0,0), z = c(0,1,1,0))
df <- df %>% mutate(pos_seq = list(c(x,y,z)))
#or
df <- df %>% rowwise() %>% mutate(pos_seq = list(c(x,y,z)))
which makes pos_seq a list of the full x column, y column, z column, not just the single row values.
The same problem occurs when I use a different way of 'aggregating' x/y/z, e.g. mutate(pos_str = paste(c(x,y,z), collapse = "")). I don't understand why something like sum() works on single row values, but other functions don't. Can I force it?
I want this result:
x | y | z | pos_seq | pos_str |
---|---|---|---|---|
1 | 1 | 0 | c(1,1,0) | "110" |
0 | 1 | 1 | c(0,1,1) | "011" |
0 | 0 | 1 | c(0,0,1) | "001" |
1 | 0 | 0 | c(1,0,0) | "100" |
In reality, I want to run a complex function on a dataset. The function needs to take multiple variables from each row, including characters and vectors, and some of its decisions rely on aggregates like "pos_seq" or "pos_str". This demo seems to be the root of my problems.
CodePudding user response:
You could use list and c in combination with rowwise for your pos_seq column, and use paste0 with collapse to create one string of the values for your pos_str column, like this:
library(dplyr)
df <- tibble(x = c(1,0,0,1), y = c(1,1,0,0), z = c(0,1,1,0))
df %>%
  rowwise() %>%
  mutate(pos_seq = list(c(x,y,z)),
         pos_str = paste0(pos_seq, collapse = ""))
#> # A tibble: 4 × 5
#> # Rowwise:
#>       x     y     z pos_seq   pos_str
#>   <dbl> <dbl> <dbl> <list>    <chr>
#> 1     1     1     0 <dbl [3]> 110
#> 2     0     1     1 <dbl [3]> 011
#> 3     0     0     1 <dbl [3]> 001
#> 4     1     0     0 <dbl [3]> 100
Created on 2022-07-11 by the reprex package (v2.0.1)
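To see why rowwise makes the difference, it may help to look at how long c(x,y,z) is in each case. The following is only a small sketch on the same demo data; the column name n and the Map-based alternative are my own additions for illustration, not part of the answer above.

```r
library(dplyr)

df <- tibble(x = c(1,0,0,1), y = c(1,1,0,0), z = c(0,1,1,0))

# Without rowwise(), x, y and z are whole columns inside mutate(),
# so c(x, y, z) has 12 elements (4 rows times 3 columns).
no_rowwise <- df %>%
  mutate(n = length(c(x, y, z)))

# With rowwise(), each expression is evaluated one row at a time,
# so c(x, y, z) has only 3 elements.
with_rowwise <- df %>%
  rowwise() %>%
  mutate(n = length(c(x, y, z))) %>%
  ungroup()

# A vectorised alternative that needs no rowwise():
# Map() walks the three columns in parallel, and paste0() without
# collapse pastes them element by element.
alt <- df %>%
  mutate(pos_seq = Map(c, x, y, z),
         pos_str = paste0(x, y, z))
```

Here no_rowwise$n is 12 in every row and with_rowwise$n is 3, which is the "full column versus single row" behaviour described in the question. The alt version should produce the same pos_seq and pos_str values as the rowwise answer.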
CodePudding user response:
Here is a one-liner that does the job.
cbind(df, list2DF(list(pos_seq=apply(df, 1, list))), pos_str=Reduce(paste0, df))
# x y z pos_seq pos_str
# 1 1 1 0 1, 1, 0 110
# 2 0 1 1 0, 1, 1 011
# 3 0 0 1 0, 0, 1 001
# 4 1 0 0 1, 0, 0 100
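For the more general case mentioned in the question (a function that needs several values from each row), a base R sketch could use mapply. The function describe_row and the column label below are hypothetical names made up for illustration; your real function would take their place.

```r
df <- data.frame(x = c(1,0,0,1), y = c(1,1,0,0), z = c(0,1,1,0))

# Hypothetical per-row function: it receives one value from each column,
# builds the row-level aggregates, and makes a decision based on them.
describe_row <- function(x, y, z) {
  pos_seq <- c(x, y, z)
  pos_str <- paste0(pos_seq, collapse = "")
  if (sum(pos_seq) > 1) paste(pos_str, "multi") else paste(pos_str, "single")
}

# mapply() calls describe_row once per row, passing the i-th element
# of each column as x, y and z.
df$label <- mapply(describe_row, df$x, df$y, df$z)
```

Because mapply hands over single values, functions like c and paste with collapse see only the current row inside describe_row, with no grouping needed.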