Right now I have two data tables in the form:
library(dplyr)
> lengths
chrom p q
1 2560 78934
2 1346 9087
3 678 7809
4 5467 1253
...
> my_data
mut pos chr arm
A 567 2 p
B 6890 2 q
C 978 3 q
D 4689 4 p
...
I want to add another column my_data$chrom.pos
, that is conditionally adding values from lengths$q
to the my_data$pos
values IF my_data$arm
is 'p', but not 'q'.
What I have thus far:
my_data %>% mutate(chrom.pos =
case_when (arm == "q" ~ pos,
arm == "p" ~ pos lengths[["q"]][chr])
)
While this handles the 'q' condition just fine, any case that is 'p' comes out as NA because I'm not getting the linking quite right.
> my_data
mut pos chr arm chrom.pos
A 567 2 p NA # Looking for 9087 567 = 9654
B 6890 2 q 6890
C 978 3 q 978
D 4689 4 p NA # Looking for 1253 4689 = 5942
CodePudding user response:
When I am running your code, I get the following results:
lengths <- data.frame(chrom = c(1,2,3,4),
p = c(2560,1346,678,5467),
q = c(78934,9087,7809,1253))
my_data <- data.frame(mut = c("A", "B", "C", "D"),
pos = c(567,6890,978,4689),
chr = c(2,2,3,4),
arm = c("p", "q", "q", "p"))
library(tidyverse)
my_data %>% mutate(chrom.pos =
case_when (arm == "q" ~ pos,
arm == "p" ~ pos lengths[["q"]][chr])
)
Output:
mut pos chr arm chrom.pos
1 A 567 2 p 9654
2 B 6890 2 q 6890
3 C 978 3 q 978
4 D 4689 4 p 5942
As you can see, it gives the desired output.
CodePudding user response:
One reason this couldn't work could be because of leading/lagging spaces in the 'arm' column values. We can remove those spaces with trimws
and it should work
library(dplyr)
my_data %>%
mutate(pos = trimws(pos),
chrom.pos =
case_when (arm == "q" ~ pos,
arm == "p" ~ pos lengths[["q"]][chr])
)
CodePudding user response:
You can merge two data with chr
and chrom
as keys.
library(dplyr)
my_data %>%
left_join(select(lengths, chrom, q), by = c("chr" = "chrom")) %>%
mutate(chrom.pos = if_else(arm == "p", pos q, pos)) %>%
select(-q)
# mut pos chr arm chrom.pos
# 1 A 567 2 p 9654
# 2 B 6890 2 q 6890
# 3 C 978 3 q 978
# 4 D 4689 4 p 5942