I have a nested list with multiple gene names in each sublist:
genes <- list(c("her15.1", "her15.2", "her4.2", "her4.1", "dla"), c("pdyn", "cbln1", "kctd4", "sox1b"), c("prph", "phox2a", "phox2bb", "tac1", "slc18a3a"))
genes <- setNames(genes, c("a", "b", "c"))
I also have a data frame, df, that looks like this:
V1 V2
8694 ENSDARG00000008131 sox1b
8855 ENSDARG00000010791 dla
9408 ENSDARG00000068691 kctd4
14309 ENSDARG00000091029 phox2bb
15322 ENSDARG00000006356 slc18a3a
16000 ENSDARG00000057296 cbln1
17897 ENSDARG00000014490 tac1
19208 ENSDARG00000007406 phox2a
19593 ENSDARG00000056732 her4.1
19594 ENSDARG00000094426 her4.2
19975 ENSDARG00000087798 pdyn
22102 ENSDARG00000028306 prph
22717 ENSDARG00000054560 her15.1
Each gene name in the list also appears in df$V2. I would like to replace each name in the list with the corresponding value from df$V1, matched on df$V2.
Thank you!
CodePudding user response:
I would create a translation vector with V1 as values and V2 as names, and use it inside lapply:
names.trans <- setNames(df$V1, df$V2)
lapply(genes, function(g) unname(names.trans[g]))
# $a
# [1] "ENSDARG00000054560" NA "ENSDARG00000094426"
# [4] "ENSDARG00000056732" "ENSDARG00000010791"
#
# $b
# [1] "ENSDARG00000087798" "ENSDARG00000057296" "ENSDARG00000068691"
# [4] "ENSDARG00000008131"
#
# $c
# [1] "ENSDARG00000028306" "ENSDARG00000007406" "ENSDARG00000091029"
# [4] "ENSDARG00000014490" "ENSDARG00000006356"
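To try the lookup above without the full data, here is a minimal self-contained sketch. The three-row `df_small` and the one-element `genes_small` are made up for illustration (they are not the asker's objects), and dropping the `NA` entries at the end is an optional extra step, not part of the original answer:

```r
# Hypothetical three-row subset of df, typed in by hand for illustration
df_small <- data.frame(
  V1 = c("ENSDARG00000010791", "ENSDARG00000056732", "ENSDARG00000054560"),
  V2 = c("dla", "her4.1", "her15.1"),
  stringsAsFactors = FALSE
)
# "her15.2" is deliberately absent from df_small$V2
genes_small <- list(a = c("her15.1", "her15.2", "dla"))

# Named vector: gene names are the names, Ensembl IDs are the values
names_trans <- setNames(df_small$V1, df_small$V2)

# Indexing by name keeps the order of each sublist;
# a name with no entry in the vector comes back as NA
ids <- lapply(genes_small, function(g) unname(names_trans[g]))

# Optional: drop the unmatched entries
ids_clean <- lapply(ids, function(x) x[!is.na(x)])

print(ids)
print(ids_clean)
```

Because the lookup is by exact name, a `.` in a gene name such as `her15.1` is treated literally.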
CodePudding user response:
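Something like this might work for you. Note that grep() treats the gene names as regular expressions (so "." matches any character and partial matches are possible), and the IDs come back in the row order of df rather than in the order of each sublist: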
sapply(genes, function(x) df$V1[grep(paste(x, collapse="|"), df$V2)])
$a
[1] "ENSDARG00000010791" "ENSDARG00000056732" "ENSDARG00000094426"
[4] "ENSDARG00000054560"
$b
[1] "ENSDARG00000008131" "ENSDARG00000068691" "ENSDARG00000057296"
[4] "ENSDARG00000087798"
$c
[1] "ENSDARG00000091029" "ENSDARG00000006356" "ENSDARG00000014490"
[4] "ENSDARG00000007406" "ENSDARG00000028306"
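If exact matches in the order of each sublist are needed, one possible variant swaps `grep()` for `match()`. This is only a sketch: `df_small` and `genes_small` are invented for the example, and `lapply()` is used instead of `sapply()` so the result is always a list, even when all sublists have the same length:

```r
# Hypothetical three-row subset of df, typed in by hand for illustration
df_small <- data.frame(
  V1 = c("ENSDARG00000008131", "ENSDARG00000068691", "ENSDARG00000087798"),
  V2 = c("sox1b", "kctd4", "pdyn"),
  stringsAsFactors = FALSE
)
genes_small <- list(b = c("pdyn", "kctd4", "sox1b"))

# match() compares whole strings, so "." is not a wildcard and
# partial matches cannot occur; results follow the sublist order
ids <- lapply(genes_small, function(x) df_small$V1[match(x, df_small$V2)])

print(ids)
```

Names that are missing from `V2` yield `NA` here, since `match()` returns `NA` for them.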