Replacing items in a nested list with corresponding items from a df column

I have a nested list with multiple gene names in each sublist:

genes = list(c("her15.1", "her15.2", "her4.2", "her4.1", "dla"), c("pdyn", "cbln1", "kctd4", "sox1b" ), c("prph","phox2a", "phox2bb",  "tac1", "slc18a3a"))
genes <- setNames(genes, c("a", "b", "c"))

I also have a data frame (df) that looks like this:

                      V1       V2
8694  ENSDARG00000008131    sox1b
8855  ENSDARG00000010791      dla
9408  ENSDARG00000068691    kctd4
14309 ENSDARG00000091029  phox2bb
15322 ENSDARG00000006356 slc18a3a
16000 ENSDARG00000057296    cbln1
17897 ENSDARG00000014490     tac1
19208 ENSDARG00000007406   phox2a
19593 ENSDARG00000056732   her4.1
19594 ENSDARG00000094426   her4.2
19975 ENSDARG00000087798     pdyn
22102 ENSDARG00000028306     prph
22717 ENSDARG00000054560  her15.1
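
For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the original row names are dropped, and both columns are assumed to be character vectors rather than factors.

```r
# Rebuild the example data frame from the printed table.
# Assumption: V1 and V2 are plain character columns; row names are not kept.
df <- data.frame(
  V1 = c("ENSDARG00000008131", "ENSDARG00000010791", "ENSDARG00000068691",
         "ENSDARG00000091029", "ENSDARG00000006356", "ENSDARG00000057296",
         "ENSDARG00000014490", "ENSDARG00000007406", "ENSDARG00000056732",
         "ENSDARG00000094426", "ENSDARG00000087798", "ENSDARG00000028306",
         "ENSDARG00000054560"),
  V2 = c("sox1b", "dla", "kctd4", "phox2bb", "slc18a3a", "cbln1", "tac1",
         "phox2a", "her4.1", "her4.2", "pdyn", "prph", "her15.1"),
  stringsAsFactors = FALSE
)
```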

Each item in the list is also in df$V2. I would like to replace each item in every sublist with the corresponding item from df$V1, matched on df$V2.

Thank you!

CodePudding user response：

I would create a translation vector with V1 as values and V2 as names and use it inside lapply. Genes that have no entry in df$V2 (here her15.2, which is not in the table shown) come back as NA:

names.trans <- setNames(df$V1, df$V2)
lapply(genes, function(g) unname(names.trans[g]))

# $a
# [1] "ENSDARG00000054560" NA                   "ENSDARG00000094426"
# [4] "ENSDARG00000056732" "ENSDARG00000010791"
# 
# $b
# [1] "ENSDARG00000087798" "ENSDARG00000057296" "ENSDARG00000068691"
# [4] "ENSDARG00000008131"
# 
# $c
# [1] "ENSDARG00000028306" "ENSDARG00000007406" "ENSDARG00000091029"
# [4] "ENSDARG00000014490" "ENSDARG00000006356"
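
The same lookup could also be written with match(), which returns the row position of each gene in df$V2. The sketch below uses a small stand-in subset of the question's table; the name `unmatched` is made up here for illustration, and it simply lists the symbols that would end up as NA.

```r
# Stand-in data: a three-row subset of the table from the question
df <- data.frame(
  V1 = c("ENSDARG00000010791", "ENSDARG00000056732", "ENSDARG00000054560"),
  V2 = c("dla", "her4.1", "her15.1"),
  stringsAsFactors = FALSE
)
genes <- list(a = c("her15.1", "her15.2", "her4.1", "dla"))

# match() gives the index of each gene in df$V2, or NA when it is absent,
# so the result keeps the order and length of each sublist
ids <- lapply(genes, function(g) df$V1[match(g, df$V2)])

# Symbols with no entry in df$V2 (hypothetical helper name)
unmatched <- lapply(genes, function(g) g[!g %in% df$V2])
```

Checking `unmatched` before relying on the translated list is a cheap way to see whether the mapping table is complete.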

CodePudding user response：

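Something like this might work for you. Note that the IDs come back in the row order of df rather than in the order of each sublist, and genes without a match are dropped instead of returned as NA: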

sapply(genes, function(x) df$V1[grep(paste(x, collapse="|"), df$V2)])

# $a
# [1] "ENSDARG00000010791" "ENSDARG00000056732" "ENSDARG00000094426"
# [4] "ENSDARG00000054560"
# 
# $b
# [1] "ENSDARG00000008131" "ENSDARG00000068691" "ENSDARG00000057296"
# [4] "ENSDARG00000087798"
# 
# $c
# [1] "ENSDARG00000091029" "ENSDARG00000006356" "ENSDARG00000014490"
# [4] "ENSDARG00000007406" "ENSDARG00000028306"
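
One caveat with the grep() approach is that the gene names are treated as regular expressions, so a pattern can match longer names that merely contain it (and "." matches any character). A possible alternative, sketched below with made-up IDs and a hypothetical extra gene "sox1", is exact matching with %in%. It also uses lapply() rather than sapply(), since sapply() may simplify the result to a matrix when every sublist returns the same number of matches.

```r
# Toy data with hypothetical IDs, chosen only to show the difference
df <- data.frame(
  V1 = c("ID_1", "ID_2", "ID_3"),
  V2 = c("sox1a", "sox1b", "tac1"),
  stringsAsFactors = FALSE
)
genes <- list(b = c("sox1b", "tac1"), d = c("sox1"))

# Regex matching: "sox1" also hits "sox1a" and "sox1b"
regex_hits <- lapply(genes, function(x) df$V1[grep(paste(x, collapse = "|"), df$V2)])

# Exact matching: only rows whose V2 equals one of the requested names
exact_hits <- lapply(genes, function(x) df$V1[df$V2 %in% x])
```

Like the grep() version, the %in% version returns IDs in df row order and silently omits genes that are not found.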