I have the following table:
library( tidyverse )
data = read.table(text="gene1
gene2
gene3", sep="\t", col.names = c("Protein"))
And the following two vectors:
genes = c("gene1", "gene3")
genes_names = c("name1", "name3")
Each item in genes_names corresponds to the item in genes at the same index.
Now I want to add a new column to data called ToLabel that holds the corresponding item from genes_names if the value in data$Protein matches an item in genes, and "no" otherwise.
data %>% mutate( ToLabel = ifelse( Protein %in% genes, genes_names, "no" ) )
This does not work as expected. My expected outcome:
Protein ToLabel
gene1 name1
gene2 no
gene3 name3
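As a side note on why the attempt above misbehaves: ifelse() recycles its yes argument to the length of the test and then picks values by position, not by matching value. The sketch below reproduces this on a plain character vector; protein is a hypothetical stand-in for data$Protein, assumed to contain no stray whitespace.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")
# Hypothetical stand-in for data$Protein (assumes clean values)
protein <- c("gene1", "gene2", "gene3")

test <- protein %in% genes  # TRUE FALSE TRUE
# ifelse() recycles `yes` to length(test) before selecting by position
recycled_yes <- rep(genes_names, length.out = length(test))
attempt <- ifelse(test, genes_names, "no")

print(recycled_yes)  # "name1" "name3" "name1"
print(attempt)       # "name1" "no"    "name1"
```

Row 3 picks up the third recycled element, "name1", rather than "name3", so a lookup by value is needed instead of a positional one.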
CodePudding user response:
Use recode:
data %>%
mutate(Protein = str_squish(Protein),
ToLabel = recode(Protein, !!!set_names(genes_names, genes), .default = 'no'))
Protein ToLabel
1 gene1 name1
2 gene2 no
3 gene3 name3
CodePudding user response:
Use a join if you want to replace multiple values by matching:
library(dplyr)
data %>%
mutate(Protein = trimws(Protein)) %>%
left_join(tibble(Protein = genes, ToLabel = genes_names), by = "Protein") %>%
mutate(ToLabel = coalesce(ToLabel, "no"))
-output
Protein ToLabel
1 gene1 name1
2 gene2 no
3 gene3 name3
CodePudding user response:
You can use your own code with some modifications:
library(tidyverse)
data |>
  rowwise() |>
  # Work one row at a time, so Protein is a single value in each test
  mutate(Protein = trimws(Protein),
         ToLabel = ifelse(Protein %in% genes,
                          genes_names[which(Protein == genes)],
                          "no")) |>
  ungroup()
- output
# A tibble: 3 × 2
Protein ToLabel
<chr> <chr>
1 gene1 name1
2 gene2 no
3 gene3 name3
CodePudding user response:
A base R option using merge and replace:
transform(
merge(
transform(data, Protein = trimws(Protein)),
data.frame(
genes = c("gene1", "gene3"),
genes_names = c("name1", "name3")
),
by.x = "Protein",
by.y = "genes",
all.x = TRUE
),
genes_names = replace(genes_names, is.na(genes_names), "no")
)
gives
Protein genes_names
1 gene1 name1
2 gene2 no
3 gene3 name3
CodePudding user response:
You can use match():
ToLabel <- genes_names[match(trimws(data$Protein), genes)]
ToLabel[is.na(ToLabel)] <- "no"
data$ToLabel <- ToLabel
data
#> Protein ToLabel
#> 1 gene1 name1
#> 2 gene2 no
#> 3 gene3 name3
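A closely related base R variant, sketched here as an illustration rather than a definitive answer, is to index a named lookup vector by the protein values. Names that are not present come back as NA, which can then be replaced with "no". The data frame below is a hypothetical reconstruction of the question's table, with some stray whitespace assumed in order to show why trimws() is used.

```r
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")
# Hypothetical reconstruction of the question's data (whitespace is an assumption)
data <- data.frame(Protein = c("gene1 ", "gene2", " gene3"))

# Named vector: names are the keys (genes), values are the labels
lookup <- setNames(genes_names, genes)

data$Protein <- trimws(data$Protein)
# Indexing by name returns NA for keys missing from the lookup
ToLabel <- unname(lookup[data$Protein])
ToLabel[is.na(ToLabel)] <- "no"
data$ToLabel <- ToLabel
print(data)
```

This does the same job as match(), but keeps the key-to-label mapping in a single object.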