I have a set, W <- c("a","b","c")
And a dataframe
df <- data.frame(col1 = c(1,2,3), col2 = c("a","b","c"), col3 =c("t","b","p"))
I want to run the %in% operator on multiple columns to return TRUE/FALSE for columns 2 and 3. I want column 1 to remain the same.
I know I can do
df$col2 <- df$col2 %in% W
and
df$col3 <- df$col3 %in% W
I'm unsure how I can do this in one line. I am also fairly new to R and programming in general.
CodePudding user response:
You could apply a function across both of the columns:
library(tidyverse)
W <- c("a","b","c")
df <- tibble(col1 = c(1,2,3), col2 = c("a","b","c"), col3 =c("t","b","p"))
df |>
  mutate(across(c(col2, col3), \(x) x %in% W))
#> # A tibble: 3 x 3
#> col1 col2 col3
#> <dbl> <lgl> <lgl>
#> 1 1 TRUE FALSE
#> 2 2 TRUE TRUE
#> 3 3 TRUE FALSE
CodePudding user response:
You can try:
df[,2:3] <- apply(df[,2:3], 2, function(x) x %in% W)
# col1 col2 col3
#1 1 TRUE FALSE
#2 2 TRUE TRUE
#3 3 TRUE FALSE
The 2 in apply applies the function to each column (1 would apply it to each row). df[,2:3] selects only the second and third columns (df[,-1] would also work).
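To see what the margin argument changes, here is a small sketch using the same W and df as in the question. The names by_col and by_row are only for illustration. With a margin of 1 the function runs on each row and apply returns each row's result as a column, so the result needs t() to line up with the original rows.

```r
W <- c("a", "b", "c")
df <- data.frame(col1 = c(1, 2, 3),
                 col2 = c("a", "b", "c"),
                 col3 = c("t", "b", "p"))

# Margin 2: the function sees one column at a time
by_col <- apply(df[, 2:3], 2, function(x) x %in% W)

# Margin 1: the function sees one row at a time; each row's result
# becomes a column, so transpose to get back to 3 rows x 2 columns
by_row <- t(apply(df[, 2:3], 1, function(x) x %in% W))

dim(by_col)            # 3 2
dim(by_row)            # 3 2
all(by_col == by_row)  # TRUE
```

For an element-wise test like %in% both margins give the same answer, so margin 2 is simply the more direct choice here.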
CodePudding user response:
lapply is well suited to operations like this, because a data frame is a list of columns and lapply works on one column at a time. apply is designed for matrices; it first converts the data frame to a matrix, which makes it slow on data frames.
The %in% operator is actually the function `%in%`() (try help(`%in%`)), so we can pass its name to lapply and don't need an anonymous function (the kind written as function(x) ...).
df[2:3] <- lapply(df[2:3], `%in%`, W)
df
# col1 col2 col3
# 1 1 TRUE FALSE
# 2 2 TRUE TRUE
# 3 3 TRUE FALSE
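As a quick check of the point above, the backtick form can be called like any other function, and lapply passes W along as its second argument. A minimal sketch, reusing W and df from the question (the names op_form, fn_form, short and long are only for illustration):

```r
W <- c("a", "b", "c")
df <- data.frame(col1 = c(1, 2, 3),
                 col2 = c("a", "b", "c"),
                 col3 = c("t", "b", "p"))

# Operator form and function form give the same result
op_form <- df$col3 %in% W
fn_form <- `%in%`(df$col3, W)
identical(op_form, fn_form)  # TRUE

# lapply(df[2:3], `%in%`, W) calls `%in%`(column, W) for each column,
# which is what the anonymous-function version spells out
short <- lapply(df[2:3], `%in%`, W)
long  <- lapply(df[2:3], function(x) x %in% W)
identical(short, long)  # TRUE
```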
CodePudding user response:
Here is a variation of the given solutions:
library(dplyr)
df %>%
mutate(across(-col1, ~. %in% W))
col1 col2 col3
1 1 TRUE FALSE
2 2 TRUE TRUE
3 3 TRUE FALSE
CodePudding user response:
Try as.matrix
> df[-1] <- as.matrix(df[-1]) %in% W
> df
col1 col2 col3
1 1 TRUE FALSE
2 2 TRUE TRUE
3 3 TRUE FALSE
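One detail worth knowing about this approach: %in% returns a plain logical vector, not a matrix, so the assignment relies on the data frame being refilled column by column. The sketch below reuses W and df from the question and makes that step explicit. The intermediate name flat is only for illustration, and the comment on how the assignment refills the columns reflects my understanding of data frame assignment rather than anything stated in the answer.

```r
W <- c("a", "b", "c")
df <- data.frame(col1 = c(1, 2, 3),
                 col2 = c("a", "b", "c"),
                 col3 = c("t", "b", "p"))

# as.matrix() gives a 3 x 2 character matrix; %in% drops the dimensions
flat <- as.matrix(df[-1]) %in% W
is.null(dim(flat))  # TRUE
flat                # TRUE TRUE TRUE FALSE TRUE FALSE (col2 first, then col3)

# Rebuilding the shape by hand shows how the values map back to columns
matrix(flat, nrow = nrow(df), dimnames = list(NULL, names(df)[-1]))

# The assignment fills col2 and then col3 from the flat vector
df[-1] <- flat
```

Because the mapping depends on column order, the lapply version is probably easier to read if the selected columns ever change.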