I am currently researching online forums. I have a database with thousands of posts and want to create a binary variable for each post (each post is one observation in my dataset) that flags whether a certain word is mentioned.
I want to see when posters talk about being lonely, so I came up with the following code, but I keep getting an error when I use ignore_case = TRUE.
library(dplyr)
library(stringr)
dataset <- dataset %>%
mutate(loneliness = ifelse(str_detect(text,"loneliness|blackpilled|lonely"), 1, 0, ignore_case = TRUE))
I have also tried:
mutate(loneliness = ifelse(
str_detect(dataset$text, regex("loneliness|blackpilled|black pill|lonely", ignore_case = TRUE))))
Using that I get this error: argument "no" is missing, with no default.
What am I missing in my code that it is not working?
CodePudding user response:
You placed ignore_case inside base R's ifelse(), which has no such argument. Using dplyr and stringr, the following works:
Data <- data.frame(text = c('I am lonely','I am happy'))
library(tidyverse)
Data |>
  mutate(
    loneliness = if_else(
      condition = str_detect(text, pattern = "loneliness|blackpilled|lonely"),
      1L, 0L
    )
  )
#>          text loneliness
#> 1 I am lonely          1
#> 2  I am happy          0
Created on 2022-11-20 with reprex v2.0.2
Kind regards
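Note that the pattern above is matched case-sensitively, so a post such as "I am LONELY" would still get 0. Below is a minimal sketch of one way to see this and to switch on case-insensitive matching with the inline (?i) flag, which the ICU regex engine behind stringr supports. The posts data frame and the column names strict and relaxed are made up for illustration and are not from the original dataset.

```r
library(dplyr)
library(stringr)

# Hypothetical example data, not from the original dataset
posts <- data.frame(text = c("I am lonely", "I am LONELY", "I am happy"))

pattern <- "loneliness|blackpilled|lonely"

result <- posts %>%
  mutate(
    # Case-sensitive match: misses "LONELY"
    strict = if_else(str_detect(text, pattern), 1L, 0L),
    # (?i) in front of a non-capturing group makes every alternative case-insensitive
    relaxed = if_else(str_detect(text, paste0("(?i)(?:", pattern, ")")), 1L, 0L)
  )

print(result)
```

The inline flag keeps the pattern a plain string, which can be handy if the pattern is stored in a config file or built with paste0().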
CodePudding user response:
Besides str_detect() not having an ignore_case argument of its own, the argument is in the wrong position: it sits outside str_detect() but inside ifelse().
Wrapping the pattern in regex() should work though:
dataset %>%
  mutate(loneliness = ifelse(
    str_detect(text,
               regex("loneliness|blackpilled|lonely", ignore_case = TRUE)
    ), 1, 0))
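To make the regex() approach easy to try, here is a small self-contained sketch. It assumes a toy dataset with made-up posts (the real data is not shown in the question), adds the "black pill" variant from the question's second attempt, and uses as.integer() instead of ifelse(), since str_detect() already returns TRUE/FALSE. The name lonely_pattern is hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical example data standing in for the real `dataset`
dataset <- data.frame(
  text = c("Feeling LONELY tonight", "Took the Black Pill", "Great day", NA)
)

# regex() carries the ignore_case option into str_detect()
lonely_pattern <- regex(
  "loneliness|blackpilled|black pill|lonely",
  ignore_case = TRUE
)

dataset <- dataset %>%
  # TRUE/FALSE becomes 1/0; a missing text stays NA
  mutate(loneliness = as.integer(str_detect(text, lonely_pattern)))

print(dataset)
```

Posts with missing text end up as NA rather than 0 here; if 0 is preferred, coalesce(loneliness, 0L) from dplyr is one option.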