str_detect, case sensitivity, and mutating a binary variable

Time:11-21

I am currently doing some research on online forums. I have a dataset with thousands of posts, where each post is one observation, and I want to create a binary variable that flags a post whenever it mentions a certain word.

I want to see when posters talk about being lonely, so I have come up with the following code, but I keep getting an error when I use ignore_case = T.

library(dplyr)
library(stringr)

dataset <- dataset %>% 
    mutate(loneliness = ifelse(str_detect(text,"loneliness|blackpilled|lonely"), 1, 0, ignore_case = TRUE))

I have also tried:

mutate(loneliness = ifelse(
  str_detect(dataset$text, regex("loneliness|blackpilled|black pill|lonely", ignore_case = TRUE))))

Using that I get this error: argument "no" is missing, with no default.

What am I missing in my code that it is not working?

CodePudding user response:

You passed ignore_case to base R's ifelse(), which has no such argument. With dplyr and stringr, the following works:

Data <- data.frame(text = c('I am lonely','I am happy'))
library(tidyverse)
Data |>
  mutate(
    loneliness = if_else(
      condition = str_detect(text, pattern = "loneliness|blackpilled|lonely"),
      1L, 0L
    )
  )
#>          text loneliness
#> 1 I am lonely          1
#> 2  I am happy          0

Created on 2022-11-20 with reprex v2.0.2
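
Note that the reprex above matches case-sensitively, so a post containing "Lonely" or "LONELY" would not be flagged. Below is a minimal sketch of a case-insensitive variant, assuming the same `text` column; the sample rows are made up for illustration and are not from the asker's data:

```r
library(dplyr)
library(stringr)

# Hypothetical sample data with mixed capitalisation
Data <- data.frame(
  text = c("I am LONELY", "Loneliness is hard", "I am happy")
)

result <- Data |>
  mutate(
    loneliness = if_else(
      # ignore_case belongs to regex(), which wraps the pattern
      str_detect(text, regex("loneliness|blackpilled|lonely", ignore_case = TRUE)),
      1L, 0L
    )
  )
result
```

The key point is that `ignore_case` is an argument of `regex()`, not of `str_detect()` or `if_else()`.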

Kind regards

CodePudding user response:

Besides str_detect() not having an ignore_case argument of its own, the argument is in the wrong place: it sits outside str_detect() but inside ifelse().

Wrapping the pattern in regex() should work, though:

dataset %>% 
  mutate(loneliness = ifelse(
    str_detect(text, 
      regex("loneliness|blackpilled|lonely", ignore_case = TRUE)
    ), 1, 0))
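
As for the second error in the question (argument "no" is missing): ifelse(test, yes, no) needs all three arguments, and that attempt supplied only the test. Since str_detect() already returns a logical vector, one option is to skip ifelse() entirely and convert with as.integer(). A rough sketch, assuming a `text` column as in the question; the `posts` data frame and its rows are hypothetical:

```r
library(dplyr)
library(stringr)

# Hypothetical sample data, including a missing post
posts <- data.frame(
  text = c("Feeling Lonely tonight", "Great day out", NA)
)

# Build the case-insensitive pattern once and reuse it
pattern <- regex("loneliness|blackpilled|black pill|lonely", ignore_case = TRUE)

posts <- posts %>%
  # TRUE/FALSE/NA become 1/0/NA
  mutate(loneliness = as.integer(str_detect(text, pattern)))
posts
```

With this approach, missing text stays NA rather than being silently coded as 0, which may or may not be what the analysis needs.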