I am trying to work with a column in a dataset whose strings are two or more characters long.
For instance:
Col1. Col2. Col3. Col4.
XX XX XX XX
XX XX XXGH XX
XX XX XXGHO XX
XX XX XX XX
...and so on.
I'd like to derive Col5 from Col3 by turning its strings into 1s and 0s: 1 where the string is longer than two characters, and 0 where it is exactly two characters long.
I am trying to use ifelse, but I am not really getting anywhere.
It should end up looking something like this:
Col1. Col2. Col3. Col4. Col5
XX XX XX XX 0
XX XX XXGH XX 1
XX XX XXGHO XX 1
XX XX XX XX 0
where Col5 is the 1 and 0 equivalent of Col3.
CodePudding user response:
With base R you can try this:
dat$Col5. <- as.numeric( nchar( as.character(dat$Col3.) ) > 2 )
Col1. Col2. Col3. Col4. Col5.
1 XX XX XX XX 0
2 XX XX XXGH XX 1
3 XX XX XXGHO XX 1
4 XX XX XX XX 0
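To run the line above end to end, here is a minimal sketch. The data frame `dat` is rebuilt from the question's sample rows, which is an assumption about how the real data is stored; the column names carry trailing dots only to match the printed output.

```r
# Rebuild the question's sample data (illustrative; the real data may differ).
dat <- data.frame(
  Col1. = c("XX", "XX", "XX", "XX"),
  Col2. = c("XX", "XX", "XX", "XX"),
  Col3. = c("XX", "XXGH", "XXGHO", "XX"),
  Col4. = c("XX", "XX", "XX", "XX")
)

# nchar() counts characters; as.character() guards against Col3. being a factor,
# since nchar() refuses factor input.
dat$Col5. <- as.numeric(nchar(as.character(dat$Col3.)) > 2)

print(dat)
```

The `as.character()` call is harmless when the column is already character, so it is reasonable to keep it if you are unsure how the data was read in.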
CodePudding user response:
The function you're looking for is str_length(), from the stringr package (loaded as part of the tidyverse).
library(tidyverse)
foo <- tibble(
  Col3 = c('XX', 'XX', 'XX', 'XXGH', 'XXGHO', 'XX')
)
foo %>%
  mutate(Col5 = ifelse(str_length(Col3) > 2, 1, 0))
CodePudding user response:
For simple comparisons, ifelse isn't needed.
Here we check whether the number of characters is greater than 2, then multiply the TRUE/FALSE result by 1 to get a numeric result.
foo$Col5 <- (nchar(foo$Col3) > 2) * 1
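A small sketch of why this works, assuming `foo` is a plain data frame holding the sample values from the tidyverse answer above. The comparison yields a logical vector, and arithmetic coerces TRUE to 1 and FALSE to 0. The `as.integer()` variant and the `Col5_int` column name are added here for illustration only; they are not part of the answers above.

```r
# Sample values borrowed from the tidyverse answer above (illustrative only).
# stringsAsFactors = FALSE keeps Col3 as character on older R versions.
foo <- data.frame(
  Col3 = c("XX", "XX", "XX", "XXGH", "XXGHO", "XX"),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where the string has more than 2 characters.
is_long <- nchar(foo$Col3) > 2

# Arithmetic coerces TRUE/FALSE to 1/0 (stored as doubles).
foo$Col5 <- is_long * 1

# Hypothetical extra column: same values, stored as integers instead.
foo$Col5_int <- as.integer(is_long)

print(foo)
```

Either form gives the same 1/0 values; the choice mostly comes down to whether you want the column stored as double or integer.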