Conditional recoding of string values to numeric

Time:08-27

I am trying to convert the values of the 'age' column so that the value increases by 1 if the number of months is more than 6, e.g. '56 years 8 months' becomes '57', and stays as it is if the number of months is <= 6, e.g. '43 years 5 months' becomes '43'.

This is the data I have:

> age=data.frame(age=c(12, '43 years 5 months', 34, '56 years 8 months'))

> age
                age
1                12
2 43 years 5 months
3                34
4 56 years 8 months

And this is what I want:

> age
  age
1  12
2  43
3  34
4  57

I am sorry about the poor phrasing of the question, but I hope the description makes it clearer.

Thanks in advance.

CodePudding user response:

We could replace 'years' and 'months' with arithmetic multipliers, evaluate the resulting expression, and round the result:

library(dplyr)
library(stringr)
library(purrr)
age %>% 
  mutate(age = round(map_dbl(str_replace_all(str_replace(age,
    "(\\d+\\s+months)", "+\\1"), c(years = "*1", months = "*1/12")),
      ~ eval(parse(text = .x)))))

-output

  age
1  12
2  43
3  34
4  57
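
To see why this works, it can help to look at the intermediate strings before they are evaluated. The following is a small sketch that applies the same replacements to a plain character vector; the variable names `x`, `expr`, and `vals` are only for illustration.

```r
library(stringr)

x <- c("12", "43 years 5 months", "34", "56 years 8 months")

# Insert a "+" before the months part, then turn the unit words into multipliers.
expr <- str_replace_all(
  str_replace(x, "(\\d+\\s+months)", "+\\1"),
  c(years = "*1", months = "*1/12")
)
expr
# "12"  "43 *1 +5 *1/12"  "34"  "56 *1 +8 *1/12"

# Each string is now a valid R expression giving the age in decimal years.
vals <- vapply(expr, function(e) eval(parse(text = e)), numeric(1),
               USE.NAMES = FALSE)
round(vals)
# 12 43 34 57
```

Because the strings are evaluated as code, this approach is probably best kept to trusted, well-formed input.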

Or we may split the years and months into separate columns, parse the digits, and then combine them:

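library(tidyr)
age %>% 
  # split at the whitespace that precedes the months digits
  separate(age, into = c("years", "months"), "\\s+(?=\\d+)", fill = "right") %>%
  mutate(across(c(years, months), readr::parse_number), 
        months = months * 1/12) %>% 
  # cur_data() is deprecated as of dplyr 1.1.0; pick(everything()) is the newer form
  transmute(age = round(rowSums(cur_data(), na.rm = TRUE)))

-output
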
  age
1  12
2  43
3  34
4  57

CodePudding user response:

We can use parse_number from the readr package with an if_else statement:

library(readr)
library(dplyr)
library(stringr)

age %>% 
  mutate(age2 = parse_number(age),                           # years (first number)
         age = parse_number(str_extract(age, '\\d+ month')), # months, NA if absent
         age2 = if_else(age > 6 & !is.na(age), age2 + 1, age2)) %>% 
  select(age = age2)


  age
1  12
2  43
3  34
4  57

CodePudding user response:

We could also split and use an if_else:

library(dplyr)
library(stringr)

age |> 
  mutate(years = as.numeric(str_extract(age, "\\d+")),
         months = as.numeric(str_extract(age, "\\d+(?= months)")),
         new_age = if_else(months > 6, years + 1, years, missing = years))

Output:

                age years months new_age
1                12    12     NA      12
2 43 years 5 months    43      5      43
3                34    34     NA      34
4 56 years 8 months    56      8      57
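
For comparison, here is a minimal base R sketch that applies the threshold explicitly rather than rounding. The function name `recode_age` is hypothetical, and the sketch assumes every value starts with the years and optionally contains an "N months" part. One reason to consider an explicit comparison: R's round() rounds exact halves to the nearest even number, so a value such as '43 years 6 months' (43.5) would round to 44, whereas the question asks for it to stay at 43.

```r
# Hypothetical helper: recode "Y years M months" strings to whole years,
# adding 1 only when the months part is strictly greater than 6.
recode_age <- function(x) {
  x <- as.character(x)
  # The leading run of digits is taken as the years.
  years <- as.numeric(sub("^\\s*(\\d+).*$", "\\1", x, perl = TRUE))
  # Digits immediately before "month(s)"; 0 when there is no months part.
  has_months <- grepl("\\d+\\s*months?", x, perl = TRUE)
  months <- rep(0, length(x))
  months[has_months] <- as.numeric(
    sub("^.*?(\\d+)\\s*months?.*$", "\\1", x[has_months], perl = TRUE)
  )
  years + (months > 6)
}

age <- data.frame(age = c(12, "43 years 5 months", 34, "56 years 8 months"))
age$age <- recode_age(age$age)
age$age
# 12 43 34 57

recode_age("43 years 6 months")
# 43
```

The comparison `months > 6` yields TRUE or FALSE, which R treats as 1 or 0 when added to `years`, so no separate if/else branch is needed.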