I am trying to convert the values of the 'age' column so that the age increases by 1 if the number of months is more than 6, e.g. '56 years 8 months' becomes '57', and stays as it is if the number of months is <= 6, e.g. '43 years 5 months' becomes '43'.
This is the data I have:
> age=data.frame(age=c(12, '43 years 5 months', 34, '56 years 8 months'))
> age
                age
1                12
2 43 years 5 months
3                34
4 56 years 8 months
And this is what I want:
> age
age
1 12
2 43
3 34
4 57
I am sorry about the poor phrasing of the question, but I hope the description makes it clearer.
Thanks in advance.
CodePudding user response:
We could round after replacing years and months with multiplication factors and evaluating the resulting expression:
library(dplyr)
library(stringr)
library(purrr)
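age %>%
  # insert a "+" before the months part, then turn the units into factors,
  # e.g. "43 years 5 months" becomes "43 *1 +5 *1/12"
  mutate(age = round(map_dbl(str_replace_all(str_replace(age,
    "(\\d+\\s+months)", "+\\1"), c(years = "*1", months = "*1/12")),
    ~ eval(parse(text = .x)))))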
-output
age
1 12
2 43
3 34
4 57
Or we may split the string into years and months, parse the digits, and then combine them:
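library(tidyr)
age %>%
  # split at the whitespace that precedes the months digits
  separate(age, into = c("years", "months"), "\\s+(?=\\d+)", fill = "right") %>%
  mutate(across(c(years, months), readr::parse_number),
         months = months * 1/12) %>%
  # cur_data() is deprecated as of dplyr 1.1.0; pick(everything()) is its replacement
  transmute(age = round(rowSums(cur_data(), na.rm = TRUE)))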
age
1 12
2 43
3 34
4 57
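One caveat on both rounding variants, offered as a hedged aside: R's round() rounds an exact half to the even digit, so a row with exactly 6 months would keep its age for an even number of years but go up by one for an odd number. The question's data has no such row, so the values below are made up purely to illustrate the edge case.

```r
# Made-up edge cases: exactly 6 months on an even and an odd year count
half_years <- c(56, 57) + 6 / 12   # 56.5 and 57.5

# round() goes to the even digit on an exact half
rounded <- round(half_years)
print(rounded)

# An explicit comparison keeps "6 months or fewer" at the same age
explicit <- c(56, 57) + (c(6, 6) > 6)
print(explicit)
```

The first print gives 56 58 and the second gives 56 57, so if rows with exactly 6 months can occur, an explicit months > 6 test is probably the safer choice.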
CodePudding user response:
We can use parse_number from the readr package with an if_else statement:
library(readr)
library(dplyr)
library(stringr)
age %>%
  mutate(age2 = parse_number(age),                            # leading number = years
         age = parse_number(str_extract(age, '\\d+ month')),  # months, NA if absent
         age2 = if_else(age > 6 & !is.na(age), age2 + 1, age2)) %>%
  select(age = age2)
age
1 12
2 43
3 34
4 57
CodePudding user response:
We could also extract the years and months separately and use if_else:
library(dplyr)
library(stringr)
age |>
  mutate(years = as.numeric(str_extract(age, "\\d+")),
         months = as.numeric(str_extract(age, "\\d+(?= months)")),
         # missing = years covers rows that have no months part
         new_age = if_else(months > 6, years + 1, years, missing = years))
Output:
                age years months new_age
1                12    12     NA      12
2 43 years 5 months    43      5      43
3                34    34     NA      34
4 56 years 8 months    56      8      57
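For anyone who would rather avoid extra packages, the same extract-and-compare idea can be sketched in base R. This is only a minimal sketch under the assumption that every entry starts with the number of years and that months, when present, appear as digits followed by " month"; the names x, hit and result are hypothetical helpers, not part of the question.

```r
# Data from the question; c() coerces the mixed values to character
age <- data.frame(age = c(12, "43 years 5 months", 34, "56 years 8 months"))
x <- as.character(age$age)

# Years: drop everything from the first non-digit onward
years <- as.numeric(sub("\\D.*$", "", x, perl = TRUE))

# Months: digits directly before " month"; rows without a match stay at 0
months <- rep(0, length(x))
hit <- regexpr("\\d+(?= month)", x, perl = TRUE)
months[hit > 0] <- as.numeric(regmatches(x, hit))

# Add one year only when there are more than 6 months
result <- data.frame(age = years + (months > 6))
print(result)
```

Because the rule is written as months > 6 rather than as a rounding step, a row with exactly 6 months keeps its age.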