I have a data frame (df) and I want to do a simple calculation, test * factor. How can I handle the comparison signs in front of the values?
df<-structure(list(test = structure(c(3L, 2L, 5L, 4L, 1L), .Label = c("<=1.5",
"<5", ">=1", ">7.8", "6"), class = "factor"), factor = c(2, 4,
6, 9, 8)), row.names = c(NA, -5L), class = "data.frame")
CodePudding user response:
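Another possible solution, shown here for integer values in test:
library(tidyverse)

df %>%
  # keep the sign by removing the digits, then append number * factor
  mutate(outcome = str_c(str_remove(test, "\\d+"),
                         factor * as.numeric(str_extract(test, "\\d+"))))
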
#>   test factor outcome
#> 1  >=1      2     >=2
#> 2   <5      4     <20
#> 3    6      6      36
#> 4  >78      9    >702
#> 5 <=15      8   <=120

Or with parse_number:
library(tidyverse)

df %>%
  # parse_number() needs a character vector, so convert the factor first
  mutate(outcome = str_c(str_remove(test, "\\d+"),
                         factor * parse_number(as.character(test))))

#>   test factor outcome
#> 1  >=1      2     >=2
#> 2   <5      4     <20
#> 3    6      6      36
#> 4  >78      9    >702
#> 5 <=15      8   <=120

With decimals:
library(tidyverse)

df %>%
  # the optional group (\\.\\d+)? also removes a decimal part such as ".8"
  mutate(outcome = str_c(str_remove(test, "\\d+(\\.\\d+)?"),
                         factor * parse_number(as.character(test))))

#>    test factor outcome
#> 1   >=1      2     >=2
#> 2    <5      4     <20
#> 3     6      6      36
#> 4  >7.8      9   >70.2
#> 5 <=1.5      8    <=12
CodePudding user response:
We could use str_replace: in the pattern, match one or more digits (\\d+) and replace the match with a function that multiplies it by the factor column after converting it to numeric.
library(stringr)

df$outcome <- str_replace(df$test, "\\d+", function(x) as.numeric(x) * df$factor)

-output
df$outcome
[1] ">=2" "<20" "36" ">702" "<=120"
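The pattern above matches only a run of digits, so a decimal value such as >7.8 would not be multiplied as a whole. As a rough sketch of the same replace-the-match idea in base R (swapping str_replace for regexpr plus `regmatches<-`, and assuming test is held as character rather than as a factor), a decimal-aware version might look like this:

```r
# Example data mirroring the question; test is stored as character here (an assumption)
df <- data.frame(
  test   = c(">=1", "<5", "6", ">7.8", "<=1.5"),
  factor = c(2, 4, 6, 9, 8),
  stringsAsFactors = FALSE
)

outcome <- df$test

# Locate the first number in each string, with an optional decimal part
m <- regexpr("[0-9]+(\\.[0-9]+)?", outcome)
hit <- as.vector(m) > -1  # TRUE for rows where a number was found

# Overwrite each matched number with number * factor; the sign is left untouched
regmatches(outcome, m) <- as.character(as.numeric(regmatches(outcome, m)) * df$factor[hit])

df$outcome <- outcome
df$outcome
```

Rows without any number are left unchanged, because only the matched positions are replaced.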
CodePudding user response:
One more. Here we use parse_number to multiply and then paste the result after removing the numbers from the test column:
library(dplyr)
library(readr)
library(stringr)  # needed for str_replace

df %>%
  mutate(outcome = paste0(str_replace(test, '\\d+', ''),
                          factor * parse_number(as.character(test))))

  test factor outcome
1  >=1      2     >=2
2   <5      4     <20
3    6      6      36
4  >78      9    >702
5 <=15      8   <=120
CodePudding user response:
We can try
transform(
df,
outcome = paste0(gsub("\\d","",test),as.numeric(gsub("\\D", "", test)) * factor)
)
which gives
  test factor outcome
1  >=1      2     >=2
2   <5      4     <20
3    6      6      36
4  >78      9    >702
5 <=15      8   <=120
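Note that gsub("\\D", "", test) also drops a decimal point, so 7.8 would be read as 78. A minimal base R sketch that keeps decimals, assuming every value has the form "optional sign, then one number" (the names sign_part and num_part are only illustrative):

```r
# Example data mirroring the question; test is stored as character here (an assumption)
df <- data.frame(
  test   = c(">=1", "<5", "6", ">7.8", "<=1.5"),
  factor = c(2, 4, 6, 9, 8),
  stringsAsFactors = FALSE
)

x <- as.character(df$test)

# Everything before the trailing number is the sign
sign_part <- sub("[0-9]+(\\.[0-9]+)?$", "", x)
# Everything after the leading non-digits is the number
num_part <- as.numeric(sub("^[^0-9]*", "", x))

df$outcome <- paste0(sign_part, num_part * df$factor)
df
```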
CodePudding user response:
And another solution using separate:
library(tidyr)
library(stringr)
library(dplyr)

df %>%
  # split at the position where the sign ends and the digits begin
  separate(test,
           into = c("temp1", "temp2"),
           sep = "(?<=\\D|^)(?=\\d+)",
           remove = FALSE) %>%
  mutate(
    temp3 = as.numeric(temp2) * factor,
    outcome = str_c(temp1, temp3)
  ) %>%
  select(-matches("temp"))

  test factor outcome
1  >=1      2     >=2
2   <5      4     <20
3    6      6      36
4  >78      9    >702
5 <=15      8   <=120
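For comparison, the same split-into-two-columns idea can be sketched in base R with utils::strcapture, which returns one column per capture group. This is only an illustration, not part of the answer above; the column names sign and value are hypothetical, and the pattern assumes each entry is an optional sign followed by a single number:

```r
# Example data mirroring the question; test is stored as character here (an assumption)
df <- data.frame(
  test   = c(">=1", "<5", "6", ">7.8", "<=1.5"),
  factor = c(2, 4, 6, 9, 8),
  stringsAsFactors = FALSE
)

# Group 1: leading non-digits (the sign); group 2: the number, decimals allowed
parts <- strcapture(
  "^([^0-9]*)([0-9]+\\.?[0-9]*)$",
  as.character(df$test),
  proto = data.frame(sign = character(), value = numeric(),
                     stringsAsFactors = FALSE)
)

# Multiply the numeric part and glue the sign back on
df$outcome <- paste0(parts$sign, parts$value * df$factor)
df
```

Entries that do not match the pattern come back as NA in parts, which makes malformed values easy to spot.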