I have a data frame where the last column looks like so:
Signed 1-yr/$2.5M deal with Pacers
Signed 4-yr/$113M deal with Celtics
Signed 3-yr/$30M deal with Pacers
...
These are all strings. I am trying to get the number before -yr and the number before the M. So, for the first row, I am trying to get 2.5 and 1.
I then want to divide them, like so: 2.5/1.
The resulting column would look like this:
2.5
28.25
10
I tried str_extract_all(df$col, "\\d"), but this only gets the numbers into a list. I do not know how to get from there to my goal.
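For reference, a minimal sketch of why that attempt falls short: "\\d" matches one digit at a time, so 2.5 comes back as "2" and "5". The vector name x and the pattern "[\\d.]+" are illustrative assumptions, and the pattern relies on the rest of the string containing no other digits or periods.

```r
library(stringr)

# Hypothetical sample vector standing in for df$col
x <- c("Signed 1-yr/$2.5M deal with Pacers",
       "Signed 4-yr/$113M deal with Celtics")

# One match per digit: "1" "2" "5" for the first string
single_digits <- str_extract_all(x, "\\d")

# Runs of digits and periods keep each number whole
# (assumes no other digits or periods appear in the string)
nums <- str_extract_all(x, "[\\d.]+", simplify = TRUE)

# Column 1 holds the years, column 2 the value in millions
per_year <- as.numeric(nums[, 2]) / as.numeric(nums[, 1])
per_year
```

With simplify = TRUE the matches come back as a character matrix rather than a list, which makes the column-wise division straightforward.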
CodePudding user response:
Another solution using the stringr package:
df <- data.frame(a = c("Signed 1-yr/$2.5M deal with Pacers",
"Signed 4-yr/$113M deal with Celtics",
"Signed 3-yr/$30M deal with Pacers"))
library(tidyverse)
df |>
  # "+" matches one or more characters, so multi-digit values are captured whole
  mutate(diff = as.numeric(str_remove(str_extract(a, "[0-9,.]+M"), "M")) /
           as.numeric(str_remove(str_extract(a, "[0-9]+-yr"), "-yr")))
Output:
a diff
1 Signed 1-yr/$2.5M deal with Pacers 2.50
2 Signed 4-yr/$113M deal with Celtics 28.25
3 Signed 3-yr/$30M deal with Pacers 10.00
CodePudding user response:
A possible solution:
library(tidyverse)
df %>%
separate(V1, into = c("V2", "V3"), sep = "/", remove = FALSE) %>%
mutate(result = parse_number(V3) / parse_number(V2)) %>%
select(V1, result)
#> V1 result
#> 1 Signed 1-yr/$2.5M deal with Pacers 2.50
#> 2 Signed 4-yr/$113M deal with Celtics 28.25
#> 3 Signed 3-yr/$30M deal with Pacers 10.00
CodePudding user response:
df %>%
separate(col, c('length', 'value'), sep="/", remove=FALSE) %>%
mutate(length = str_extract(length, "\\d+"),
value = str_extract(value, "[[:digit:]]+\\.*[[:digit:]]*")) %>%
mutate(value_by_year = as.numeric(value)/as.numeric(length))
# A tibble: 3 x 4
col length value value_by_year
<chr> <chr> <chr> <dbl>
1 Signed 1-yr/$2.5M deal with Pacers 1 2.5 2.5
2 Signed 4-yr/$113M deal with Celtics 4 113 28.2
3 Signed 3-yr/$30M deal with Pacers 3 30 10
CodePudding user response:
You could use extract() from tidyr:
library(tidyverse)
df %>%
extract(col, c("yr", "M"), "([\\d.]+)\\D+([\\d.]+)", remove = FALSE, convert = TRUE) %>%
mutate(res = M / yr)
# # A tibble: 3 × 4
# col yr M res
# <chr> <int> <dbl> <dbl>
# 1 Signed 1-yr/$2.5M deal with Pacers 1 2.5 2.5
# 2 Signed 4-yr/$113M deal with Celtics 4 113 28.2
# 3 Signed 3-yr/$30M deal with Pacers 3 30 10
Remember to set convert = TRUE to convert the extracted columns from character to numeric types.
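As a quick check of what convert = TRUE changes, the sketch below runs extract() with and without it and compares the column classes. The names demo, pattern, no_convert and with_convert are illustrative, not part of the answer above.

```r
library(tidyr)

# Illustrative two-row sample in the same shape as the question's data
demo <- tibble::tibble(col = c("Signed 1-yr/$2.5M deal with Pacers",
                               "Signed 4-yr/$113M deal with Celtics"))
pattern <- "([\\d.]+)\\D+([\\d.]+)"

# Without convert, the captured groups stay as character columns
no_convert <- extract(demo, col, c("yr", "M"), pattern, remove = FALSE)

# With convert = TRUE, each new column is converted to a suitable type
with_convert <- extract(demo, col, c("yr", "M"), pattern,
                        remove = FALSE, convert = TRUE)

classes_before <- sapply(no_convert[c("yr", "M")], class)
classes_after <- sapply(with_convert[c("yr", "M")], class)
classes_before
classes_after
```

Without the conversion, M / yr would fail because both columns are still strings, so the alternative would be wrapping each in as.numeric() inside mutate().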
Data
df <- tibble(col = c("Signed 1-yr/$2.5M deal with Pacers",
"Signed 4-yr/$113M deal with Celtics",
"Signed 3-yr/$30M deal with Pacers"))