sum = c("40 Da 12 Da de primes", "40 Da 12 Da de primes", "50 Da", "50 Da", "50 Da")
How do I separate such a variable into the following columns:
Price | Bonus |
---|---|
40 | 12 |
40 | 12 |
50 | 0 |
50 | 0 |
50 | 0 |
CodePudding user response:
Suppose the character vector is stored in a data.frame called df:
library(stringr)
library(dplyr)
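df %>%
  mutate(
    # leading digits followed by " Da"; "0" if nothing matches
    Price = as.numeric(coalesce(str_extract(sum, "^\\d+(?=\\sDa)"), "0")),
    # digits followed by " Da de primes"; "0" if there is no bonus
    Bonus = as.numeric(coalesce(str_extract(sum, "\\d+(?=\\sDa de primes)"), "0"))
  )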
This returns
                    sum Price Bonus
1 40 Da 12 Da de primes    40    12
2 40 Da 12 Da de primes    40    12
3                 50 Da    50     0
4                 50 Da    50     0
5                 50 Da    50     0
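To see what the two lookahead patterns do on their own, here is a minimal sketch on a two-element vector. It assumes each amount is one or more digits (`\\d+`) and that the bonus is always followed by the literal text " Da de primes"; the names `x`, `price` and `bonus` are only for illustration.

```r
library(stringr)

x <- c("40 Da 12 Da de primes", "50 Da")

# Digits at the start of the string that are followed by " Da"
price <- str_extract(x, "^\\d+(?=\\sDa)")

# Digits followed by " Da de primes"; NA when the string has no bonus part
bonus <- str_extract(x, "\\d+(?=\\sDa de primes)")

price  # "40" "50"
bonus  # "12" NA
```

The `NA` for strings without a bonus is what `coalesce(..., "0")` replaces before the conversion to numeric.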
CodePudding user response:
We could use extract from tidyr:
library(dplyr)
library(tidyr)
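tibble(sum) %>%
  # first group: leading digits; second (optional) group: digits after the first non-digit run
  extract(sum, into = c("Price", "Bonus"),
          "^(\\d+)\\D+(\\d+)?.*", convert = TRUE) %>%
  # rows without a bonus get NA from extract; set those to 0
  mutate(Bonus = replace_na(Bonus, 0))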
# A tibble: 5 × 2
  Price Bonus
  <int> <dbl>
1    40    12
2    40    12
3    50     0
4    50     0
5    50     0
data
sum = c("40 Da 12 Da de primes", "40 Da 12 Da de primes", "50 Da", "50 Da", "50 Da")