In the data frame, absolute values and percentages are combined in one column, and I want to split them into two separate columns:
df <- data.frame(Sales = c("74(2.08%)",
                           "71(2.00%)",
                           "58(1.63%)",
                           "42(1.18%)"))
Sales
1 74(2.08%)
2 71(2.00%)
3 58(1.63%)
4 42(1.18%)
Expected output:
Sales Share
1 74 2.08
2 71 2.00
3 58 1.63
4 42 1.18
CodePudding user response:
In base R:
read.table(text=gsub("[()%]", ' ', df$Sales), col.names = c("Sales", "Share"))
Sales Share
1 74 2.08
2 71 2.00
3 58 1.63
4 42 1.18
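To see why this one-liner works, it can help to look at the intermediate step: gsub() replaces each "(", ")" and "%" with a space, so every row becomes two whitespace-separated numbers, which read.table() then parses as two columns. A minimal sketch, re-creating df from the question (the names cleaned and res are only for illustration):

```r
# Re-create the example data from the question
df <- data.frame(Sales = c("74(2.08%)", "71(2.00%)", "58(1.63%)", "42(1.18%)"))

# Step 1: turn every "(", ")" and "%" into a space,
# e.g. "74(2.08%)" becomes "74 2.08" followed by trailing spaces
cleaned <- gsub("[()%]", " ", df$Sales)

# Step 2: parse the cleaned strings as whitespace-delimited text;
# read.table() ignores the trailing spaces and converts both columns to numbers
res <- read.table(text = cleaned, col.names = c("Sales", "Share"))
str(res)
```

Because read.table() does the type conversion, no explicit as.numeric() call should be needed here.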
Or with tidyr::separate:
library(tidyr)
df %>%
  separate(Sales, c("Sales", "Share"), sep = '[()%]', extra = 'drop', convert = TRUE)
Sales Share
1 74 2.08
2 71 2.00
3 58 1.63
4 42 1.18
CodePudding user response:
Using tidyr::extract, you could split your column into separate columns with a regex that has one capture group per new column:
library(tidyr)
df |>
  extract(Sales, into = c("Sales", "Share"), regex = "^(\\d+)\\((\\d+\\.\\d+)\\%\\)$", convert = TRUE)
#> Sales Share
#> 1 74 2.08
#> 2 71 2.00
#> 3 58 1.63
#> 4 42 1.18
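The capture-group idea also carries over to base R if you would rather avoid a package dependency. The sketch below uses utils::strcapture() (available in R 3.4.0 and later), which is not part of the answer above. It assumes every value has the form digits, then a decimal number with a percent sign in parentheses; rows that do not match would come back as NA. The names pattern and res are only for illustration.

```r
# Re-create the example data from the question
df <- data.frame(Sales = c("74(2.08%)", "71(2.00%)", "58(1.63%)", "42(1.18%)"))

# One capture group for the count, one for the percentage
pattern <- "^(\\d+)\\((\\d+\\.\\d+)%\\)$"

# proto supplies the column names and target types for the captured groups;
# as.character() guards against Sales being a factor in R versions before 4.0
res <- strcapture(pattern, as.character(df$Sales),
                  proto = data.frame(Sales = integer(), Share = numeric()))
str(res)
```

Compared with the gsub() approach, an anchored pattern like this is stricter: malformed input shows up as NA instead of being silently parsed.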