I have
x <- "1, A | 2, B | 10, C "
x is always formatted this way: | denotes a new row, the first value in each row is variable1, and the second value is variable2.
I would like to convert it to a data.frame
variable1 variable2
1 1 A
2 2 B
3 10 C
I haven't found any package that treats | as a row separator.
How can I convert it to a data.frame?
CodePudding user response:
We may use read.table from base R to read the string into two columns after replacing the | with \n
read.table(text = gsub("|", "\n", x, fixed = TRUE), sep=",",
header = FALSE, col.names = c("variable1", "variable2"), strip.white = TRUE )
-output
variable1 variable2
1 1 A
2 2 B
3 10 C
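A small sketch of why fixed = TRUE is used above: in a regular expression | is the alternation operator, so it has to be matched literally, either with fixed = TRUE or by escaping it as "\\|". The names rows_fixed, rows_escaped and df are only illustrative and not part of the original answer.

```r
x <- "1, A | 2, B | 10, C "

# Two equivalent ways to replace the literal "|" with a newline:
rows_fixed   <- gsub("|", "\n", x, fixed = TRUE)  # treat the pattern as plain text
rows_escaped <- gsub("\\|", "\n", x)              # escape the regex metacharacter

# Both produce the same newline-separated string
identical(rows_fixed, rows_escaped)

# Read the rows; strip.white = TRUE drops the spaces around each field
df <- read.table(text = rows_fixed, sep = ",", header = FALSE,
                 col.names = c("variable1", "variable2"),
                 strip.white = TRUE)
str(df)
```

Checking the result with str() is a quick way to confirm that variable1 was read as a number and not as text.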
Or use fread from data.table
library(data.table)
fread(gsub("|", "\n", x, fixed = TRUE), col.names = c("variable1", "variable2"))
variable1 variable2
1: 1 A
2: 2 B
3: 10 C
Or use the tidyverse: separate_rows to split the string into rows, then separate to create the two columns
library(tidyr)
library(dplyr)
tibble(x = trimws(x)) %>%
separate_rows(x, sep = "\\s*\\|\\s*") %>%
separate(x, into = c("variable1", "variable2"), sep = ",\\s*", convert = TRUE)
# A tibble: 3 × 2
variable1 variable2
<int> <chr>
1 1 A
2 2 B
3 10 C
CodePudding user response:
Here's a way using scan().
x <- "1, A | 2, B | 10, C "
do.call(rbind.data.frame,
strsplit(scan(text = x, what = "A", sep = '|', quiet = TRUE, strip.white = TRUE), ', ')) |>
setNames(c('variable1', 'variable2'))
# variable1 variable2
# 1 1 A
# 2 2 B
# 3 10 C
Note: R version 4.1.2 (2021-11-01); the native pipe |> requires R 4.1.0 or later.
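One thing to watch with a strsplit-based approach: the columns are built from split strings, so variable1 presumably comes back as character and not integer. A minimal base R sketch that converts the types afterwards, assuming the input always has exactly two fields per row (parts and df are illustrative names, and this is a variant, not the answer's own code):

```r
x <- "1, A | 2, B | 10, C "

# Split into rows on the literal "|", trim whitespace, then split each row on ","
rows  <- trimws(strsplit(x, "|", fixed = TRUE)[[1]])
parts <- strsplit(rows, ",\\s*")

# Stack the pieces into a character matrix and turn it into a data.frame
df <- as.data.frame(do.call(rbind, parts))
names(df) <- c("variable1", "variable2")

# Let R guess column types; as.is = TRUE keeps strings as character
df <- type.convert(df, as.is = TRUE)
str(df)
```

This version avoids the |> pipe, so it should also run on R versions older than 4.1.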