In R, I have a string that contains information about three grades. It looks like this:
"First Grade|Third Grade|Second Grade|Third Grade|First Grade"
I would like to convert this into a vector, equivalent to the output of:
c("First Grade","Third Grade","Second Grade","Third Grade","First Grade")
> [1] "First Grade" "Third Grade" "Second Grade" "Third Grade" "First Grade"
Is there a way to do this in R? Thanks.
CodePudding user response:
With stringr, you can use str_split. With simplify = TRUE, the output is a character matrix, and we can use c() to flatten it into a vector. Note that the pattern is a regular expression, so we need to escape the | sign with a double backslash, \\.
library(stringr)
string <- "First Grade|Third Grade|Second Grade|Third Grade|First Grade"
c(str_split(string, "\\|", simplify = TRUE))
[1] "First Grade" "Third Grade" "Second Grade" "Third Grade"
[5] "First Grade"
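As an alternative to escaping, here is a minimal sketch that marks the pattern as a literal string with stringr's fixed() helper. It assumes stringr is installed; the variable name grades is only illustrative. str_split returns a list with one element per input string, so [[1]] extracts the vector.

```r
library(stringr)

string <- "First Grade|Third Grade|Second Grade|Third Grade|First Grade"

# fixed("|") treats the pipe as a literal character, so no escaping is needed.
# str_split() returns a list; [[1]] takes the vector for the single input string.
grades <- str_split(string, fixed("|"))[[1]]
grades
```

Either form should give the same five-element character vector; fixed() mainly saves you from thinking about regex metacharacters.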
CodePudding user response:
1) scan Assuming the input is x, shown in the Note at the end, we can use scan.
. The text= argument is the input, the what= argument tells it to regard the fields as character, the sep= argument gives the separator character and the quiet= argument tells it not to display additional information. No packages are used.
scan(text = x, what = "", sep = "|", quiet = TRUE)
## [1] "First Grade" "Third Grade" "Second Grade" "Third Grade" "First Grade"
2) strsplit/unlist Another possibility is strsplit followed by unlist. The fixed=TRUE argument tells it to regard | as an ordinary character; otherwise it is treated as a regular expression operator, which we do not want here. strsplit produces a one-element list containing the required vector, so we unlist it to get just the vector. Again, no packages are used.
unlist(strsplit(x, "|", fixed = TRUE))
## [1] "First Grade" "Third Grade" "Second Grade" "Third Grade" "First Grade"
This could also be expressed as a pipeline (the native pipe |> requires R 4.1 or later):
x |> strsplit("|", fixed = TRUE) |> unlist()
## [1] "First Grade" "Third Grade" "Second Grade" "Third Grade" "First Grade"
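To see why fixed = TRUE (or escaping) matters, here is a small sketch using a shortened input so the result is easy to read. The variable names are only illustrative. Without fixed = TRUE the pattern "|" is a regular expression that matches the empty string, so the input is split into single characters.

```r
y <- "First Grade|Third Grade"

# As a regex, "|" is an alternation of two empty patterns, so it matches
# the empty string and every character becomes its own element.
by_regex <- strsplit(y, "|")[[1]]
head(by_regex, 3)
## [1] "F" "i" "r"

# Three ways to treat the pipe literally; all give the same two fields.
by_fixed   <- strsplit(y, "|", fixed = TRUE)[[1]]
by_escaped <- strsplit(y, "\\|")[[1]]
by_class   <- strsplit(y, "[|]")[[1]]
by_fixed
## [1] "First Grade" "Third Grade"
```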
If the input were actually a vector of character strings, such as c(x, x), then we could omit the unlist part and we would get a list of character vectors as output, one per input string.
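A short sketch of that vector case, assuming the same x as in the Note; the name parts is only illustrative. Each input string gets its own list element, and lengths() reports how many fields each one produced.

```r
x <- "First Grade|Third Grade|Second Grade|Third Grade|First Grade"

# Without unlist(), strsplit() keeps one character vector per input string.
parts <- strsplit(c(x, x), "|", fixed = TRUE)

length(parts)   # number of input strings
## [1] 2
lengths(parts)  # number of fields in each string
## [1] 5 5
parts[[2]][3]   # third field of the second string
## [1] "Second Grade"
```

Keeping the list is useful when the fields must stay associated with the string they came from; unlist() would merge them into one vector of length 10.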
Note
x <- "First Grade|Third Grade|Second Grade|Third Grade|First Grade"