I have a string like this:
"\r Joel Corry feat. MNEK\r Head & Heart\r "
I want to create a data frame with two columns holding the values found between the "\r" separators in the string above:
interpret | song |
---|---|
Joel Corry feat. MNEK | Head & Heart |
How can I do this in R?
CodePudding user response:
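One way would be to use strsplit:
x <- "\r Joel Corry feat. MNEK\r Head & Heart\r"
# Split on the carriage returns, then trim the surrounding whitespace
parts <- trimws(strsplit(x, split = "\\r")[[1]])
# parts[1] is the empty string before the leading "\r", so use elements 2 and 3
df <- data.frame(interpret = parts[2], song = parts[3])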
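If there is more than one string, the same idea can be applied to a whole vector. This is only a minimal sketch in base R: the second input string is made up for illustration, and it assumes every string contains exactly two non-empty fields.

```r
# Hypothetical input vector; the second element is invented for illustration.
x <- c("\r Joel Corry feat. MNEK\r Head & Heart\r ",
       "\r Artist 2\r Song 2\r ")

# Split each string on the carriage return, trim whitespace,
# and drop the empty pieces left over at both ends.
parts <- lapply(strsplit(x, "\r", fixed = TRUE), function(p) {
  p <- trimws(p)
  p[nzchar(p)]
})

# Stack the pieces row-wise; assumes exactly two fields per string.
df <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(df) <- c("interpret", "song")
df
```

Dropping the empty pieces with nzchar() avoids relying on fixed positions such as [2] and [3].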
CodePudding user response:
library(data.table)
library(stringr)  # provides str_remove_all()
dat = data.table(raw_string = c(
"\r Joel Corry feat. MNEK\r Head & Heart\r ",
"\r Intepret 2\r Song 2\r ",
"\r Intepret 3\r Song 3\r ")
)
# Strip the leading and trailing "\r ", then split on "\r" plus an optional space
dat[, c("interpret", "song") := tstrsplit(str_remove_all(raw_string, "(^\r\\s?)|(\r\\s?$)"), "\r\\s?")][]
Output:
                                   raw_string             interpret         song
                                       <char>                <char>       <char>
1: \r Joel Corry feat. MNEK\r Head & Heart\r  Joel Corry feat. MNEK Head & Heart
2:                  \r Intepret 2\r Song 2\r             Intepret 2       Song 2
3:                  \r Intepret 3\r Song 3\r             Intepret 3       Song 3
CodePudding user response:
Here is another possibility, using the stringr library.
Reprex
- Code
library(stringr)
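# The lookbehind (?<=\\r ) anchors each match right after "\r "; .+ then takes the rest of the field
setNames(as.data.frame(str_extract_all(my_text, "(?<=\\r ).+", simplify = TRUE)), c("interpret", "song"))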
- Output
#>               interpret         song
#> 1 Joel Corry feat. MNEK Head & Heart
- Data
my_text <- "\r Joel Corry feat. MNEK\r Head & Heart\r "
Created on 2022-02-21 by the reprex package (v2.0.1)
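For comparison, the same lookbehind idea can be sketched in base R without stringr. This is an illustration only: it uses gregexpr() with perl = TRUE and an explicit [^\r]+ class so the match stops at the next carriage return regardless of how the regex engine treats the dot.

```r
my_text <- "\r Joel Corry feat. MNEK\r Head & Heart\r "

# Find every run of non-CR characters that directly follows "\r ".
m <- gregexpr("(?<=\\r )[^\\r]+", my_text, perl = TRUE)
fields <- regmatches(my_text, m)[[1]]

# Assumes exactly two fields were found.
res <- data.frame(interpret = fields[1], song = fields[2])
res
```

Nothing follows the final "\r ", so no third match is produced.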
CodePudding user response:
Use extract to define a regex that (i) exhaustively describes the string but (ii) wraps in capture groups (...) only what you want to extract into the two new columns, thereby effectively removing anything not inside a capture group. Note that in R a backslash in the regex needs to be escaped with a second backslash:
library(tidyr)
df %>%
  extract(str,
          into = c("interpret", "song"),
          # "\\r" in the R string reaches the regex engine as \r (carriage return)
          regex = "\\r (.*)\\r (.*)\\r")
              interpret         song
1 Joel Corry feat. MNEK Head & Heart
Data:
df <- data.frame(
str = "\r Joel Corry feat. MNEK\r Head & Heart\r "
)
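To see what the capture groups pick up on their own, here is a small base R sketch using regexec() and regmatches() rather than tidyr. It is meant as an illustration of the capture-group idea, not as a replacement for extract; the variable names are arbitrary.

```r
s <- "\r Joel Corry feat. MNEK\r Head & Heart\r "

# regexec() returns the full match plus one entry per capture group.
m <- regmatches(s, regexec("\\r (.*)\\r (.*)\\r", s, perl = TRUE))[[1]]

# m[1] is the whole match; m[2] and m[3] are the two capture groups.
out <- data.frame(interpret = m[2], song = m[3])
out
```

Because the string contains exactly three carriage returns, the pattern can match in only one way, so the greedy (.*) groups cannot run past their field.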