Extract values between '\r' in R

I've got a string which is like:

"\r Joel Corry feat. MNEK\r Head & Heart\r "

I want to create a data frame with two columns holding the values that sit between the "\r" separators in the string above:

interpret	song
Joel Corry feat. MNEK	Head & Heart

How can I do this in R?

CodePudding user response：

One way would be to use strsplit:

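x <- "\r Joel Corry feat. MNEK\r Head & Heart\r"

# split once on the carriage return and trim the surrounding spaces;
# the first piece is the empty string before the leading "\r"
parts <- trimws(strsplit(x, split = "\r", fixed = TRUE)[[1]])

df <- data.frame(interpret = parts[2], song = parts[3])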
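
If there is more than one such string, the same strsplit idea can be applied to a whole character vector. A minimal sketch in base R, assuming every string holds exactly two values between the "\r" separators; the helper name split_cr is made up for illustration:

```r
# Hypothetical helper: split each string on carriage returns and keep the
# two non-empty pieces as the columns "interpret" and "song".
split_cr <- function(x) {
  # fixed = TRUE treats "\r" as a literal carriage return, not a regex
  parts <- strsplit(x, split = "\r", fixed = TRUE)
  # trim whitespace and drop the empty pieces around the separators
  parts <- lapply(parts, function(p) {
    p <- trimws(p)
    p[nzchar(p)]
  })
  # assumption: exactly two values per string
  stopifnot(all(lengths(parts) == 2))
  data.frame(
    interpret = vapply(parts, `[`, character(1), 1),
    song      = vapply(parts, `[`, character(1), 2)
  )
}

x <- c("\r Joel Corry feat. MNEK\r Head & Heart\r ",
       "\r Interpret 2\r Song 2\r ")
df <- split_cr(x)
```

The stopifnot() check makes the function fail loudly on strings that do not follow the assumed two-value layout, instead of silently filling columns with NA.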

CodePudding user response：

library(data.table)
library(stringr)  # provides str_remove_all()
dat = data.table(raw_string =  c(
  "\r Joel Corry feat. MNEK\r Head & Heart\r ",
  "\r Intepret 2\r Song 2\r ",
  "\r Intepret 3\r Song 3\r ")
)
# strip the leading and trailing "\r ", then split on "\r" plus an optional space
dat[, c("interpret", "song") := tstrsplit(str_remove_all(raw_string, "(^\r\\s?)|(\r\\s?$)"), "\r\\s?")][]

Output:

                                   raw_string             interpret          song
                                       <char>                <char>        <char>
1: \r Joel Corry feat. MNEK\r Head & Heart\r  Joel Corry feat. MNEK  Head & Heart
2:                  \r Intepret 2\r Song 2\r             Intepret 2        Song 2
3:                  \r Intepret 3\r Song 3\r             Intepret 3        Song 3

CodePudding user response：

Here is another possibility, using the stringr library:

Reprex

Code

library(stringr)

# (?<=\\r ) is a lookbehind: keep only text that follows a carriage return and a space
setNames(as.data.frame(str_extract_all(my_text, "(?<=\\r ).+", simplify = TRUE)), c("interpret", "song"))

Output

#>               interpret         song
#> 1 Joel Corry feat. MNEK Head & Heart

Data

my_text <- "\r Joel Corry feat. MNEK\r Head & Heart\r "

Created on 2022-02-21 by the reprex package (v2.0.1)
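
The same lookbehind idea also works without stringr. A minimal sketch in base R, assuming a regex engine with lookbehind support is selected via perl = TRUE; it uses [^\r]+ rather than . to make explicit that a match stops at the next carriage return. The names hits and res are only for illustration:

```r
my_text <- "\r Joel Corry feat. MNEK\r Head & Heart\r "

# (?<=\r ) is a lookbehind: match only text preceded by "\r ".
# [^\r]+ then takes everything up to the next carriage return.
hits <- regmatches(my_text, gregexpr("(?<=\r )[^\r]+", my_text, perl = TRUE))[[1]]

# turn the two matches into a one-row data frame
res <- setNames(as.data.frame(t(hits)), c("interpret", "song"))
```

The trailing "\r " produces no match because nothing follows it, so only the two wanted values are returned.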

CodePudding user response：

Use extract with a regex that (i) describes the whole string but (ii) wraps in capture groups (...) only the parts you want in the two new columns; anything outside a capture group is dropped. Note that a backslash needs to be escaped with a second backslash in an R string:

library(tidyr)
df %>%
  extract(str,
          into = c("interpret", "song"),
          regex = "\\\r (.*)\\\r (.*)\\\r")

Output:

              interpret         song
1 Joel Corry feat. MNEK Head & Heart

Data:

df <- data.frame(
  str = "\r Joel Corry feat. MNEK\r Head & Heart\r "
)
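
To see what the escaping does, it can help to compare the R strings themselves. A short base R sketch (the variable names are only for illustration): "\r" is a single carriage-return character, while "\\r" is the two characters backslash and r, which the regex engine then interprets as a carriage return.

```r
cr      <- "\r"    # one character: the carriage return itself
escaped <- "\\r"   # two characters: a backslash followed by "r"

n_cr      <- nchar(cr)
n_escaped <- nchar(escaped)

# Both forms work as a regex pattern for a carriage return:
# the first is matched literally, the second is decoded by the regex engine.
x <- "\r Joel Corry feat. MNEK\r Head & Heart\r "
match_literal <- grepl(cr, x)
match_escaped <- grepl(escaped, x)
```

Either spelling can therefore be used in a pattern; which one reads better is largely a matter of taste.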