Extract matches in a song in R

Time:10-21

I'm using R to extract the sentences that contain specific words ("everything", "along", "wrote") from the lyrics of a song. Here is the song, "Yellow" by Coldplay:

Look at the stars Look how they shine for you And everything you do Yeah, they were all yellow I came along I wrote a song for you And all the things you do And it was called Yellow So, then I took my turn What a thing to've done And it was all yellow Your skin Oh yeah, your skin and bones Turn in to something beautiful Do you know You know I love you so You know I love you so I swam across I jumped across for you What a thing to do 'Cause you were all yellow I drew a line

I created a vector with the lyrics, but my code does not run.

CodePudding user response:

Here's an option using tidyverse. It's not perfect and you'll have to adapt to your specific use case:

lyrics <- data.frame(yellow = "Look at the stars Look how they shine for you And everything you do Yeah, they were all yellow I came along I wrote a song for you And all the things you do And it was called Yellow So, then I took my turn What a thing to've done And it was all yellow Your skin Oh yeah, your skin and bones Turn in to something beautiful Do you know You know I love you so You know I love you so I swam across I jumped across for you What a thing to do 'Cause you were all yellow I drew a line")


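library(tidyverse)  # loads dplyr, tidyr and stringr

lyrics %>% 
  # mark every uppercase letter with "<>" so it can serve as a split point
  mutate(yellow = gsub('([[:upper:]])', '<>\\1', yellow)) %>% 
  # one row per chunk between markers
  separate_rows(yellow, sep = "<>") %>% 
  # flag chunks that contain any of the target words
  mutate(flag = str_detect(yellow, "everything|along|wrote")) %>% 
  filter(flag)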

This gives us:

# A tibble: 3 x 2
  yellow                    flag 
  <chr>                     <lgl>
1 "And everything you do "  TRUE 
2 "I came along "           TRUE 
3 "I wrote a song for you " TRUE 

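You still have to decide what constitutes a sentence. Here I started a new sentence at every capital letter, so a capitalized word in the middle of a line (such as "Yellow" in "And it was called Yellow") is also split off into its own row.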
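For comparison, here is a minimal base R sketch of the same capital-letter rule, in case you'd rather not load tidyverse. The names lyrics_text, segments, pattern and hits are only illustrative, the lyrics are shortened, and the \\b word boundaries are an addition of mine (the pipeline above matches the words anywhere, including inside longer words).

```r
# Shortened lyrics, used only for illustration
lyrics_text <- "Look at the stars Look how they shine for you And everything you do Yeah, they were all yellow I came along I wrote a song for you And all the things you do"

# Each match starts at an uppercase letter and runs up to the next uppercase letter
segments <- regmatches(lyrics_text,
                       gregexpr("[[:upper:]][^[:upper:]]*", lyrics_text))[[1]]
segments <- trimws(segments)  # drop the trailing space left on each chunk

# Assumption: match whole words only, hence the \\b boundaries
pattern <- "\\b(everything|along|wrote)\\b"
hits <- segments[grepl(pattern, segments, perl = TRUE)]

print(hits)
```

Because the splitting rule lives in a single regular expression, changing your definition of a sentence only means changing the pattern passed to gregexpr.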
