How to get specific strings based on a complex pattern with regular expressions

Time:01-02

path<-c("C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv",
"C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv",
"C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/france-ligue-2-matches-2021-to-2022-stats.csv",
"C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/germany-2-bundesliga-matches-2021-to-2022-stats.csv")

mydata<-data.frame(path=path)

I want to create new variables from the variable path:

Country: the pattern is DONNEES/ country name -

League: the pattern is DONNEES/ a word - league name -matches

Year: the pattern is - the year -to

This should be the resulting dataset:

                                                                                                                     path   Country           League Year
1 C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv argentina primera-division 2022
2 C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv argentina primera-division 2022
3             C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/france-ligue-2-matches-2021-to-2022-stats.csv    france          ligue-2 2021
4       C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/germany-2-bundesliga-matches-2021-to-2022-stats.csv   germany     2-bundesliga 2021

CodePudding user response:

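We can use extract() from tidyr with three capture groups: (\\w+) captures the word after DONNEES/ up to the first - (the country), (.*) captures everything between that - and -matches- (the league), and (\\d{4}) captures the four-digit year that follows -matches-.

library(tidyr)
# remove = FALSE keeps the original path column next to the new ones
extract(mydata, path, into = c("Country", "League", "Year"),
     ".*DONNEES/(\\w+)-(.*)-matches-(\\d{4})-.*", remove = FALSE)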

-output

                                                                                                                     path   Country
1 C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv argentina
2 C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/argentina-primera-division-matches-2022-to-2022-stats.csv argentina
3             C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/france-ligue-2-matches-2021-to-2022-stats.csv    france
4       C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/germany-2-bundesliga-matches-2021-to-2022-stats.csv   germany
            League Year
1 primera-division 2022
2 primera-division 2022
3          ligue-2 2021
4     2-bundesliga 2021
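
If tidyr is not available, the same capture groups can probably be pulled out with base R. The following is only a sketch, not part of the original answer: it reuses the regular expression above with regexec() and regmatches(), uses two of the paths from the question, and pads non-matching rows with NA. Converting Year to integer is an extra choice made here, not something the question asked for.

```r
# Two of the paths from the question
path <- c(
  "C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/france-ligue-2-matches-2021-to-2022-stats.csv",
  "C:/Users/SEYDOU GORO/Dropbox/PC (2)/Documents/BETCLIC/DONNEES/germany-2-bundesliga-matches-2021-to-2022-stats.csv"
)
mydata <- data.frame(path = path)

# Same pattern as in the extract() call: country, league, year
pattern <- ".*DONNEES/(\\w+)-(.*)-matches-(\\d{4})-.*"

# regexec() + regmatches() return, for each string, the full match
# followed by one element per capture group (or character(0) if no match)
parts <- regmatches(mydata$path, regexec(pattern, mydata$path))

# Pad rows that did not match so every element has length 4
parts <- lapply(parts, function(x) {
  if (length(x) == 4) x else rep(NA_character_, 4)
})

# One row per path; column 1 is the full match, columns 2 to 4 the groups
m <- do.call(rbind, parts)

mydata$Country <- m[, 2]
mydata$League  <- m[, 3]
mydata$Year    <- as.integer(m[, 4])  # assumption: store the year as integer
```

Because \\w does not match a hyphen, the first group stops at the first - after DONNEES/, which is why "germany-2-bundesliga" splits into germany and 2-bundesliga rather than somewhere later.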