Extract text between specific strings ("/") in a URL

I am trying to extract the text that sits between specific characters in a URL.

For example, I have URLs such as the following:

url = "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d"

url2 = "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"

I would like to extract two things from the links:

Link 1: ascensor and 163235494
Link 2: aire-acondicionado-calefaccion-ascensor and 45837493

That is, the number between the second-to-last and last /, and the text between the third-to-last and second-to-last /.

CodePudding user response:

Split each string on / and pull the third-to-last and second-to-last elements:

url = "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d"
url2 = "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"
urls = c(url, url2)

pieces = strsplit(urls, split = "/")
result = lapply(pieces, \(x) x[length(x) - 2:1])
## for older R versions (before 4.1, which lack the \(x) shorthand):
# result = lapply(pieces, function(x) x[length(x) - 2:1])

result                
# [[1]]
# [1] "ascensor"  "163235494"
# 
# [[2]]
# [1] "aire-acondicionado-calefaccion-ascensor" "45837493"
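
If a table is handier than a list, the same split-and-index idea can be wrapped in a small helper. This is only a sketch: the function name extract_slug_id and the column names slug and id are made up for illustration, and it assumes every URL has at least three /-separated parts (vapply will stop with an error otherwise).

```r
# Hypothetical helper (not part of the answer above): returns one row per URL.
extract_slug_id <- function(urls) {
  # fixed = TRUE treats "/" as a literal string rather than a regex
  pieces <- strsplit(urls, split = "/", fixed = TRUE)
  # For each URL keep the third-to-last and second-to-last pieces;
  # vapply returns a 2 x length(urls) character matrix.
  picked <- vapply(pieces, function(x) x[length(x) - 2:1], character(2))
  data.frame(
    slug = picked[1, ],  # assumed column name for the text part
    id   = picked[2, ],  # assumed column name; kept as character
    stringsAsFactors = FALSE
  )
}

url  <- "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d"
url2 <- "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"

res <- extract_slug_id(c(url, url2))
res
```

The id column is left as character so nothing is lost if an ID has leading zeros; as.numeric(res$id) converts it if numbers are needed.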