I am trying to extract the parts of a URL that sit between specific slashes.
For example, I have URLs such as the following:
url = "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d"
url2 = "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"
I would like to extract two things from each link:
Link 1: ascensor and 163235494
Link 2: aire-acondicionado-calefaccion-ascensor and 45837493
That is, the number between the last two / characters, and the text in the segment just before it.
CodePudding user response:
Split each string on / and pull the third-to-last and second-to-last elements:
url = "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d"
url2 = "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"
urls = c(url, url2)
pieces = strsplit(urls, split = "/")
result = lapply(pieces, \(x) x[length(x) - 2:1])
## for older R versions (before 4.1):
# result = lapply(pieces, function(x) x[length(x) - 2:1])
result
# [[1]]
# [1] "ascensor" "163235494"
#
# [[2]]
# [1] "aire-acondicionado-calefaccion-ascensor" "45837493"
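If a table is more convenient than a list, the same split-and-index idea can be wrapped in a small helper. This is only a sketch: `extract_segments` and the column names `name` and `id` are made up for illustration, and it assumes every URL has at least three `/`-separated pieces and ends in a trailing segment such as `d`.

```r
# Hypothetical helper (not part of the original answer): returns the
# third-to-last and second-to-last path segments of each URL.
extract_segments <- function(urls) {
  pieces <- strsplit(urls, split = "/", fixed = TRUE)
  # `:` binds tighter than `-`, so length(x) - 2:1 is
  # c(length(x) - 2, length(x) - 1)
  parts <- lapply(pieces, function(x) x[length(x) - 2:1])
  data.frame(
    name = vapply(parts, `[`, character(1), 1),
    id   = vapply(parts, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

urls <- c(
  "https://www.somewebsiteLink.com/someDirectory/Directory/ascensor/163235494/d",
  "https://www.somewebsiteLink.com/someDirectory/Directory/aire-acondicionado-calefaccion-ascensor/45837493/d"
)
segments <- extract_segments(urls)
segments
```

The `id` column stays character here; wrap it in `as.numeric()` if numeric IDs are needed.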