I have this list ("sample_list"):
[1] "http://www.website.ca/extra/city1-aaa-bbb-ccc/"
[2] "http://www.website.ca/extra/acity2-a2a-bbb-ccc/"
[3] "http://www.website.ca/extra/bbcity3-a3a-bbb-ccc/"
[4] "http://www.website.ca/extra/ccccity4-a77a-bbb-ccc/"
[5] "http://www.website.ca/extra/dddddcity5-a2a-bbb-ccc/"
I want to extract the following parts from this list: city1, acity2, bbcity3, ccccity4, dddddcity5
I had the following idea about this. I noticed that every element in this list starts with the same prefix, "http://www.website.ca/extra/", so the part I want always begins at the 29th position.
my_substr = substr(sample_list, 1,29)
Is there some way I can modify the substr call so that everything is selected from the 29th position all the way to the first hyphen?
Thank you!
CodePudding user response:
x = c("http://www.website.ca/extra/city1-aaa-bbb-ccc/", "http://www.website.ca/extra/acity2-aaa-bbb-ccc/",
"http://www.website.ca/extra/bbcity3-aaa-bbb-ccc/", "http://www.website.ca/extra/ccccity4-aaa-bbb-ccc/",
"http://www.website.ca/extra/dddddcity5-aaa-bbb-ccc/")
From the 29th position all the way to the first hyphen? Yes: locate the first hyphen with str_locate and stop one character before it.
substring(x, 29, stringr::str_locate(x, "-")[,1] - 1)
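Other options exist for this task. Depending on preference, a regular expression with lookarounds might be more suitable. Note that this pattern assumes every city is followed by "-aaa-", as in the vector x above.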
stringr::str_extract(x, "(?<=extra/).*(?=-aaa-)")
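For anyone who prefers to stay in base R, here is a minimal sketch of the same "29th position to first hyphen" idea. It assumes every element starts with exactly the prefix shown in the question and that the prefix itself contains no hyphen; the variable names (prefix, start, first_hyphen, cities) are just illustrative.

```r
# Sample data copied from the question (the second segment varies: aaa, a2a, a3a, ...).
sample_list <- c(
  "http://www.website.ca/extra/city1-aaa-bbb-ccc/",
  "http://www.website.ca/extra/acity2-a2a-bbb-ccc/",
  "http://www.website.ca/extra/bbcity3-a3a-bbb-ccc/",
  "http://www.website.ca/extra/ccccity4-a77a-bbb-ccc/",
  "http://www.website.ca/extra/dddddcity5-a2a-bbb-ccc/"
)

# Assumption: every element starts with exactly this prefix.
prefix <- "http://www.website.ca/extra/"
start <- nchar(prefix) + 1L  # first character after the prefix (position 29)

# Position of the first literal hyphen in each string.
first_hyphen <- as.integer(regexpr("-", sample_list, fixed = TRUE))

# Take everything from the start position up to the character before the hyphen.
cities <- substring(sample_list, start, first_hyphen - 1L)
print(cities)
```

On the vector from the question this should return city1, acity2, bbcity3, ccccity4 and dddddcity5. Computing the start with nchar() avoids hard-coding 29. If a string has no hyphen, regexpr() returns -1 and the result for that element is an empty string.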
CodePudding user response:
Simply use str_extract from the stringr package:
library(stringr)
strings <- c(
"http://www.website.ca/extra/city1-aaa-bbb-ccc/",
"http://www.website.ca/extra/acity2-aaa-bbb-ccc/",
"http://www.website.ca/extra/bbcity3-aaa-bbb-ccc/",
"http://www.website.ca/extra/ccccity4-aaa-bbb-ccc/",
"http://www.website.ca/extra/dddddcity5-aaa-bbb-ccc/"
)
str_extract(strings, "(?<=extra\\/)\\w+(?=-aaa-)")
#> [1] "city1" "acity2" "bbcity3" "ccccity4" "dddddcity5"
Created on 2022-07-07 by the reprex package (v2.0.1)
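The lookahead on "-aaa-" fits the vectors used in the answers, but the list in the original question has different second segments (a2a, a3a, a77a). A possible variation, sketched here in base R with a Perl-style lookbehind, is to take everything after "/extra/" that is not a hyphen. It assumes the city names themselves never contain a hyphen.

```r
# Sample data copied from the question.
sample_list <- c(
  "http://www.website.ca/extra/city1-aaa-bbb-ccc/",
  "http://www.website.ca/extra/acity2-a2a-bbb-ccc/",
  "http://www.website.ca/extra/bbcity3-a3a-bbb-ccc/",
  "http://www.website.ca/extra/ccccity4-a77a-bbb-ccc/",
  "http://www.website.ca/extra/dddddcity5-a2a-bbb-ccc/"
)

# Look behind for "/extra/", then match one or more non-hyphen characters.
# perl = TRUE is needed because lookbehind is a PCRE feature.
matches <- regexpr("(?<=/extra/)[^-]+", sample_list, perl = TRUE)

# regmatches() keeps only the elements that matched.
city_names <- regmatches(sample_list, matches)
print(city_names)
```

Be aware that regmatches() silently drops elements with no match, so the result can be shorter than the input. stringr::str_extract with the same pattern would return NA for those elements instead.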