How to extract several substrings with a for loop in R

Time:12-02

I have the following 100 strings:

 [3] "Department_Complementary_Demand_Converted_Sum" 
 [4] "Department_Home_Demand_Converted_Sum"                   
 [5] "Department_Store A_Demand_Converted_Sum"                
 [6] "Department_Store B_Demand_Converted_Sum"
 ...                
 [100] "Department_Unisex_Demand_Converted_Sum"  

Obviously I could call substr() on every string with different start and end indices. But as one can see, all the strings start with Department_ and end with _Demand_Converted_Sum, and I only want to extract what's in between. If there were a way to always skip the first 11 characters (Department_) and drop the last 21 characters (_Demand_Converted_Sum), I could just run a for loop over all 100 strings above.

Example

Given input: Department_Unisex_Demand_Converted_Sum

Expected output: Unisex

CodePudding user response:

Looks like a classic case for lookarounds:

library(stringr)
str_extract(str, "(?<=Department_)[^_]+(?=_)")
# [1] "Complementary" "Home"          "Store A"
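For readers who would rather not load stringr, the same lookaround idea can be sketched in base R. This is only an illustration, not part of the original answer: the vector name `dept` and the result name `middle` are made up here, and the pattern assumes that the part to extract never contains an underscore itself.

```r
# Hypothetical example vector (same three strings as the answer's sample data)
dept <- c("Department_Complementary_Demand_Converted_Sum",
          "Department_Home_Demand_Converted_Sum",
          "Department_Store A_Demand_Converted_Sum")

# perl = TRUE is needed because lookbehind (?<=...) and lookahead (?=...)
# are only supported by the PCRE engine in base R.
# [^_]+ matches one or more non-underscore characters after "Department_".
hits <- regexpr("(?<=Department_)[^_]+(?=_)", dept, perl = TRUE)

# regmatches() returns the matched text for each string that matched
middle <- regmatches(dept, hits)
print(middle)
# [1] "Complementary" "Home"          "Store A"
```

Note that regmatches() silently drops strings that do not match, so the result can be shorter than the input if some strings lack the Department_ prefix.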

Data:

str <- c("Department_Complementary_Demand_Converted_Sum",
         "Department_Home_Demand_Converted_Sum",
         "Department_Store A_Demand_Converted_Sum")

CodePudding user response:

Using strsplit(),

sapply(strsplit(str, '_'), '[', 2)
# [1] "Complementary" "Home"          "Store A"   

or stringi::stri_sub_all().

unlist(stringi::stri_sub_all(str, 12, -22))
# [1] "Complementary" "Home"          "Store A"      
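The fixed-offset idea from the question also works in base R without an explicit loop, because substring() and nchar() are vectorised. A minimal sketch, assuming every string has exactly the prefix Department_ and the suffix _Demand_Converted_Sum; the names `x`, `prefix`, `suffix`, `by_offset` and `by_pattern` are illustrative only:

```r
# Hypothetical example vector (same three strings as the sample data)
x <- c("Department_Complementary_Demand_Converted_Sum",
       "Department_Home_Demand_Converted_Sum",
       "Department_Store A_Demand_Converted_Sum")

prefix <- "Department_"            # 11 characters
suffix <- "_Demand_Converted_Sum"  # 21 characters

# Option 1: cut by position. Start right after the prefix and stop
# right before the suffix; nchar(x) is computed per string.
by_offset <- substring(x, nchar(prefix) + 1, nchar(x) - nchar(suffix))

# Option 2: strip the fixed prefix and suffix with one anchored regex
# and keep only the captured middle part.
by_pattern <- sub("^Department_(.*)_Demand_Converted_Sum$", "\\1", x)

print(by_offset)
# [1] "Complementary" "Home"          "Store A"
```

The positional version is blind to the content, so it returns wrong pieces if a string does not follow the expected layout. The sub() version leaves non-matching strings unchanged, which makes such cases easier to spot.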