Removing all data Except Certain text using Regex

Time:11-09

I am trying to remove all words from a string except certain words. For example, I want to retain 'red' and 'black', including any combinations that contain them (such as 'black,magenta'), and remove everything else.

For example:

inputstring = "red => white => green => black,magenta"
outputstring = "red => black,magenta"

I tried replacing with the regex pattern "^red|^black", but it does not change anything in the string.

I am using XQuery and cannot work out which pattern to use in the regex:

let $pattern := "^red|^black"
let $replacement := ""
let $y := replace($inputstring, $pattern, $replacement)

CodePudding user response:

I would build a new string rather than edit the original one in place. You can simply look for the wanted words and keep them in their original order.

I am not experienced with XQuery, but I can at least help with the idea.

If I understand you correctly, your input string uses => as the separator between words. If so, I would solve it like this (in Python):

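def keep_wanted(input_string, wanted=('red', 'black')):
    # Split on the ' => ' separator and keep tokens containing a wanted word.
    words_list = input_string.split(' => ')
    output_list = []
    for one_word in words_list:
        # Substring test, so that 'black,magenta' is kept as well.
        if any(w in one_word for w in wanted):
            output_list.append(one_word)
    return ' => '.join(output_list)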

CodePudding user response:

In R, extract all instances containing “red” or “black”, then concatenate these back together with " => " as a separator:

library(stringr)

x <- "red => white => green => black,magenta"

x |>
  str_extract_all("\\S*(red|black)\\S*") |> 
  sapply(str_c, collapse = " => ")
# "red => black,magenta"

Note this is vectorized, so can also process a vector of multiple input strings.
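
For readers not using R, the same extract-and-join idea can be sketched in Python with the standard re module. This is only an illustration, not part of the answer above: the helper name keep_matching and the second sample string are made up, and the pattern is the R one with a non-capturing group so that findall returns whole tokens.

```python
import re

# Same pattern as the R answer: any run of non-whitespace characters
# that contains "red" or "black". The group is non-capturing so that
# findall() returns the full token rather than just the group.
TOKEN = re.compile(r"\S*(?:red|black)\S*")


def keep_matching(text, sep=" => "):
    """Return the tokens containing 'red' or 'black', joined by sep.

    keep_matching is a hypothetical helper name used for illustration.
    """
    return sep.join(TOKEN.findall(text))


# The first string is the example from the question; the second is made up.
inputs = ["red => white => green => black,magenta", "blue => darkred"]
print([keep_matching(s) for s in inputs])
```

Like the R version, this matches on substrings, so a token such as "darkred" is kept too; whether that is wanted depends on what "combinations" should cover.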

CodePudding user response:

In XQuery 3.1 (or XPath 3.1) I would tokenize on the arrow, keep the tokens that match, and join them again:

$input 
 => tokenize('\s?=>\s?') 
 => filter(matches(?, 'red|black'))
 => string-join(' => ')