How do I remove a substring from a value in an Elasticsearch document using Dev Tools?

Time:01-04

If each document has a value that is similar to:

https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs and I want to remove the D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/ part so I am left with https://test.com/MODIF-RRS/test/code.cs how would I do that?

I have a regex that works using an online tester

(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)

but it gave me an error: invalid range: from (95) cannot be > to (93)

CodePudding user response:

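I used a pattern_replace char filter with your regex. No replacement is specified, so the matched text is replaced with an empty string, i.e. removed.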

POST _analyze
{
  "char_filter": {
    "type":"pattern_replace",
    "pattern":"(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)"
  },
  "text": "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"
}

Response:

{
  "tokens": [
    {
      "token": "https://test.com/MODIF-RRS/test/code.cs",
      "start_offset": 0,
      "end_offset": 85,
      "type": "word",
      "position": 0
    }
  ]
}

CodePudding user response:

(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)
> invalid range: from (95) cannot be > to (93)

ASCII character 95 is _ and ASCII character 93 is ].
The parser thinks _-] is supposed to be a range of characters (similar to A-Z) and is confused because the ASCII values left and right of - are not in ascending order.

As you do not want to specify a range there at all, try escaping the - characters with a leading \, so that the parser knows you mean a literal -, not a range of characters:

(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)

Note: Depending on how you specify your regex (in JSON?), you may have to escape the \ itself as well, so you'd have to write \\- instead of \-.
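To illustrate the escaping note above, here is a small Python sketch (Python is only used here as a convenient way to show JSON encoding; it is not part of the original question). It serializes the escaped pattern into a JSON body and shows that each \ becomes \\ in the JSON text, while the regex engine still receives a single \ after the JSON is parsed.

```python
import json

# The regex as the regex engine should see it: one backslash before each hyphen.
pattern = r"(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)"

# Serializing to JSON doubles each backslash; this is the text a request body would contain.
body = json.dumps({"pattern": pattern})
print(body)

# Parsing the JSON again yields the original single-backslash pattern.
decoded = json.loads(body)["pattern"]
print(decoded == pattern)
```

So when the pattern is typed by hand into a JSON request body, \\- is the form to write.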

Alternatively, it is usually possible to put - first in the set, so the parser knows it cannot be part of a range:

(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)
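As a quick sanity check of the two variants above, the following sketch applies both to the sample URL from the question using Python's re module. This is an assumption-laden illustration: Python's regex engine is not the one Elasticsearch uses, so it only shows that both ways of writing a literal - describe the same character set, not how Elasticsearch itself will parse them.

```python
import re

# Sample value from the question.
url = "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"

# The two ways of writing a literal hyphen inside the character class.
variants = {
    "escaped hyphen": r"(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)",
    "leading hyphen": r"(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)",
}

# Replace the matched part with an empty string, i.e. remove it.
results = {name: re.sub(pattern, "", url) for name, pattern in variants.items()}

for name, cleaned in results.items():
    print(f"{name}: {cleaned}")
```

Both variants should leave https://test.com/MODIF-RRS/test/code.cs, the value asked for in the question.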