How to filter indices from elastic

Time:08-10

I have a list of indices:

log$org$2018
log$org$2019_v2
log$org$2021_v5
log$org$2019
log$org_test$2020_v3
log$org_test$2019_v2
log$org_test$2021_v5
log$org$2019_v3

I want to keep only the following indices:

log$org$2018
log$org$2019_v2
log$org$2021_v5
log$org$2019
log$org$2019_v3

i.e., filter out the indices that don't match the log$org$* format.

I'm using the get function to fetch the indices, with a regex to match the ones I want, but I get None.

Code

from elasticsearch6 import Elasticsearch

elasticsearch = Elasticsearch()
elasticsearch.indices.get(index=f"log$org$*")

Logs

Starting new HTTP connection (1): elasticsearch:9200
elasticsearch:9200 "GET /log%24org%24* HTTP/1.1" 200 2
GET elasticsearch:9200/log%24org%24* [status:200 request:0.010s]
> None

I assume it's because the index names contain the $ sign, which seems to cause the issue. I've also tried escaping the $ characters, but it still doesn't return any indices.

Would like your help on this :)

CodePudding user response:

Tldr;

I believe you have an issue with your string interpolation. What is the value of the org variable?

You do not need to escape the $ character.

To test

POST _bulk
{"index":{"_index":"log$org$2018"}}
{"index":"log$org$2018"}
{"index":{"_index":"log$org$2019_v2"}}
{"index":"log$org$2019_v2"}
{"index":{"_index":"log$org$2021_v5"}}
{"index":"log$org$2021_v5"}
{"index":{"_index":"log$org$2019"}}
{"index":"log$org$2019"}
{"index":{"_index":"log$org_test$2020_v3"}}
{"index":"log$org_test$2020_v3"}
{"index":{"_index":"log$org_test$2019_v2"}}
{"index":"log$org_test$2019_v2"}
{"index":{"_index":"log$org_test$2021_v5"}}
{"index":"log$org_test$2021_v5"}
{"index":{"_index":"log$org$2019_v3"}}
{"index":"log$org$2019_v3"}

GET /log$org$*/

Will return

{
  "log$org$2018": {
  },
  "log$org$2019": {
  },
  "log$org$2019_v2": {
  },
  "log$org$2019_v3": {
  },
  "log$org$2021_v5": {
  }
}

And when queried with Python:

from elasticsearch6 import Elasticsearch

es = Elasticsearch()
org = "org"
es.indices.get(index=f'log${org}$*')

I get the same result.
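If the wildcard alone is not selective enough, the names can also be filtered on the client side. Below is a minimal sketch, assuming the index names have already been fetched (for example from the keys of the dict returned by indices.get). The helper name filter_indices and the hard-coded list are illustrative only and not part of the Elasticsearch client.

```python
import re

# Index names taken from the question. In practice they might come from
# something like list(es.indices.get(index="*")) (an assumption, not shown here).
ALL_INDICES = [
    "log$org$2018",
    "log$org$2019_v2",
    "log$org$2021_v5",
    "log$org$2019",
    "log$org_test$2020_v3",
    "log$org_test$2019_v2",
    "log$org_test$2021_v5",
    "log$org$2019_v3",
]


def filter_indices(names, org):
    """Keep only names of the form log$<org>$<anything>."""
    # re.escape makes '$' match literally instead of acting as an anchor.
    pattern = re.compile(re.escape(f"log${org}$") + r".*")
    return [name for name in names if pattern.fullmatch(name)]


kept = filter_indices(ALL_INDICES, "org")
print(kept)
```

Because the prefix is escaped and ends with a literal $, names such as log$org_test$2020_v3 are rejected while every log$org$... name is kept.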

CodePudding user response:

The solution is to use the hex (percent-encoded) form instead of the special characters.

As I saw in the logs, Elasticsearch converts special characters to hex (i.e., $ -> %24):

GET elasticsearch:9200/log%24org%24* [status:200 request:0.010s]

Hence, the only thing I needed to do was replace it in the query, like so:

elasticsearch.indices.get(index=f"log%24org%24*")

That way I could filter the indices by pattern via get.
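Rather than typing the encoded form by hand, the pattern can be percent-encoded with the standard library. This is a minimal sketch: the helper name encode_index_pattern is hypothetical, and whether pre-encoding is needed at all likely depends on the client version, since the client may already encode the path itself (the other answer reports that the raw $ works). Checking the request log is the safest way to confirm what is actually sent.

```python
from urllib.parse import quote


def encode_index_pattern(pattern):
    """Percent-encode an index pattern, leaving the '*' wildcard intact."""
    # '$' is not in quote()'s always-safe set, so it becomes %24;
    # safe="*" keeps the wildcard readable.
    return quote(pattern, safe="*")


encoded = encode_index_pattern("log$org$*")
print(encoded)  # log%24org%24*
```

The resulting string could then be passed as the index argument, e.g. elasticsearch.indices.get(index=encoded), under the same assumption about the client.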

It's pretty annoying that this isn't mentioned in their docs.

Thanks to everyone who made an effort on this!
