Email not being searched properly in Elasticsearch

Time:04-06

Hello, I am new to Elasticsearch and I am having an issue: email search is not working properly. I am using the boto3 SDK with the AWS OpenSearch Service. I have tried this mapping:

{
  "dev_auth0_logs_new_mapping": {
    "mappings": {
      "properties": {
        "activity_date": { "type": "date" },
        "activity_type": { "type": "text" },
        "client_id": { "type": "text" },
        "description": { "type": "text" },
        "event_data": { "type": "object", "enabled": false },
        "user_email": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword" } }
        },
        "user_id": { "type": "text" }
      }
    }
  }
}

This is my query:

{
  "from": 0,
  "size": "10",
  "track_total_hits": true,
  "_source": [
    "user_email",
    "user_id",
    "activity_date",
    "activity_type",
    "description",
    "client_id",
    "id"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "*[email protected]*",
            "default_field": "user_email",
            "default_operator": "OR"
          }
        }
      ]
    }
  },
  "sort": [{ "activity_date": "desc" }]
}

It does not work for an exact match. For example, searching for ashutosh.pandya returns results, but searching for [email protected] returns none. I also followed a Medium blog post and created a new mapping with a custom email analyzer, but that did not work for me either. I don't know what I am doing wrong.

I tried this query to get all the logs for [email protected], but did not get any hits:

{
    "from":0,
    "size":"10",
    "track_total_hits":True,
    "_source":[
       "user_email",
       "user_id",
       "activity_date",
       "activity_type",
       "description",
       "client_id",
       "id"
    ],
    "query":{
       "bool":{
          "must":[
             {
                "query_string":{
                   "query":"*[email protected]*",
                   "default_field":"user_email",
                   "default_operator":"OR"
                }
             }
          ]
       }
    },
    "sort":[
       {
          "activity_date":"desc"
       }
    ]
 }

But when I run this query:

{
    "from":0,
    "size":"10",
    "track_total_hits":True,
    "_source":[
       "user_email",
       "user_id",
       "activity_date",
       "activity_type",
       "description",
       "client_id",
       "id"
    ],
    "query":{
       "bool":{
          "must":[
             {
                "query_string":{
                   "query":"*ashutosh.pandya*",
                   "default_field":"user_email",
                   "default_operator":"OR"
                }
             }
          ]
       }
    },
    "sort":[
       {
          "activity_date":"desc"
       }
    ]
 }

I get all the hits in which user_email contains ashutosh.pandya. What I want is this:

If I search for ashutosh, I get all the hits where user_email contains ashutosh.
If I search for ashu, I get all the hits where user_email contains ashu.
If I search for pandya, I get all the hits where user_email contains pandya.
If I search for [email protected], I get all the hits where user_email equals [email protected].
If I search for domain, I get all the hits where user_email contains domain.

CodePudding user response:

You don't need a custom analyzer for wildcard matches. You don't really need the email to be split into tokens at all, so either use the keyword type for the email field in the mapping, or search against user_email.keyword.
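As a rough sketch of the second option, the request body could use a wildcard query against the user_email.keyword sub-field that already exists in the mapping from the question. The helper name build_email_wildcard_query is hypothetical. This also assumes the stored emails and the search term use the same letter case, since a keyword field is matched case-sensitively unless a normalizer is configured.

```python
def build_email_wildcard_query(term, size=10):
    """Build a search body matching documents whose user_email contains `term`.

    Hypothetical helper for illustration. It targets the un-analyzed
    `user_email.keyword` sub-field, so the whole email is treated as one
    string and characters such as "@" and "." are matched literally.
    """
    return {
        "from": 0,
        "size": size,
        "track_total_hits": True,
        "query": {
            "wildcard": {
                # Leading and trailing "*" give "contains" semantics.
                "user_email.keyword": {"value": f"*{term}*"}
            }
        },
        "sort": [{"activity_date": "desc"}],
    }


# Example: a fragment of the local part of the address.
body = build_email_wildcard_query("ashu")
```

The returned dict can be sent as the search request body in place of the query_string body from the question. Note that a leading wildcard forces a scan over many terms, so this can be slow on a large index.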

CodePudding user response:

I solved this issue by creating a pattern capture token filter. This is the documentation link: Elasticsearch pattern capture token filter.
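The answer does not include the actual configuration, so the following is only a sketch of what such an index body might look like, modelled on the email example in the pattern capture token filter documentation. The names email_parts, email_analyzer and EMAIL_INDEX_BODY are illustrative assumptions, not taken from the answer.

```python
# Sketch of index settings and mappings using a pattern_capture token filter.
# Filter and analyzer names here are illustrative, not the answerer's own.
EMAIL_INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "email_parts": {
                    "type": "pattern_capture",
                    # Keep the full address as a token alongside the captures.
                    "preserve_original": True,
                    "patterns": [
                        r"([^@]+)",   # local part before the "@"
                        r"(\p{L}+)",  # each run of letters
                        r"(\d+)",     # each run of digits
                        r"@(.+)",     # domain after the "@"
                    ],
                }
            },
            "analyzer": {
                "email_analyzer": {
                    # uax_url_email keeps an email address as a single token.
                    "tokenizer": "uax_url_email",
                    "filter": ["email_parts", "lowercase", "unique"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "user_email": {
                "type": "text",
                "analyzer": "email_analyzer",
                "fields": {"keyword": {"type": "keyword"}},
            }
        }
    },
}
```

With an analyzer like this, an address is indexed as the full email plus its local part, its letter and digit runs, and its domain, so searching for ashutosh, pandya, the domain, or the full address can match whole tokens. A fragment such as ashu is not a whole token, so it would presumably still need a prefix or wildcard query.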
