Home > Blockchain >  How to get the local day of week from timestamp in elasticsearch
How to get the local day of week from timestamp in elasticsearch

Time:11-10

I'm using the ingest pipeline script processors to extract the day of the week from the local time for each document.

I'm using the client_ip to extract the timezone, use that along with the timestamp to extract the local time, and then extract day of week (and other features) from that local time.

This is my ingest pipeline:

{
    "processors" : [
      {
        "set" : {
          "field" : "@timestamp",
          "override" : false,
          "value" : "{{_ingest.timestamp}}"
        }
      },
      {
        "date" : {
          "field" : "@timestamp",
          "formats" : [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure" : true,
          "target_field" : "@timestamp"
        }
      },
      {
        "convert" : {
          "field" : "client_ip",
          "type" : "ip",
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "geoip" : {
          "field" : "client_ip",
          "target_field" : "client_geo",
          "properties" : [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "script" : {
          "description" : "Extract details of Dates",
          "lang" : "painless",
          "ignore_failure" : true,
          "source" : """
            LocalDateTime local_time LocalDateTime.ofInstant( Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo.timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
}

The first two processors were added before for other purposes. I've added the last 3.

An example document could be the following:

  "docs": [
    {
      "_source": {
        "@timestamp": 43109942361111,
        "client_ip": "89.160.20.128"
      }
    }
  ]

I'm getting the GeoIP fields in the data now, but none of the fields created by the script processor. What am I doing wrong?

EDIT A few notes about the index that is affected by these changes: The Dynamic mapping is off. I have manually added the client_geo.timezone field to the mapping of the index as a keyword. When I run the following scripted search on the index

GET index_name/_search
{
 "script_fields": {
  "day_of_week": {
    "script": "doc['@timestamp'].value.withZoneSameInstant(ZoneId.of(doc['client_geo']['timezone'])).getDayOfWeek().getValue()"
  }
 }
}

I get the following runtime error in script execution:

          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "No field found for [client_geo] in mapping"
          }

CodePudding user response:

Thanks for a well formed question example.

I was able to replicate your problem and figured it out.

ctx is "The document source as it is". Consequently, ingest does not automatically dig-up dot-delimited fields.

Your client data is added as such:

"client_geo" : {
   "continent_name" : "Europe"
   //<snip>..</snip>
}

So, you have to access it directly as a nested hash map.

Meaning ctx['client_geo.timezone'] should actually be ctx['client_geo']['timezone']

Here is the full pipeline that worked for me:

"processors": [
      {
        "set": {
          "field": "@timestamp",
          "override": false,
          "value": "{{_ingest.timestamp}}"
        }
      },
      {
        "date": {
          "field": "@timestamp",
          "formats": [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure": true,
          "target_field": "@timestamp"
        }
      },
      {
        "convert": {
          "field": "client_ip",
          "type": "ip",
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "geoip": {
          "field": "client_ip",
          "target_field": "client_geo",
          "properties": [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "script": {
          "description": "Extract details of Dates",
          "lang": "painless",
          "ignore_failure": true,
          "source": """
            LocalDateTime local_time = LocalDateTime.ofInstant(Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo']['timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
  • Related