How to filter API response data based on a particular time range using Python

Time:02-23

I am using a Python Lambda function to get email logs from Mailgun through the Mailgun log API. Here is my function:

import json
import requests
resp = requests.get("https://api.eu.mailgun.net/v3/domain/events",
                    auth=("api","key-api"))
def jprint(obj):
    # create a formatted string of the Python JSON object
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)
jprint(resp.json())

This function prints the email logs fetched from the Mailgun API as formatted JSON.

Sample response from the API:

{
    "items": [
        {
            "campaigns": [],
            "delivery-status": {
                "attempt-no": 1,
                "certificate-verified": true,
                "code": 250,
                "description": "",
                "message": "OK",
                "mx-host": "host",
                "session-seconds": 1.5093050003051758,
                "tls": true
            },
            "envelope": {
                "sender": "[email protected]",
                "sending-ip": "ip",
                "targets": "[email protected]",
                "transport": "smtp"
            },
            "event": "delivered",
            "flags": {
                "is-authenticated": true,
                "is-routed": false,
                "is-system-test": false,
                "is-test-mode": false
            },
            "id": "id",
            "log-level": "info",
            "message": {
                "attachments": [],
                "headers": {
                    "from": "NAME <[email protected]>",
                    "message-id": "[email protected]",
                    "subject": "Client due diligence information has been submitted by one of your customers.",
                    "to": "[email protected]"
                },
                "size": 1990
            },
            "recipient": "[email protected]",
            "recipient-domain": "domain.com",
            "storage": {
                "key": "key",
                "url": "https://storage.eu.mailgun.net/v3/domains/domain/messages/id"
            },
            "tags": [],
            "timestamp": 1645603109.434181,
            "user-variables": {}
        },
        {
            "envelope": {
                "sender": "[email protected]",
                "targets": "[email protected]",
                "transport": "smtp"
            },
            "event": "accepted",
            "flags": {
                "is-authenticated": true,
                "is-test-mode": false
            },
            "id": "id",
            "log-level": "info",
            "message": {
                "headers": {
                    "from": "NAME <[email protected]>",
                    "message-id": "[email protected]",
                    "subject": "Client due diligence information has been submitted by one of your customers.",
                    "to": "[email protected]"
                },
                "size": 1990
            },
            "method": "HTTP",
            "recipient": "[email protected]",
            "recipient-domain": "domain",
            "storage": {
                "key": "key",
                "url": "https://storage.eu.mailgun.net/v3/domains/domain/messages/key"
            },
            "tags": null,
            "timestamp": 1645603107.282775,
            "user-variables": {}
        },

Here the timestamp is not human-readable.

I need to set up an AWS Lambda Python script that is triggered periodically, calls the Mailgun API, and sends the logs to CloudWatch. I am familiar with the setup but not with the script.

Now I need to filter the API data dynamically so that only the last hour is returned.

From my research, this can be achieved with the pandas library, but I couldn't find a proper way to get the logs for a dynamic time range periodically.

I went through many docs on this but could not find a proper answer, and Python is completely new to me.

Can anyone please guide me on how I can get the logs for a dynamic time range, such as the last N hours?

CodePudding user response:

According to the Mailgun documentation, you can specify a time range in the request, so the result can already be filtered on the server side using the begin and end parameters.

After that, you can use pd.json_normalize to flatten the nested JSON response into a table.
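
As a minimal sketch of the server-side filtering described above, the snippet below builds the begin and end query parameters for the last N hours. The helper name build_time_window_params is hypothetical, and the assumption that begin and end accept Unix epoch seconds should be checked against the Mailgun events documentation. The request itself is shown only as a comment, reusing the placeholder URL and credentials from the question.

```python
import time


def build_time_window_params(hours, now=None):
    """Return query parameters covering the last `hours` hours.

    Assumption: the API accepts `begin` and `end` as Unix epoch seconds.
    """
    end = time.time() if now is None else now
    begin = end - hours * 3600
    return {"begin": begin, "end": end}


# Parameters for the last hour, computed at call time
params = build_time_window_params(1)

# Hypothetical usage with the request from the question:
# resp = requests.get("https://api.eu.mailgun.net/v3/domain/events",
#                     auth=("api", "key-api"), params=params)
```

Because the window is computed each time the function runs, a periodically triggered job would always ask for the most recent hour.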

CodePudding user response:

In addition to what @Corralien said about the documentation, which I personally prefer, you can use a pure Python approach to select the last hour of data with a list comprehension. In the code below, I assume the API response has been stored as data, which should be a dictionary:

from time import time

# Unix timestamp (in seconds) for one hour ago
last_hour = time() - 3600
# keep only the events newer than that
recent = [x for x in data["items"] if x["timestamp"] > last_hour]

This keeps only the items whose timestamp is later than one hour ago (time() - 3600).
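
The same idea can be wrapped in a small function so that the window length is a parameter, and the epoch timestamps can be converted to something readable at the same time. This is only a sketch: filter_recent and readable are hypothetical names, the data dictionary is a cut-down stand-in for resp.json() using the two timestamps from the sample response, and the now argument exists only to make the example reproducible.

```python
from datetime import datetime, timezone
from time import time


def filter_recent(items, seconds, now=None):
    """Keep events whose epoch timestamp falls within the last `seconds`."""
    current = time() if now is None else now
    cutoff = current - seconds
    return [x for x in items if x["timestamp"] > cutoff]


def readable(timestamp):
    """Format an epoch timestamp (seconds) as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


# Stand-in for resp.json(), using the two timestamps from the sample response
data = {"items": [{"event": "delivered", "timestamp": 1645603109.434181},
                  {"event": "accepted", "timestamp": 1645603107.282775}]}

# `now` is pinned here only so the example is reproducible;
# omit it to measure the window from the current time
for item in filter_recent(data["items"], 3600, now=1645603200.0):
    print(item["event"], readable(item["timestamp"]))
```

In a scheduled job, calling filter_recent(data["items"], N * 3600) without now would select the last N hours relative to each run.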

CodePudding user response:

In addition to the above answers, to filter between two dates and times using only Python, you could use datetime. Here it is with the same list comprehension as @Amirhossein Kiani:

import datetime

start = datetime.datetime(year, month, day, hour, minute, second).timestamp()
stop = datetime.datetime(year, month, day, hour, minute, second).timestamp()

filtered = [x for x in data["items"] if start < x["timestamp"] < stop]

For a one-hour window, you could also use timedelta:

start = (datetime.datetime.now() - datetime.timedelta(hours=1)).timestamp()
stop = datetime.datetime.now().timestamp()
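
One detail worth keeping in mind: calling .timestamp() on a naive datetime interprets it in the machine's local time zone. The sketch below uses timezone-aware UTC bounds instead, so the result does not depend on where the code runs. The function name in_window, the fixed example dates, and the second (older) event are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def in_window(items, start, stop):
    """Return events whose epoch timestamp lies strictly between two aware datetimes."""
    lo, hi = start.timestamp(), stop.timestamp()
    return [x for x in items if lo < x["timestamp"] < hi]


# Illustrative one-hour window ending at a fixed UTC instant
stop = datetime(2022, 2, 23, 8, 0, 0, tzinfo=timezone.utc)
start = stop - timedelta(hours=1)

items = [{"timestamp": 1645603109.434181},  # from the sample response
         {"timestamp": 1645596000.0}]       # made-up older event
selected = in_window(items, start, stop)
```

For a rolling window, stop could be replaced with datetime.now(timezone.utc).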