Python: serialize/format complex text to a JSON object

Time:05-03

I have this message in a request.

event= [
     {'payload': 
        {
         'key': 'eyJ', 
         'value': '{ApproximateReceiveCount=1, SentTimestamp=193, SequenceNumber=100, MessageGroupId=1d4, SenderId=ARO:5c9, MessageDeduplicationId=c29e, Body={"token":"cb206","method":"P","methodName":"SP","reference":"814","bk":"ST","status":4,"pmt":{"token":"c182","items":[{"number":"1","description":"desc"}],"amount":{"amount":1,"currency":"4"}},"createdAt":"2022-05-01T15:00:10-05:00","updatedAt":"2022-05-01T15:00:10-05:00","expiresAt":"2022-05-01T00:00:00-05:00"}, ApproximateFirstReceiveTimestamp=12}',
         'timestamp': 13, 
         'topic': 'X-hub', 
         'partition': 1, 
         'offset': 0, 
         'headers': 
            [
                {
                 'key': 'ptj', 
                 'value': 'c3'
                 }, 
                 {
                 'key': 'tk', 
                 'value': '62'
                 }, 
                 {
                'key': 'pp', 
                'value': 'pc'
                }, 
                {
                'key': 'ps', 
                'value': 'completed'
                }, 
                {
                'key': 'pt', 
                'value': 'pv'
                }
            ]
        }
    }
]

I can get event[0]["payload"]["value"], but I only need to serialize the "Body={...}" part.

My problem is that "value" is a single string containing attributes like "ApproximateReceiveCount=1, ..., Body={...}".

How can I solve this without using replace or manually extracting the text?

CodePudding user response:

This is an interesting question. The value string as a whole is not valid JSON, but the Body part is, so you can isolate it by counting the opening and closing curly brackets.

Split the value string on "Body=", then loop over the remaining characters, increasing a counter on every opening curly bracket and decreasing it on every closing one. Once the counter returns to zero, you have reached the closing bracket that ends the JSON string. This assumes the string values inside Body contain no curly brackets of their own.

Note: in the code below, the value of event[0]["payload"]["value"] is stored in the string variable named value.

import json
value = '''{ApproximateReceiveCount=1, SentTimestamp=193, SequenceNumber=100, MessageGroupId=1d4, SenderId=ARO:5c9, MessageDeduplicationId=c29e, Body={"token":"cb206","method":"P","methodName":"SP","reference":"814","bk":"ST","status":4,"pmt":{"token":"c182","items":[{"number":"1","description":"desc"}],"amount":{"amount":1,"currency":"4"}},"createdAt":"2022-05-01T15:00:10-05:00","updatedAt":"2022-05-01T15:00:10-05:00","expiresAt":"2022-05-01T00:00:00-05:00"}, ApproximateFirstReceiveTimestamp=12}'''

def get_body_json(input_string: str) -> dict:
    counter = 0
    result = ''
    # split the string on the "Body=" and iterate over the remaining characters in it
    for c in input_string.split("Body=")[1]:
        if c == '{':
            # increase the counter when we get an opening curly bracket
            counter += 1
        elif c == '}':
            # decrease the counter when we get a closing curly bracket
            counter -= 1
        # add the character to the result string
        result += c
        # if the counter hits zero, we know we got the last closing bracket, so that's
        # the end of the Body part of the value
        if counter == 0:
            break
    # parse the extracted JSON text and return it as a dict
    return json.loads(result)


print(json.dumps(get_body_json(value), indent=4))

Output:

{
    "token": "cb206",
    "method": "P",
    "methodName": "SP",
    "reference": "814",
    "bk": "ST",
    "status": 4,
    "pmt": {
        "token": "c182",
        "items": [
            {
                "number": "1",
                "description": "desc"
            }
        ],
        "amount": {
            "amount": 1,
            "currency": "4"
        }
    },
    "createdAt": "2022-05-01T15:00:10-05:00",
    "updatedAt": "2022-05-01T15:00:10-05:00",
    "expiresAt": "2022-05-01T00:00:00-05:00"
}
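
As an alternative to counting brackets by hand, the standard library's json.JSONDecoder.raw_decode can parse a single JSON document starting at a given index and simply stop where that document ends. The sketch below assumes the string always contains a "Body=" marker followed directly by a JSON object; extract_body and the sample string are hypothetical names made up for illustration, not part of the original message. One possible advantage is that a curly bracket inside a quoted string does not throw it off, which a plain bracket counter would not handle.

```python
import json


def extract_body(value: str, marker: str = "Body=") -> dict:
    """Parse the JSON object that directly follows `marker` in `value`.

    Hypothetical helper: assumes the JSON starts right after the marker.
    """
    start = value.find(marker)
    if start == -1:
        raise ValueError(f"{marker!r} not found in input")
    # raw_decode parses one JSON document beginning at the given index and
    # ignores whatever text comes after it; it returns (object, end_index).
    body, _end = json.JSONDecoder().raw_decode(value, start + len(marker))
    return body


# Made-up sample with a closing bracket inside a quoted string.
sample = 'A=1, Body={"note":"a } inside a string","n":{"k":[1,2]}}, Z=2}'
print(json.dumps(extract_body(sample), indent=4))
```

With the original message, the call would be extract_body(event[0]["payload"]["value"]).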