How to parse JSON that starts with b'"

I get the following string when I read an S3 object as JSON:

json_object = s3_client.get_object(Bucket=bucket,Key=json_file_name)
print(json_object)
jsonFileReader = json_object['Body'].read()
jsonDict = json.loads(jsonFileReader)

The data looks like this when I print it:

raw = b'"[{\\"timestamp\\": \\"2022-07-27T12:34:52.304000+00:00\\", \\"type\\": \\"ExecutionSucceeded\\", \\"id\\": 9, \\"previousEventId\\": 8, \\"executionSucceededEventDetails\\": {\\"output\\": \\"{\\\\\\"statusCode\\\\\\":200,\\\\\\"body\\\\\\":\\\\\\"\\\\\\\\\\\\\\"Hello from Lambda!\\\\\\\\\\\\\\"\\\\\\"}\\", \\"outputDetails\\": {\\"truncated\\": false}}}]"'

I want to extract the type field from it.

But when I try, I get this error:

    status=data[0]['type']
TypeError: string indices must be integers

My code:

import json

raw = b'"[{\\"timestamp\\": \\"2022-07-27T12:34:52.304000+00:00\\", \\"type\\": \\"ExecutionSucceeded\\", \\"id\\": 9, \\"previousEventId\\": 8, \\"executionSucceededEventDetails\\": {\\"output\\": \\"{\\\\\\"statusCode\\\\\\":200,\\\\\\"body\\\\\\":\\\\\\"\\\\\\\\\\\\\\"Hello from Lambda!\\\\\\\\\\\\\\"\\\\\\"}\\", \\"outputDetails\\": {\\"truncated\\": false}}}]"'

data = json.loads(raw.decode('utf-8'))

print(data)
print(data[0])
status=data[0]['type']
print(status)

Answer:

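After decoding, raw is JSON text for a string, not for a list or object (note that the first and last characters inside the bytes literal are double quotes ").

When you do data = json.loads(raw.decode('utf-8')), you get type(data) == str: data is the string '[{"timestamp": "2022-...', which is itself JSON text. Indexing it with data[0] returns the single character '[', and indexing that character with ['type'] raises the TypeError from the question.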
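
To see this behaviour in isolation, here is a minimal sketch. It does not use the original S3 object: the payload is a small stand-in, built on the assumption that the list was serialized twice before upload.

```python
import json

# Stand-in payload (an assumption, not the real S3 body): a list serialized twice.
raw = json.dumps(json.dumps([{"type": "ExecutionSucceeded", "id": 9}])).encode("utf-8")

# One json.loads only removes the outer layer, leaving a str of JSON text.
first = json.loads(raw.decode("utf-8"))
print(type(first).__name__)
print(repr(first[0]))  # the first character of the string, not the first event

# Indexing that single character with a key fails like it does in the question.
raised = False
try:
    first[0]["type"]
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```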

To deserialize this string, json.loads it again:

data = json.loads(data)

And now use it:

print(data[0]['type'])
# prints ExecutionSucceeded
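
If the number of encoding layers is not known in advance, one option is to keep calling json.loads while the result is still a string. The sketch below assumes every layer is valid JSON. loads_until_decoded and its max_depth parameter are hypothetical names, and the payload is a stand-in for the S3 body rather than the original data.

```python
import json

def loads_until_decoded(payload, max_depth=5):
    """Hypothetical helper: peel off JSON string layers until a non-str value appears."""
    if isinstance(payload, (bytes, bytearray)):
        payload = payload.decode("utf-8")
    for _ in range(max_depth):
        if not isinstance(payload, str):
            break
        try:
            payload = json.loads(payload)
        except json.JSONDecodeError:
            break  # plain text rather than another JSON layer
    return payload

# Stand-in for the S3 body: a list of events serialized twice, with an
# "output" field that is itself a JSON string (as in the question's data).
event = {"type": "ExecutionSucceeded", "id": 9,
         "output": json.dumps({"statusCode": 200})}
raw = json.dumps(json.dumps([event])).encode("utf-8")

data = loads_until_decoded(raw)
print(data[0]["type"])

# The nested field needs the same treatment before it can be indexed by key.
output = loads_until_decoded(data[0]["output"])
print(output["statusCode"])
```

Note that this also converts a string such as "123" into the integer 123, so it only suits values known to be JSON-encoded. If you control the code that writes the object to S3, removing the extra serialization there is likely the cleaner fix.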