I am getting a JSON file from S3 using boto3's get_object. I need to read the file's contents, loop through the array of objects, and handle one object at a time. When I loop through, I get one character per iteration.
import json
import boto3

s3 = boto3.client('s3')
session = boto3.Session()

def lambda_handler(event, context):
    bucket = event["bucket"]
    key = event["key"]
    data = s3.get_object(Bucket=bucket, Key=key)
    contents = data['Body'].read()
    test = contents.decode("utf-8")
    # convert contents to native python string representing json object
    s3_string = json.dumps(contents.decode("utf-8"))
    # return dict
    s3_dict = json.loads(s3_string)
    # this seems to output valid json
    # print(str(s3_dict))
    for item in s3_dict:
        print(item)
The JSON in the file is formatted as follows:
[
    {
        "location": "123 Road Dr",
        "city_state": "MyCity ST",
        "phone": "555-555-5555",
        "distance": "1"
    },
    {
        "location": "456 Avenue Crt",
        "city_state": "MyTown AL",
        "phone": "555-867-5309",
        "distance": "0"
    }
]
This is what I get (one character per iteration)...
- [
- {
- " ...
This is what I need (in JSON format)...
1st loop
{
    "location": "123 Road Dr",
    "city_state": "MyCity ST",
    "phone": "555-555-5555",
    "distance": "1"
}
2nd loop
{
    "location": "456 Avenue Crt",
    "city_state": "MyTown AL",
    "phone": "555-867-5309",
    "distance": "0"
}
Can someone tell me where I'm going wrong?
Thanks in advance.
CodePudding user response:
This was the working solution.
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = event["bucket"]
    key = event["key"]
    data = s3.get_object(Bucket=bucket, Key=key)
    contents = data['Body'].read()
    # decode the raw bytes into a Python str holding the JSON text
    s3_string = contents.decode("utf-8")
    # check the "type" of s3_string - in this case it is <class 'str'>
    print("s3_string is " + str(type(s3_string)))
    # parse the JSON text; the top-level array becomes a Python list
    s3_list = json.loads(s3_string)
    # check the "type" of s3_list - in this case it is <class 'list'>
    print("s3_list is " + str(type(s3_list)))
    # print each object from the array in the original file as valid JSON
    for item in s3_list:
        print(json.dumps(item))
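As a possible shortening of the handler above: json.loads accepts bytes as well as str (Python 3.6+), so the explicit decode step can probably be skipped. The sketch below is only an illustration under that assumption. The helper name iter_records is hypothetical, and io.BytesIO stands in for data['Body'] so the example runs without AWS access.

```python
import io
import json

def iter_records(body):
    """Yield each object from a JSON array read from a file-like body."""
    # json.loads accepts bytes directly (Python 3.6+), so no explicit decode is needed.
    records = json.loads(body.read())
    # Fail loudly if the top-level JSON value is not an array.
    if not isinstance(records, list):
        raise TypeError(f"expected a JSON array, got {type(records).__name__}")
    yield from records

# io.BytesIO is an assumed stand-in for the S3 response body.
fake_body = io.BytesIO(b'[{"location": "123 Road Dr"}, {"location": "456 Avenue Crt"}]')
items = list(iter_records(fake_body))
for item in items:
    print(json.dumps(item))
```

The type check is optional, but it turns a wrong assumption about the file's shape into an immediate error instead of a confusing loop.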
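I had assumed I was getting a Python dict, thinking that was the default behavior of json.loads. In fact, json.loads returns whatever Python type matches the top-level JSON value, and since the file holds an array, I get a list. The one-character-per-iteration output came from calling json.dumps on the already-decoded text: that wrapped it in another layer of JSON string encoding, so json.loads handed back a plain str, and iterating over a str yields one character at a time. This question explains the list behavior: "Json file gets load as list instead of dict. using python".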
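To see the difference concretely, here is a minimal sketch that reproduces both behaviors without S3. The raw string is a shortened stand-in for the decoded file contents, and the variable names are my own, chosen for illustration.

```python
import json

# Stand-in for the text decoded from the S3 object body (shortened from the file above).
raw = '[{"location": "123 Road Dr", "distance": "1"}, {"location": "456 Avenue Crt", "distance": "0"}]'

# Question's approach: dumps() wraps the text in one more layer of JSON string
# encoding, and loads() removes only that layer, so the result is the original str.
round_trip = json.loads(json.dumps(raw))
print(type(round_trip).__name__)  # str
print(list(round_trip)[:3])       # the first three characters, one per element

# Answer's approach: parse the text once; a top-level JSON array becomes a list of dicts.
records = json.loads(raw)
print(type(records).__name__)     # list
for record in records:
    print(json.dumps(record))
```

Printing the type before looping, as the working solution does, is a quick way to catch this kind of mismatch.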