I have 4 XML files under a prefix in an S3 bucket. When I try to read the content of all the files, only the content of the last file (XML4) ends up stored.
import boto3

s3 = boto3.resource('s3')
s3_bucket_name = 'test'
bucket = s3.Bucket(s3_bucket_name)
bucket_list = []
for file in bucket.objects.filter(Prefix='auto'):
    file_name = file.key
    # Keep only keys that contain ".xml"
    if file_name.find(".xml") != -1:
        bucket_list.append(file.key)
In 'bucket_list', I can see that there are 4 files.
import xml.etree.ElementTree as ET

for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()
    tree = ET.ElementTree(ET.fromstring(data))
What changes should be made in the code to read the content of all the XML files?
CodePudding user response:
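Your loop assigns a new value to tree on every iteration, so once it finishes only the tree for the last file is left. Since you have a list of files, you need a corresponding list of trees.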
tree_list = []
for file in bucket_list:
    obj = s3.Object(s3_bucket_name, file)
    data = obj.get()['Body'].read()
    tree_list.append(ET.ElementTree(ET.fromstring(data)))
Then you can start using tree_list for whatever purpose.
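If you also need to know which tree came from which file, one option is a dict keyed by the S3 key instead of a plain list. The following is only a sketch: parse_xml_documents is a hypothetical helper name, and the sample payloads are in-memory stand-ins so the snippet runs without S3 access. With boto3, each value would be the bytes returned by obj.get()['Body'].read() as in the loop above.

```python
import xml.etree.ElementTree as ET


def parse_xml_documents(documents):
    """Parse a {key: raw XML bytes} mapping into {key: ElementTree}.

    Hypothetical helper for illustration; not part of boto3 or ElementTree.
    """
    trees = {}
    for key, data in documents.items():
        # Store each tree under its own key so nothing is overwritten.
        trees[key] = ET.ElementTree(ET.fromstring(data))
    return trees


# In-memory stand-ins for the S3 payloads (assumed sample data). With boto3,
# each value would come from s3.Object(s3_bucket_name, key).get()['Body'].read().
sample_documents = {
    "auto/XML1.xml": b"<root><name>first</name></root>",
    "auto/XML2.xml": b"<root><name>second</name></root>",
}

trees_by_key = parse_xml_documents(sample_documents)
for key, tree in trees_by_key.items():
    print(key, tree.getroot().find("name").text)
```

Keeping the key next to each tree makes it easier to report which file a parse error or a missing element belongs to.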