How to replace date string and parse xml data in python using elementtree?

Time:10-07

I have XML data located at the link below. The report is updated every day, so the only part of the link that changes is the date. I would like to achieve the things listed below.

Note: The link and XML data below are for demo purposes only.

XML Link:

https://www.mywesbite.com/report/%5B10/01/2021%22%7D

XML Data

b'<?xml version=\'1.0\' encoding=\'UTF-8\'?><results exportTime="Date time"><report label="Report Title"><record report_date="10/01/2021" narrative="null"><report label="report label"></report>

Things needed: Replace the date in the link above with today's date and check whether it matches report_date. If report_date matches today's date, parse today's report; otherwise parse yesterday's report. If today's report is not available at all, parse yesterday's report as well.

I have the code below for replacing the date. However, I am not sure how to achieve the rest of what is listed under Things needed. I am a novice Python programmer.

Thanks in advance for your time!

from datetime import date, timedelta

today1 = date.today().strftime("%m/%d/%Y")  # today's date as MM/DD/YYYY
yesterday = date.today() - timedelta(days=1)
yesterday = yesterday.strftime("%m/%d/%Y")  # yesterday's date as MM/DD/YYYY

url_to_replace = "https://www.mywesbite.com/report/%5B10/01/2021%22%7D"
# the search string must match the date in the URL exactly, including the leading zero
url_to_replace = url_to_replace.replace("10/01/2021", yesterday)

CodePudding user response:

You don't have to replace anything: you can call strftime on the full URL, with %m/%d/%Y placed directly in it.

Because % has a special meaning in strftime, the literal % signs in the URL must be doubled: %%5B and %%22%%7D.

from datetime import datetime, timedelta

today = datetime.now()
yesterday = today - timedelta(days=1)

url = yesterday.strftime("https://www.mywesbite.com/report/%%5B%m/%d/%Y%%22%%7D")

print('    today:', today.strftime('%m/%d/%Y'))
print('yesterday:', yesterday.strftime('%m/%d/%Y')) 
print('      url:', url)

Result:

    today: 10/06/2021
yesterday: 10/05/2021
      url: https://www.mywesbite.com/report/%5B10/05/2021%22%7D

Alternatively, you could use an f-string to put the date in the right place:

yesterday_str = yesterday.strftime("%m/%d/%Y")
url = f"https://www.mywesbite.com/report/%5B{yesterday_str}%22%7D"
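
Either approach can be wrapped in a small helper so the same code builds the URL for today and for yesterday. This is only a sketch: it assumes the link wraps the date in the percent-encoded characters %5B and %22%7D, and both REPORT_URL and build_report_url are hypothetical names made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical URL template for the demo site; {day} is filled in below.
REPORT_URL = "https://www.mywesbite.com/report/%5B{day}%22%7D"


def build_report_url(day: date) -> str:
    """Return the report URL for the given day (MM/DD/YYYY in the path)."""
    return REPORT_URL.format(day=day.strftime("%m/%d/%Y"))


today = date.today()
yesterday = today - timedelta(days=1)

print(build_report_url(today))
print(build_report_url(yesterday))
```

Since str.format only treats { and } as special, the % signs in the template do not need to be doubled here.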

Here is example code that gets report_date from the XML:

data = b'''<?xml version=\'1.0\' encoding=\'UTF-8\'?>
<results exportTime="Date time">
<report label="Report Title">
<record report_date="10/01/2021" narrative="null">
<report label="report label">
</report>
</record>
</report>
</results>
'''

from xml.etree import ElementTree as ET
from datetime import datetime, timedelta

root = ET.fromstring(data)

record = root.find('.//record[@report_date]')

# compare with None explicitly: an Element without children is falsy
report_date = None
if record is not None:
    report_date = record.attrib['report_date']

today_str = datetime.now().strftime('%m/%d/%Y')

print('report_date:', report_date)
print('      today:', today_str)
print('   the same:', report_date == today_str)

if report_date == today_str:
    print("parse today's report")
else:
    print("parse yesterday's report")

Result:

report_date: 10/01/2021
      today: 10/06/2021
   the same: False

parse yesterday's report
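
The pieces can be combined into one decision step that also covers the case where today's report is missing. The following is a minimal sketch, not a definitive implementation: get_report_date and pick_report_day are hypothetical helper names, downloading the XML is left out, and it assumes that a missing report shows up as data that cannot be parsed or that has no report_date attribute.

```python
from datetime import date, timedelta
from typing import Optional
from xml.etree import ElementTree as ET


def get_report_date(xml_bytes: bytes) -> Optional[str]:
    """Return the report_date attribute of the first <record>, or None."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return None  # assumption: unparsable data means the report is missing
    record = root.find('.//record[@report_date]')
    return None if record is None else record.get('report_date')


def pick_report_day(todays_xml: bytes, today: date) -> date:
    """Return today if today's report is present, otherwise yesterday."""
    if get_report_date(todays_xml) == today.strftime('%m/%d/%Y'):
        return today
    return today - timedelta(days=1)


sample = b'<results><report><record report_date="10/01/2021"/></report></results>'

print(pick_report_day(sample, date(2021, 10, 1)))  # 2021-10-01
print(pick_report_day(sample, date(2021, 10, 6)))  # 2021-10-05
print(pick_report_day(b'', date(2021, 10, 6)))     # 2021-10-05
```

The returned date can then be formatted into the URL of the report that should actually be parsed.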