regex Y-M-D extraction from web scraping

Time:11-17

I would like to extract the Y-M-D information from the following HTML.

Created at</th><td><span><time datetime="2001-06-01"
date= [re.search("Created at</th><td><span><time datetime=([0-9A-Za-z\&;]*)", address).group(1)]
date

I have tried this code, but it does not work. Do you have any ideas?

CodePudding user response:

The first argument in re.search should be the pattern and the second the string you want to extract from.

You can start by trying something like:

re.search(r"\d{4}-\d{2}-\d{2}", 'Created at</th><td><span><time datetime="2001-06-01"')

Then call .group() on the returned match object to get the matched text.
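
As a minimal sketch of that suggestion, the snippet below runs the date pattern against the sample HTML from the question and reads the match with group(0). The variable names html and match are only illustrative, and the None check is an added precaution, since re.search returns None when nothing matches.

```python
import re

# Sample HTML fragment taken from the question.
html = 'Created at</th><td><span><time datetime="2001-06-01"'

# A raw string keeps the backslashes in \d from being treated as string escapes.
match = re.search(r"\d{4}-\d{2}-\d{2}", html)

# re.search returns None when the pattern is not found, so check before using it.
if match is not None:
    date = match.group(0)  # group(0) is the whole matched text
else:
    date = None

print(date)
```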

CodePudding user response:

Try using a capturing group to isolate the date portion of the regex pattern.

date = re.search(r'time datetime="(\d{4}-\d{2}-\d{2})"', address)
print(date.groups())

output:

('2001-06-01',)
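
If a date object is more useful than a tuple of strings, the capturing-group approach can be wrapped in a small helper. This is only a sketch: the function name extract_created_date and the constant DATE_RE are hypothetical, the pattern assumes the markup looks exactly like the fragment in the question, and date.fromisoformat needs Python 3.7 or later.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical pattern: anchored on the "Created at" cell so that other
# <time> elements on the page are not picked up by accident.
DATE_RE = re.compile(
    r'Created at</th><td><span><time datetime="(\d{4}-\d{2}-\d{2})"'
)


def extract_created_date(html: str) -> Optional[date]:
    """Return the 'Created at' date, or None if the pattern is not found."""
    match = DATE_RE.search(html)
    if match is None:
        return None
    # group(1) is the text captured by the parentheses in the pattern.
    # fromisoformat raises ValueError for impossible dates such as 2001-13-40.
    return date.fromisoformat(match.group(1))


address = 'Created at</th><td><span><time datetime="2001-06-01"'
created = extract_created_date(address)
print(created)  # prints 2001-06-01
```

For anything beyond a single, predictable fragment like this one, an HTML parser is usually more robust than a regex, because attribute order and whitespace in real pages can vary.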