I want to crawl this website http://www.truellikon.ch/freizeit-kultur/anlaesse-agenda.html . I want to extract date and time of each event. You can see that date is listed above events. In order to extract date and time I need to combine different divs, but the problem is that I do not have 'container' for group of events that are on the same date. So the only thing that I can do is to extract all events that are between two divs that refer to date.
This is the code for extracting the event info:
from bs4 import BeautifulSoup
import requests
domain = 'truellikon.ch'
url = 'http://www.truellikon.ch/freizeit-kultur/anlaesse-agenda.html'
def get_website_news_links_truellikonCh():
response = requests.get(url, allow_redirects=True)
print("Response for", url, response)
soup = BeautifulSoup(response.content, 'html.parser')
all_events = soup.select('div.eventItem')
for i in all_events:
print(i)
print()
input()
x = get_website_news_links_truellikonCh()
Class name for date is 'listThumbnailMonthName'
My question is how can I combine these divs, how can I write the selectors so that I can get exact date and time, title and body of each event
CodePudding user response:
you have one parent container which is #tx_nezzoagenda_list
and then you have to read the children
one by one
import re
from bs4 import BeautifulSoup
import requests
url = 'http://www.truellikon.ch/freizeit-kultur/anlaesse-agenda.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
container = soup.select_one('#tx_nezzoagenda_list')
for child in container.children:
if not child.name:
continue
if 'listThumbnailMonthName' in child.get('class'):
base_date=child.text.strip()
else:
day=child.select_one('.dateDayNumber').text.strip()
title=child.select_one('.titleText').text.strip()
locationDate=child.select_one('.locationDateText').children
time=list(locationDate)[-1].strip()
time=re.sub('\s','', time)
print(title, day, base_date, time)
which outputs
Abendunterhaltung TV Trüllikon 10 Dezember 2021 19:00Uhr-3:00Uhr
Christbaum-Verkauf 18 Dezember 2021 9:30Uhr-11:00Uhr
Silvester Party 31 Dezember 2021 22:00Uhr
Neujahrsapéro 02 Januar 2022 16:00Uhr-18:00Uhr
Senioren-Zmittag 21 Januar 2022 12:00Uhr-15:00Uhr
Theatergruppe "Nume Hüür", Aufführung 23 Januar 2022 13:00Uhr-16:00Uhr
Elektroschrottsammlung 29 Januar 2022 9:00Uhr-12:00Uhr
Senioren Z'mittag 18 Februar 2022 12:00Uhr-15:00Uhr
Frühlingskonzert 10 April 2022 12:17Uhr
Weinländer Musiktag 22 Mai 2022 8:00Uhr
Auffahrtskonzert Altersheim 26 Mai 2022 10:30Uhr
Feierabendmusik und Jubilarenehrung 01 Juli 2022 19:00Uhr
Feierabendmusik 15 Juli 2022 12:24Uhr
Feierabendmusik 19 August 2022 19:00Uhr
Herbstanlass 19 November 2022 20:00Uhr