All the information is getting inside only in one block of excel python

Time:10-01

Hello, I am new to web scraping. I scraped a site, but when I write the result to a CSV file, all of the information ends up in a single cell. I want each item on its own row; a single column is fine, as long as every item is in a different row. Here is the code:

import requests
from bs4 import BeautifulSoup
import pandas as pd
import re
from csv import writer

url = 'https://virtualhs.pwcs.edu/about/faculty'
res = requests.get(url)

soup = BeautifulSoup(res.content, 'lxml')
Title = soup.find('div', id='divContent')
if Title:
    for p in Title.select("p"):
        p.extract()
    for h2 in Title.select("h2"):
        h2.extract()
        Title =Title.text
        print(Title)
        with open('now.csv','w',encoding='utf-8', newline='') as f:
            thewriter = writer(f)
            thewriter.writerow([Title])

CodePudding user response:

It's not clear what you are trying to scrape. Is it the teachers and the department?

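If you open the file with 'w' inside the loop, it is truncated and rewritten on every iteration, so only the last write survives. You could open it with 'a' to append on each iteration instead, but then you should first create or empty the file (for example, by writing a header row); otherwise every run keeps adding to whatever the previous run left behind. Note also that writerow([Title]) writes a single cell, so the whole block of text lands in one cell no matter which mode you use.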
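
If you would rather stay with the csv module, here is a minimal sketch of the row-per-item idea, assuming the scraped text has one item per line. The helper name write_rows and the "entry" header are made up for illustration and are not part of the original code. The file is opened once, outside any loop, and each item gets its own writerow call:

```python
import csv

def write_rows(path, text):
    """Write each non-empty line of `text` as its own single-column CSV row."""
    # Split the scraped text into separate entries, dropping blank lines.
    entries = [line.strip() for line in text.splitlines() if line.strip()]
    # Open the file once, outside any loop, so 'w' truncates it only once.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["entry"])  # header row (hypothetical column name)
        for entry in entries:
            writer.writerow([entry])  # one row per entry
    return len(entries)
```

In the question's code this would be called as write_rows('now.csv', Title.text) once the unwanted tags have been removed, in place of the single writerow([Title]) call.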

Personally, I think it's just easier to construct a dataframe, then write that to file:

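import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://virtualhs.pwcs.edu/about/faculty'
res = requests.get(url)
res.raise_for_status()  # stop early if the page could not be fetched

soup = BeautifulSoup(res.content, 'html.parser')
# Each <h3> is treated as a department heading
departments = soup.find_all('h3')

rows = []
for department in departments:
    # The teachers are assumed to be the <li> items of the first <ul> after the heading
    teacher_list = department.find_next('ul')
    if teacher_list is None:
        continue  # heading with no list after it
    for teacher in teacher_list.find_all('li'):
        row = {
            'teacher': teacher.text.strip(),
            'department': department.text.strip()}
        rows.append(row)

# One row per teacher, written as a two-column CSV
df = pd.DataFrame(rows)
df.to_csv('now.csv', index=False)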