complete_list = []
for item in soup2.findAll("div", {"class": "body"}):
    sub_item = soup2.find('ul',class_="home").find_all('li')
    for x in sub_item:
        for x in soup2.findAll("b"): x.decompose() # removes the b tag
        try:
            a = soup2.find('ul',class_="list m-b-0").find_all('li')[0].text
            b = soup2.find('ul',class_="list m-b-0").find_all('li')[1].text
            c = soup2.find('ul',class_="list m-b-0").find_all('li')[2].text
            d = soup2.find('ul',class_="list m-b-0").find_all('li')[3].text
            e = soup2.find('ul',class_="list m-b-0").find_all('li')[4].text
            f = soup2.find('ul',class_="list m-b-0").find_all('li')[5].text
            g = soup2.find('ul',class_="list m-b-0").find_all('li')[6].text
            h = soup2.find('ul',class_="list m-b-0").find_all('li')[7].text
        except IndexError:
            print('No Data')
        complete_list.append((a,b,c,d,e,f,g,h))
        df = pd.DataFrame(complete_list, columns=['a','b','c','d','e','f','g','h'])
        df.to_csv('doublesix_.csv', index=False, encoding='utf-8')
next_page = soup.select_one('li.page-item.next>a')
if next_page:
    next_url = next_page.get('href')
    url = urljoin(url, next_url)
else:
    break
Hi, guys. I'm very tired of hunting for the issue. I want to scrape the data and save it to a CSV file, but my problem is that I'm getting only the first line, and the rest is missing.
How do I save all the data?
I just added the latest code I have, but I'm still not getting the right data in the CSV file.
UPDATED: SOLVED
CodePudding user response:
value = first_name # list of name, degree, score
nme = value
value and nme don't seem necessary at all (why are you doing this?). Also, it's not the best idea to name variables dict, list, etc., since those names already mean something in Python.
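To illustrate the naming point, here is a minimal sketch of what can go wrong when a variable is called list. The function and variable names are made up for the example and are not from the question.

```python
names = ("Ann", "Bob")  # made-up sample data


def shadowed():
    list = ["Ann", "Bob"]  # local name hides the built-in list type
    try:
        return list(names)  # now tries to call a list object, which fails
    except TypeError as exc:
        return f"TypeError: {exc}"


def unshadowed():
    names_list = ["Ann", "Bob"]  # descriptive name, built-in stays usable
    return list(names) == names_list


shadowed_result = shadowed()
unshadowed_result = unshadowed()
print(shadowed_result)
print(unshadowed_result)
```

Inside shadowed(), the built-in is unreachable under its usual name, so any later call to list(...) in that scope raises a TypeError.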
"but my problem is I'm getting only the first line"
That is actually surprising. I'd expect only the last line to be saved, since you are writing to the CSV inside the loop and not in append mode [with mode='a', like .to_csv('acs.csv', mode='a')], so the file is overwritten every time.
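As a small sketch of the difference between the default write mode and append mode, the snippet below writes three made-up rows to two temporary files from inside a loop. The file names and sample rows are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

rows = [("Ann", 1), ("Bob", 2), ("Cy", 3)]  # made-up sample rows
tmp_dir = tempfile.mkdtemp()
overwrite_path = os.path.join(tmp_dir, "overwrite.csv")
append_path = os.path.join(tmp_dir, "append.csv")

for i, row in enumerate(rows):
    df = pd.DataFrame([row], columns=["name", "score"])
    # Default mode='w': each call replaces the whole file.
    df.to_csv(overwrite_path, index=False)
    # mode='a': each call adds to the file; write the header only once.
    df.to_csv(append_path, index=False, mode="a", header=(i == 0))

overwritten = pd.read_csv(overwrite_path)
appended = pd.read_csv(append_path)
print(len(overwritten), len(appended))
```

The overwritten file ends up holding only the last row, while the appended file holds all three. Note that append mode keeps adding to the file across runs, so the file would need to be deleted before re-running a real script.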
Anyway, you could still try saving it outside the loop:
import pandas as pd

namesList = [] # initiate list of names
for item in soup2.findAll("div", {"class": "box-body light-b"}):
    sub_item = soup2.find('ul',class_="list m-b-0").find_all('li')
    for x in sub_item:
        for b in soup2.findAll("b"): b.decompose() # removes the b tag (renamed so it doesn't shadow x)
        try: first_name = soup2.find('ul',class_="list m-b-0").find_all('li')[0].text
        except IndexError:
            print('No Data')
            first_name = None # otherwise you'll be repeating the previous first_name
        namesList.append(first_name) # add to list of names

# build the dataframe once, after the loop; pass the list itself, not [namesList],
# otherwise the whole list ends up in a single cell
df = pd.DataFrame({'name': namesList})
# saving the dataframe
df.to_csv('acs.csv')
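One more thing that may be worth checking: soup2.find(...) always returns the first match in the whole page, so every pass of the loop can end up reading the same ul. Below is a self-contained sketch that searches inside each item instead and builds the dataframe once at the end. The HTML is made up to stand in for the real page, and it assumes each ul sits inside its own div, which I can't verify from the question.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div class="box-body light-b">
  <ul class="list m-b-0"><li><b>Name:</b> Ann</li><li><b>Score:</b> 1</li></ul>
</div>
<div class="box-body light-b">
  <ul class="list m-b-0"><li><b>Name:</b> Bob</li><li><b>Score:</b> 2</li></ul>
</div>
<div class="box-body light-b"></div>
"""
soup2 = BeautifulSoup(html, "html.parser")

for b in soup2.find_all("b"):
    b.decompose()  # drop the bold labels once, up front

names_list = []
for item in soup2.find_all("div", class_="box-body light-b"):
    ul = item.find("ul", class_="list m-b-0")  # search inside this item only
    items = ul.find_all("li") if ul else []
    # None marks an item with no data instead of repeating the previous name
    names_list.append(items[0].get_text(strip=True) if items else None)

df = pd.DataFrame({"name": names_list})  # one row per item
csv_text = df.to_csv(index=False)  # returns the CSV as a string when no path is given
print(csv_text)
```

With a real page you would pass a file name to to_csv instead of printing the string, still after the loop has finished.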