Selenium data scraping issue: improper data scraped

Time:11-29

I am trying to scrape data from:- op1


You could also get a separate column for each tablet-count option if you remove

            pDet[pvrStrength] = ', '.join([
                pvOp.get_text(' ').strip() for pvOp in pvRow.select(opSel)
            ]) 

and replace it with this loop:

            for pvoi, pvOp in enumerate(pvRow.select(opSel)):  
                pvoTxt = pvOp.get_text(' ').strip()
                tabletCt = pvoTxt.split(' - ')[0]
                pvoPrice = pvoTxt.split(' - ')[-1]
                if not tabletCt.endswith(' tablets'): 
                    tabletCt = f'[option {pvoi + 1}]'
                    pvoPrice = pvoTxt
                
                pDet[f'{pvrStrength} - {tabletCt}'] = pvoPrice 

| index | Abilify (Aripiprazole) | Generic Equivalent - Abilify (Aripiprazole) | Generic Equivalent - Accolate (Zafirlukast) | Abilify ODT (Aripiprazole) | Generic Equivalent - Abilify ODT (Aripiprazole) |
|:---|:---|:---|:---|:---|:---|
| product_endpt | abilify-tablet | abilify-tablet | accolate | abilify-mt | abilify-mt |
| brand_or_generic | Brand | Generic | Generic | Brand | Generic |
| rx_requirement | Prescription Required | NaN | NaN | Prescription Required | NaN |
| 2mg - 30 tablets | $219.99 | NaN | NaN | NaN | NaN |
| 2mg - 90 tablets | $526.99 | NaN | NaN | NaN | NaN |
| 5mg - 28 tablets | $160.99 | NaN | NaN | NaN | NaN |
| 5mg - 84 tablets | $459.99 | NaN | NaN | NaN | NaN |
| 10mg - 28 tablets | $116.99 | NaN | NaN | NaN | NaN |
| 10mg - 84 tablets | $162.99 | NaN | NaN | NaN | NaN |
| 15mg - 28 tablets | $159.99 | NaN | NaN | NaN | NaN |
| 15mg - 84 tablets | $198.99 | NaN | NaN | NaN | NaN |
| 20mg - 90 tablets | $745.99 | $67.99 | NaN | NaN | NaN |
| 30mg - 28 tablets | $104.99 | NaN | NaN | NaN | NaN |
| 30mg - 84 tablets | $289.99 | $75.99 | NaN | NaN | NaN |
| 1mg/ml Solution - [option 1] | 150 ml - $239.99 | NaN | NaN | NaN | NaN |
| 2mg - 100 tablets | NaN | $98.99 | NaN | NaN | NaN |
| 5mg - 100 tablets | NaN | $43.99 | NaN | NaN | NaN |
| 10mg - 90 tablets | NaN | $38.59 | NaN | NaN | NaN |
| 15mg - 90 tablets | NaN | $56.59 | NaN | NaN | NaN |
| 10mg - 60 tablets | NaN | NaN | $109.00 | NaN | NaN |
| 20mg - 60 tablets | NaN | NaN | $109.00 | NaN | NaN |
| 10mg ODT - 84 tablets | NaN | NaN | NaN | $499.99 | NaN |
| 15mg ODT - 84 tablets | NaN | NaN | NaN | $499.99 | NaN |
| 5mg ODT - 90 tablets | NaN | NaN | NaN | NaN | $59.00 |
| 20mg ODT - 90 tablets | NaN | NaN | NaN | NaN | $89.00 |
| 30mg ODT - 150 tablets | NaN | NaN | NaN | NaN | $129.99 |
| source_url | https://www.canadapharmacy.com/products/abilify-tablet | https://www.canadapharmacy.com/products/abilify-tablet | https://www.canadapharmacy.com/products/accolate | https://www.canadapharmacy.com/products/abilify-mt | https://www.canadapharmacy.com/products/abilify-mt |

(I transposed the table since there were so many columns and so few rows. The table markdown can be copied from the output of print(pricesDf.T.to_markdown()).)
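
If you want to check the option-splitting logic from the loop without hitting the site, here is a minimal standalone sketch. The helper names (split_option, add_option_columns) are hypothetical and not part of the original code, and the sample option strings are an assumption about what the option text looks like ("<count> tablets - <price>"), guessed from the table above.

```python
def split_option(option_text, index):
    """Split one option string into a (label, price) pair.

    Assumes text shaped like '30 tablets - $219.99'; anything whose first
    part does not end in ' tablets' gets a numbered placeholder label.
    """
    text = option_text.strip()
    parts = text.split(' - ')
    label, price = parts[0], parts[-1]
    if not label.endswith(' tablets'):
        # Not a tablet count (e.g. a liquid): keep the whole text as the value
        label = f'[option {index + 1}]'
        price = text
    return label, price


def add_option_columns(details, strength, option_texts):
    """Add one '<strength> - <label>' key per option to the details dict."""
    for i, option_text in enumerate(option_texts):
        label, price = split_option(option_text, i)
        details[f'{strength} - {label}'] = price
    return details


# Sample strings are made up to match the assumed format
row = add_option_columns({}, '2mg', ['30 tablets - $219.99', '90 tablets - $526.99'])
liquid = add_option_columns({}, '1mg/ml Solution', ['150 ml - $239.99'])
print(row)
print(liquid)
```

Because each option becomes its own key, products with different strengths or pack sizes end up with different keys, which is why most cells in the combined table are NaN.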
