Extract very nested string-text between <font> tags?

Time:11-25

I'm trying to build a list of mineral prices. I managed the first step (it prints the list of minerals from the first page), but I can't get at the price values. I've tried several other methods I found on Stack Overflow (sibling/parent tags, etc.) without success. Also, if I use two 'for' loops, can I later attach one list to the other (name, price)?

Below, after my code, is the fragment I want to reach; it sits between <font> tags. Finding by 'font' didn't work for me. I don't need the "Price:" text, but I do need "€580 / US$598 / ¥84010 / AUD$890".

import requests
from bs4 import BeautifulSoup

URL = "https://www.fabreminerals.com/search_results.php?LANG=EN&SearchTerms=&submit=Buscar&MineralSpeciment=&Country=&Locality=&PriceRange=&checkbox=enventa&First=0"

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36'}

page = requests.get(URL, headers=headers)
print(page.content)

soup = BeautifulSoup(page.content, 'html5lib')
table = soup.find('a', attrs = {'name':'SearchTop'})
print(table.prettify())


for names in table.find_all('img', alt=True):
    print(names['alt'])

print(soup.find_all('font'))

Fragment I want to reach:

<font face="Arial, Helvetica, sans-serif" size="-1">
 <font color="#FF0000">
  Price:
 </font>
 €580 / US$598 / ¥84010 / AUD$890
</font>

Answer:

Try this:

import requests
from bs4 import BeautifulSoup

URL = "https://www.fabreminerals.com/search_results.php?LANG=EN&SearchTerms=&submit=Buscar&MineralSpeciment=&Country=&Locality=&PriceRange=&checkbox=enventa&First=0"

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
}

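# Parse the page with lxml (requires the lxml package to be installed)
soup = BeautifulSoup(requests.get(URL, headers=headers).text, "lxml")

# Take the text of every nested <font> and keep what follows the "Price:" label
prices = [
    p.get_text(strip=True).split("Price:")[-1] for p
    in soup.select("table tr td font font")
]
names = [n.get_text(strip=True) for n in soup.select("table tr td font a")]

# Filter the lists: collapse whitespace in names, skip entries starting with "[",
# and drop empty price strings (e.g. from a <font> holding only the label)
names[:] = [" ".join(n.split()) for n in names if not n.startswith("[")]
prices[:] = [p for p in prices if p]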

# Print the results
for name, price in zip(names, prices):
    print(f"{name}\n{price}")
    print("-" * 50)

Output:

NX51AH2: 'lepidolite' after Elbaite with Elbaite
€580 / US$598 / ¥84010 / AUD$890
--------------------------------------------------
TH27AL9: 'Pearceite' with Calcite
€220 / US$227 / ¥31860 / AUD$330
--------------------------------------------------
TFM69AN5: 'Stilbite'
€450 / US$464 / ¥65180 / AUD$690
--------------------------------------------------
SM90CEX: Acanthite
€90 / US$92 / ¥13030 / AUD$130
--------------------------------------------------
TMA97AN5: Acanthite
€240 / US$247 / ¥34760 / AUD$370
--------------------------------------------------
TB90AE8: Acanthite
€540 / US$557 / ¥78220 / AUD$830
--------------------------------------------------
TZ71AK9: Acanthite
€580 / US$598 / ¥84010 / AUD$890
--------------------------------------------------
EC63G1: Acanthite
€85 / US$87 / ¥12310 / AUD$130
--------------------------------------------------
MN56K9: Acanthite
€155 / US$159 / ¥22450 / AUD$230
--------------------------------------------------
TF89AL3: Acanthite (Se-bearing) with Polybasite (Se-bearing) and Calcite
€460 / US$474 / ¥66630 / AUD$700
--------------------------------------------------
TP66AJ8: Acanthite (Se-bearing) with Pyrite
€1500 / US$1547 / ¥217290 / AUD$2310
--------------------------------------------------
TY86AN2: Acanthite after Polybasite
€1600 / US$1651 / ¥231770 / AUD$2460
--------------------------------------------------
TA66AF6: Acanthite with Calcite
€160 / US$165 / ¥23170 / AUD$240
--------------------------------------------------
JFD104AO2: Acanthite with Calcite
€240 / US$247 / ¥34760 / AUD$370
--------------------------------------------------
TX36AL6: Acanthite with Calcite
€1200 / US$1238 / ¥173830 / AUD$1850
--------------------------------------------------
TA48AH1: Acanthite with Chalcopyrite
€290 / US$299 / ¥42000 / AUD$440
--------------------------------------------------
EF89L9: Acanthite with Pyrite and Calcite
€480 / US$495 / ¥69530 / AUD$740
--------------------------------------------------
TX89AN0: Acanthite with Siderite and Proustite
€4800 / US$4953 / ¥695320 / AUD$7400
--------------------------------------------------
EA56K0: Acanthite with Silver
€150 / US$154 / ¥21720 / AUD$230
--------------------------------------------------
EC48K0: Acanthite with Silver
€290 / US$299 / ¥42000 / AUD$440
--------------------------------------------------
11AT12: Acanthite, Calcite
€70 / US$72 / ¥10140 / AUD$100
--------------------------------------------------
9EF89L9: Acanthite, Pyrite, Calcite
€320 / US$330 / ¥46350 / AUD$490
--------------------------------------------------
SM75TDA: Adamite
€75 / US$77 / ¥10860 / AUD$110
--------------------------------------------------
2M14: Adamite
€90 / US$92 / ¥13030 / AUD$130
--------------------------------------------------
20MJX66: Adamite
€140 / US$144 / ¥20280 / AUD$215
--------------------------------------------------
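
To see why splitting on "Price:" works, here is a minimal offline sketch run against the fragment from the question. It assumes the markup matches that fragment exactly; the live page may nest the tags differently. It uses Python's built-in "html.parser" so no extra parser is needed, and it also shows a second option, reading the text node that follows the label.

```python
from bs4 import BeautifulSoup

# Fragment copied from the question; the live page may nest it differently.
HTML = """
<font face="Arial, Helvetica, sans-serif" size="-1">
 <font color="#FF0000">
  Price:
 </font>
 €580 / US$598 / ¥84010 / AUD$890
</font>
"""

soup = BeautifulSoup(HTML, "html.parser")

# The outer <font> holds both the red "Price:" label and the price text.
outer = soup.find("font", attrs={"size": "-1"})

# Option 1: take the full text and keep only what follows the label.
price_by_split = outer.get_text(strip=True).split("Price:")[-1]

# Option 2: take the text node that directly follows the inner <font> label.
label = outer.find("font")
price_by_sibling = label.next_sibling.strip()

print(price_by_split)
print(price_by_sibling)
```

Both variables should hold the same price string here. Option 2 depends on the price being the very next node after the label, so it is more sensitive to markup changes than Option 1.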
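
On the second part of the question: two lists built by separate loops can be joined afterwards with zip, as the answer does when printing. The sketch below uses two sample rows copied from the answer's output; the helper name euro_amount is hypothetical and assumes every price string starts with the euro figure.

```python
# Sample values copied from the answer's output; in practice they come from the scrape.
names = ["NX51AH2: 'lepidolite' after Elbaite with Elbaite", "SM90CEX: Acanthite"]
prices = ["€580 / US$598 / ¥84010 / AUD$890", "€90 / US$92 / ¥13030 / AUD$130"]

# zip() pairs items by position and silently stops at the shorter list,
# so a length check catches the two loops drifting out of step.
if len(names) != len(prices):
    raise ValueError("names and prices have different lengths")

# One list of (name, price) tuples instead of two parallel lists.
pairs = list(zip(names, prices))


def euro_amount(price_text):
    """Return the euro figure from a string such as '€580 / US$598 / ...'."""
    first_field = price_text.split("/")[0].strip()
    return int(first_field.lstrip("€"))


euro_prices = [euro_amount(p) for _, p in pairs]

for (name, _), euros in zip(pairs, euro_prices):
    print(f"{name}: {euros} EUR")
```

Keeping the pairs in one list makes later steps, such as sorting by price or writing a CSV, simpler than juggling two parallel lists.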