How to extract a particular element from BeautifulSoup?

Time:10-26

I have written the following code:

    import requests
    from bs4 import BeautifulSoup

    page = requests.get("http://3.85.131.173:8000/random_company")
    soup = BeautifulSoup(page.content, "html.parser")
    info_list = soup.find_all("li")
    print(info_list)

and print gives the following output:

[<li>Name: Walker, Meyer and Allen</li>, <li>CEO: David Pollard</li>, <li>CTO: Sandra Boyd</li>, <li>Address: 275 Jones Station Suite 008
Bradburgh, UT 24369</li>, <li>Investment Round: C</li>, <li>Purpose: Reduced logistical contingency for whiteboard end-to-end applications</li>]

I want to extract the name and position. Earlier I was using indexing, but the list is dynamic, so the positions of the items change. Could anyone advise how to extract the name and purpose?

CodePudding user response:

Depending on which tags you want to extract, this should demonstrate a basic approach:

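    name_position_list = []  # collects the matching "Position: Name" strings
    for info in info_list:
        name = ''
        position = ''
        # Match any item whose text mentions one of the wanted terms
        if any(term in info.text.lower() for term in ['ceo', 'advisor', 'cfo']):
            # Split on the first colon only, in case the value contains one
            position, _, name = info.text.partition(':')
        if position != '' and name != '':
            name_position_list.append(info.text)

    print(name_position_list)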

An example run produces:

['CEO: James Alexander', 'CFO: Lori Adams', 'Advisor: Lori Perez']

Modify the terms in the list to specify the text you want to extract.
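
To make the list of terms easier to change, the same idea could be wrapped in a small function. This is only a sketch: `filter_by_label` is a hypothetical helper name, it matches the label before the first colon exactly (case-insensitively) instead of searching the whole text, and it works on plain strings so it can be tried without a network request.

```python
def filter_by_label(texts, terms):
    """Return the "Label: value" strings whose label matches one of terms."""
    wanted = {term.lower() for term in terms}
    matches = []
    for text in texts:
        # partition splits on the first colon only and never raises
        label, sep, value = text.partition(":")
        # Skip items that have no colon or an empty value
        if sep and value.strip() and label.strip().lower() in wanted:
            matches.append(text)
    return matches


# Sample strings copied from the question's output; in the real script
# they would come from [li.get_text() for li in info_list].
sample = [
    "Name: Walker, Meyer and Allen",
    "CEO: David Pollard",
    "CTO: Sandra Boyd",
    "Investment Round: C",
]
print(filter_by_label(sample, ["ceo", "cto"]))
```

Passing a different list of terms, for example `["name", "purpose"]`, selects other items without touching the loop.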

CodePudding user response:

I think you're looking for something like this:

    targets = ["Name", "Purpose"]
    for item in info_list:
        # Compare the label before the first colon against the wanted labels
        if item.text.split(":", 1)[0].strip() in targets:
            print(item.text)

Output (in this case):

Name: Jimenez LLC
Purpose: Mandatory context-sensitive approach for leverage compelling communities
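
If several fields are needed, another option is to turn the items into a dictionary keyed by label, so each value can be looked up by name instead of by index. The sketch below assumes every item has the form `Label: value`; `items_to_dict` is a hypothetical helper name, and the sample strings are copied from the question's output rather than fetched from the page.

```python
def items_to_dict(texts):
    """Map each "Label: value" string to a label -> value dictionary entry."""
    record = {}
    for text in texts:
        label, sep, value = text.partition(":")
        if sep:
            # Collapse newlines and repeated spaces inside the value
            record[label.strip()] = " ".join(value.split())
    return record


# In the real script: items_to_dict(li.get_text() for li in info_list)
sample = [
    "Name: Walker, Meyer and Allen",
    "Address: 275 Jones Station Suite 008\nBradburgh, UT 24369",
    "Purpose: Reduced logistical contingency for whiteboard end-to-end applications",
]
record = items_to_dict(sample)
print(record["Name"])
print(record.get("Purpose"))
```

Using `record.get(...)` returns `None` for a label that is missing from a given page, which avoids a `KeyError` when the set of items changes.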