How to convert values from two different lists into a single dictionary?

I am scraping the financial summary from https://www.investing.com/equities/nvidia-corp-financial-summary. My code so far:

To get the ratio descriptions:

for element in soup.find_all('span', attrs={'class': 'float_lang_base_1'}):
    print(element)

The code will result in:

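<span class="float_lang_base_1">Gross margin</span>
<span class="float_lang_base_1">Operating margin</span>
<span class="float_lang_base_1">Net Profit margin</span>
<span class="float_lang_base_1">Return on Investment</span>
<span class="float_lang_base_1">Quick Ratio</span>
<span class="float_lang_base_1">Current Ratio</span>
<span class="float_lang_base_1">LT Debt to Equity</span>
<span class="float_lang_base_1">Total Debt to Equity</span>
<span class="float_lang_base_1">Cash Flow/Share</span>
<span class="float_lang_base_1">Revenue/Share</span>
<span class="float_lang_base_1">Operating Cash Flow</span>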

To get the value for each of the ratios above:

for element in soup.find_all('span', attrs={'class': 'float_lang_base_2 text_align_lang_base_2 dirLtr bold'}):
    print(element.get_text())

This results in:

 60.45%
 31.47%
 26.03%
 22.86%
 2.95
 3.62
 -
 49.02%
 -
 -
 16.77%

Now I need to match the two up, so that each description and its value form a key-value pair that can be turned into a DataFrame:

Gross margin: 60.45%
Operating margin: 31.47%
Net Profit margin: 26.03%
...

CodePudding user response：

You can find the parent div tag that contains both the name and the value, iterate over those divs, pick out each span by its class, and add the pair to dict1:

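dict1 = {}
# Each div.infoLine wraps one label span and one value span
for element in soup.find_all('div', attrs={'class': 'infoLine'}):
    name = element.find("span", class_="float_lang_base_1").get_text()
    # strip=True drops the whitespace around the value
    value = element.find("span", class_="float_lang_base_2").get_text(strip=True)
    dict1[name] = value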

You can then use pandas to turn dict1 into a DataFrame, with one row per key-value pair:

import pandas as pd
# list() makes the key-value pairs an ordinary list of tuples
df = pd.DataFrame(list(dict1.items()), columns=['A', 'B'])
df

Output:

                  A       B
0      Gross margin  60.45%
1  Operating margin  31.47%
.....

CodePudding user response：

You can combine the values from two different lists into a single dictionary with zip():

Mykeys = ["a", "b", "c"]
Myvalues = [1, 3, 5]
print("Mykey list: " + str(Mykeys))
print("Myvalue list: " + str(Myvalues))

# zip() pairs the items by position; dict() turns the pairs into a dictionary
res = dict(zip(Mykeys, Myvalues))

print("New dictionary will be: " + str(res))
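
Applied to the data in the question, the same idea might look like the sketch below. The two lists are typed in by hand from the output shown in the question rather than scraped, and the names descriptions, raw_values and ratios are only illustrative.

```python
# Labels and values copied from the question's printed output.
descriptions = [
    "Gross margin", "Operating margin", "Net Profit margin",
    "Return on Investment", "Quick Ratio", "Current Ratio",
    "LT Debt to Equity", "Total Debt to Equity", "Cash Flow/Share",
    "Revenue/Share", "Operating Cash Flow",
]
raw_values = [
    " 60.45%", " 31.47%", " 26.03%", " 22.86%", " 2.95", " 3.62",
    " -", " 49.02%", " -", " -", " 16.77%",
]

# Pair each label with the value at the same position; strip() drops
# the leading space seen in the question's output.
ratios = dict(zip(descriptions, (value.strip() for value in raw_values)))

print(ratios["Gross margin"])  # 60.45%
print(len(ratios))             # 11
```

This only gives the right pairs if both lists come back in the same order and with the same number of items.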

CodePudding user response：

As mentioned in the other answers, you could zip() your lists and pass the result to dict().

There is also an alternative approach to selecting and extracting the information from the elements:

dict(list(row.stripped_strings)[::len(list(row.stripped_strings))-1] for row in soup.select('.infoLine'))

This uses select() (find_all() would work as well) to pick the elements with class infoLine, which is the container tag of the <span>s. .stripped_strings yields the whitespace-stripped texts of each container as a generator, so after converting it to a list we only have to slice out the first and the last element. dict() then builds the final result from these pairs.
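
To see what the slice does in isolation, here is a small sketch on plain lists that stand in for list(row.stripped_strings). The helper name first_and_last and the sample strings (including the "MRQ" entry) are made up for illustration and are not taken from the page.

```python
def first_and_last(parts):
    """Return the first and last items, the way the slice in the answer does."""
    return parts[::len(parts) - 1]

# Hypothetical strings, standing in for list(row.stripped_strings).
two_parts = ["Gross margin", "60.45%"]
three_parts = ["Quick Ratio", "MRQ", "2.95"]

print(first_and_last(two_parts))    # ['Gross margin', '60.45%']
print(first_and_last(three_parts))  # ['Quick Ratio', '2.95']

# A row with a single string makes the step 0, which Python rejects.
try:
    first_and_last(["Gross margin"])
except ValueError as exc:
    print("single-string row:", exc)

# Plain indexing picks the same two items without computing a step.
pairs = dict((parts[0], parts[-1]) for parts in (two_parts, three_parts))
print(pairs)  # {'Gross margin': '60.45%', 'Quick Ratio': '2.95'}
```

The indexing variant is a possible substitute if some containers might hold only one string; it is a design choice, not part of the original answer.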

Be aware: when zipping two separately scraped lists, you have to ensure that they have the same length. zip() does not raise an error on a mismatch; it silently stops at the shorter list, so entries can go missing without any warning.
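
A short sketch of how plain zip() behaves when the lengths differ, together with two possible ways of handling it. The helper zip_to_dict is a hypothetical name, and the sample data is shortened from the question's output with one value removed on purpose.

```python
from itertools import zip_longest

keys = ["Gross margin", "Operating margin", "Net Profit margin"]
values = ["60.45%", "31.47%"]  # one value missing on purpose

# zip() stops at the shorter input without raising anything.
truncated = dict(zip(keys, values))
print(truncated)  # {'Gross margin': '60.45%', 'Operating margin': '31.47%'}

def zip_to_dict(keys, values):
    """Build a dict, refusing inputs whose lengths differ."""
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} keys but {len(values)} values")
    return dict(zip(keys, values))

try:
    zip_to_dict(keys, values)
except ValueError as exc:
    print("refused:", exc)  # refused: 3 keys but 2 values

# zip_longest() keeps every key and pads the missing values instead.
padded = dict(zip_longest(keys, values, fillvalue=None))
print(padded["Net Profit margin"])  # None
```

On Python 3.10 and later, zip(keys, values, strict=True) raises a ValueError on a length mismatch, which makes the explicit length check unnecessary.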

Example

import requests
from bs4 import BeautifulSoup
  
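url = 'https://www.investing.com/equities/nvidia-corp-financial-summary'
# Send a browser-like User-Agent header with the request
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(response.text, 'html.parser')
dict(list(row.stripped_strings)[::len(list(row.stripped_strings))-1] for row in soup.select('.infoLine'))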

Output

{'Gross margin': '60.45%',
 'Operating margin': '31.47%',
 'Net Profit margin': '26.03%',
 'Return on Investment': '22.86%',
 'Quick Ratio': '2.95',
 'Current Ratio': '3.62',
 'LT Debt to Equity': '-',
 'Total Debt to Equity': '49.02%',
 'Cash Flow/Share': '-',
 'Revenue/Share': '-',
 'Operating Cash Flow': '16.77%'}