How to create a dictionary variable in one line?

Time:01-26

Any comments or solutions are welcome. I could not create a dictionary variable in one line.

import requests as re
from bs4 import BeautifulSoup


url = re.get('https://toiguru.jp/toeic-vocabulary-list')
soup = BeautifulSoup(url.content, "html.parser")
words = [str(el).replace("<td>", "") for el in soup.find_all("td")]
words = [str(el).replace("</td>", "") for el in words]
words = [str(el).split("<br/>") for el in words]

# The line below raises "IndexError: list index out of range"
words = {str(el[0]): str(el[1]) for el in words}

# From here, I have no idea how to create a dictionary like the one below:
# {ENword: translation for ENword}
# e.g.) {'survey':'調査'}, {'interview':'面接'}

words = [str(el).split("<br/>")for el in words]

The code above outputs values like the following:

[['survey', '調査'], ['interview', '面接'], ['exhibition', '展示'], ['conference', '会議'], ['available', '利用できる'], ['annual', '年1回の'],
['equipment', '備品/器具'], ['department', '部署'], ['refund', '払い戻す'], ['receipt', '領収書'], ['schedule', '予定, 計画'], ・・・and more・・・]

I want to convert the list above into a dictionary like this:

{ENword: translation for ENword} 
e.g.) {'survey':'調査'}, {'interview':'面接'}

With bs4, I want to create a dictionary variable.

CodePudding user response:

Try the code below. There seems to be at least one element in words that does not have exactly two items.

words = {el[0]:el[1] for el in words if len(el)==2}

To find the invalid elements that have a different format, you can use:

not_good=[[f"index={counter}", f"value={el2}"] for counter,el2 in enumerate(words) if len(el2)!=2]
print(not_good)
#output [['index=474', "value=['neither', 'どちらも…でない', '']"], ['index=475', "value=['']"], ['index=481', "value=['enclose', '同封する', '']"], ['index=701', "value=['']"]]
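
Note that the `len(el)==2` filter also skips rows such as `['neither', 'どちらも…でない', '']`, which carry a valid pair plus a trailing empty string. The following is a minimal sketch of a more lenient variant, not a definitive fix. The `rows` list is a hypothetical hand-made sample shaped like the output above, and the names `strict`, `cleaned`, and `lenient` are only illustrative.

```python
# Hypothetical sample shaped like the scraped data (not the real page content).
rows = [
    ["survey", "調査"],
    ["neither", "どちらも…でない", ""],
    [""],
    ["enclose", "同封する", ""],
]

# Strict filter: keeps only rows with exactly two items.
strict = {el[0]: el[1] for el in rows if len(el) == 2}

# Lenient variant: drop empty pieces first, then keep every row that
# still has at least a word and a translation.
cleaned = [[part for part in el if part] for el in rows]
lenient = {el[0]: el[1] for el in cleaned if len(el) >= 2}

print(strict)
print(lenient)
```

With this sample, the strict version keeps only 'survey', while the lenient version also keeps 'neither' and 'enclose'.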

CodePudding user response:

Ignore ['']:

words = {el[0]: el[1] for el in words if el != ['']}

# {'survey': '調査', 'interview': '面接', ..., 'neither': 'どちらも…でない', ..., 'enclose': '同封する', ...}

Or, as a list of dicts:

words = [{el[0]: el[1]} for el in words if el != ['']]

# [{'survey': '調査'}, {'interview': '面接'}, ..., {'neither': 'どちらも…でない'}, ..., {'enclose': '同封する'}, ...]
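
Which shape to choose probably depends on how the result is used. Here is a small sketch comparing lookups in the two forms, assuming a hypothetical three-row `pairs` sample rather than the real scraped list. The names `as_dict`, `as_list`, `found`, and `merged` are illustrative only.

```python
# Hypothetical sample shaped like the scraped data.
pairs = [["survey", "調査"], ["interview", "面接"], [""]]

# Single dictionary: one key per English word.
as_dict = {el[0]: el[1] for el in pairs if el != [""]}

# List of one-item dictionaries.
as_list = [{el[0]: el[1]} for el in pairs if el != [""]]

# Lookup in the single dict is a direct key access.
print(as_dict["survey"])

# Lookup in the list of dicts needs a scan over the items.
found = next((d["survey"] for d in as_list if "survey" in d), None)
print(found)

# The list of dicts can be merged back into a single dict if needed.
merged = {key: value for d in as_list for key, value in d.items()}
print(merged == as_dict)
```

If the goal is to look up a translation by its English word, the single dictionary is likely the more convenient form.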