I'm creating a Python script to automate one of my repetitive tasks.
I build automated expert advisors that invest in the financial markets; I have to periodically check their results and copy them into an Excel sheet, which obviously leaves a lot of room for human error.
I would like to automate the export of the results. I managed to make the script log into my online account, and once I reached the table I found a way to export all the results. The problem is that I cannot reassemble them into the form of a table.
If I try to write them to an Excel file, they all end up in a single column. A tidy format would be a dictionary, but I don't know how to build one from the web table. I am attaching my code; the site in question is https://www.fxblue.com/
from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep
import pandas as pd
import csv
# Variables
Username = ''
Password = ''
TableValue = []
# URL of the site to start scraping from
FxBlueLink = 'https://www.fxblue.com/live/'
# Install the Chrome driver via the webdriver_manager.chrome module
ChromeDriver = ChromeDriverManager().install()
# Start the browser
Browser = Chrome(service=Service(ChromeDriver))
# Maximize the Chrome window
Browser.maximize_window()
# Go to the link
Browser.get(FxBlueLink)
# Find the Username and Password inputs
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[1]/td/input").send_keys(Username)
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[2]/td/input").send_keys(Password)
sleep(2)
# Click the login button
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[3]/td/input").click()
sleep(2)
# Go to the statistics page
Browser.get("https://www.fxblue.com/users/rdrtrader")
sleep(2)
# Go to Analysis
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[1]/div/div[1]/div[1]/ul/li[2]").click()
sleep(2)
# Open the Symbol menu and select magic number
Browser.find_element(By.ID, "ext-gen24").click()
sleep(2)
Browser.find_element(By.ID, "ext-gen71").click()
sleep(2)
# Scrape the table (as plain text)
Table = Browser.find_element(By.CLASS_NAME, "x-grid3-body").text
TableValue = Table
print(TableValue)
sleep(1)
input()
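The "tidy dictionary" idea mentioned above can be sketched like this: instead of taking .text of the whole grid (which collapses everything into one string), collect the cell texts row by row and hand them to pandas. The Selenium part needs a live browser and the ExtJS row/cell class names beyond x-grid3-body are an assumption, so it is left as a comment:

```python
import pandas as pd

# With a live browser, the grid cells could be collected like this
# (the .x-grid3-row / .x-grid3-cell class names are a guess based on
# the x-grid3-body class used above):
#
# rows = [[cell.text for cell in row.find_elements(By.CSS_SELECTOR, ".x-grid3-cell")]
#         for row in Browser.find_elements(By.CSS_SELECTOR, ".x-grid3-body .x-grid3-row")]

def rows_to_dataframe(rows, columns=None):
    # Turn a list of row-wise cell texts into a real table
    return pd.DataFrame(rows, columns=columns)

# Dummy data standing in for scraped cell text:
df = rows_to_dataframe([['EURUSD', '1.23'], ['GBPUSD', '-0.50']],
                       columns=['Symbol', 'Profit'])
df.to_csv('results.csv', index=False)  # lands in columns, not one long string
```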
CodePudding user response:
I had a look at that long HTML file. Assuming you are able to access it with Selenium, all you have to do is:
import pandas as pd
import time as t

# [... rest of imports, and code to get to that page]

t.sleep(5)  ## this is just for good measure, you could in theory skip it
dfs = pd.read_html(driver.page_source)
for df in dfs:
    print(df)
Pandas recognizes 36 tables in the html you provided. Locate the tables you need, and then do df.to_csv('relevant_name.csv') for each of them. Or filter them based on keywords/content.
Relevant documentation for pandas: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html
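For example, the filtering and the per-table export could look like this; the two inline tables stand in for the page source, the filename pattern is just an example, and read_html needs lxml (or html5lib) installed:

```python
import pandas as pd
from io import StringIO

# Two small inline tables standing in for Browser.page_source;
# with Selenium you would pass the live page source instead.
html = """
<table><tr><th>Symbol</th><th>Profit</th></tr>
<tr><td>EURUSD</td><td>1.5</td></tr></table>
<table><tr><th>Magic #</th><th>Trades</th></tr>
<tr><td>12345</td><td>7</td></tr></table>
"""

# read_html returns one DataFrame per <table> in the page
dfs = pd.read_html(StringIO(html))

# Keep only the tables whose header mentions '#', mirroring match='#'
wanted = [df for df in dfs if any('#' in str(col) for col in df.columns)]

# One file per table, so later tables don't overwrite earlier ones
for i, df in enumerate(wanted):
    df.to_csv(f'magic_table_{i}.csv', index=False)
```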
CodePudding user response:
@platipus_on_fire thanks to your help I was able to find the tables. With match='#' I was able to get only the ones I need, and a print shows them correctly in the console. But when I write them to the CSV, it keeps only the last table, and everything ends up on a single line.
Sorry if these are noob questions, but this is my second Python program.
from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep
import pandas as pd
import csv
# Variables
Username = 'rdrtrader'
Password = '86tsF0&E!cU!'
# URL of the site to start scraping from
FxBlueLink = 'https://www.fxblue.com/live/'
# Install the Chrome driver via the webdriver_manager.chrome module
ChromeDriver = ChromeDriverManager().install()
# Start the browser
Browser = Chrome(service=Service(ChromeDriver))
# Maximize the Chrome window
Browser.maximize_window()
# Go to the link
Browser.get(FxBlueLink)
# Find the Username and Password inputs
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[1]/td/input").send_keys(Username)
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[2]/td/input").send_keys(Password)
sleep(2)
# Click the login button
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[3]/td/input").click()
sleep(2)
# Go to the statistics page
Browser.get("https://www.fxblue.com/users/rdrtrader")
sleep(2)
# Go to Analysis
Browser.find_element(By.XPATH,
    "/html/body/div[1]/div[5]/div/div[2]/div/div[1]/div/div[1]/div[1]/ul/li[2]").click()
sleep(2)
# Open the Symbol menu and select magic number
Browser.find_element(By.ID, "ext-gen24").click()
sleep(2)
Browser.find_element(By.ID, "ext-gen71").click()
sleep(2)
# Find the tables in the page
TableValue = pd.read_html(Browser.page_source, match='#')
for table in TableValue:
    print(table)
    table.to_csv('table.csv')
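The reason only one table survives is that to_csv('table.csv') reuses the same filename, so each pass through the loop replaces the previous file. A sketch of two possible fixes, using dummy DataFrames in place of the read_html result (the filenames and column names here are just examples):

```python
import pandas as pd

# Dummy tables standing in for pd.read_html(Browser.page_source, match='#')
TableValue = [pd.DataFrame({'Magic #': [111], 'Profit': [1.5]}),
              pd.DataFrame({'Magic #': [222], 'Profit': [-0.3]})]

# Option 1: one file per table, so nothing is overwritten
for i, table in enumerate(TableValue):
    table.to_csv(f'table_{i}.csv', index=False)

# Option 2: stack all tables into one DataFrame and write a single file
combined = pd.concat(TableValue, ignore_index=True)
combined.to_csv('all_tables.csv', index=False)
```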