Export data from a web table to csv


I'm writing a Python script to automate one of my repetitive tasks.

I build automated expert advisors that trade the financial markets, and I periodically have to check their results and copy them into an Excel sheet, which obviously leaves plenty of room for human error.

I would like to automate exporting the results. I managed to make the script log into my online account, and once I reach the table I have a way to extract all the results; the problem is that I can't put them back together in the form of a table ...

If I try to write them to an Excel file, everything ends up in a single column. A dictionary would be a tidy way to do it, but I don't know how to build one from the web table. I'm attaching my code; the site in question is https://www.fxblue.com/

from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep
import pandas as pd
import csv

# Variables
Username = ''
Password = ''
TableValue = []

# Website URL to start scraping from
FxBlueLink = 'https://www.fxblue.com/live/'

# Install the Chrome driver via the webdriver_manager.chrome module
ChromeDriver = ChromeDriverManager().install()

# Start the browser
Browser = Chrome(service=Service(ChromeDriver))

# Maximize the Chrome window
Browser.maximize_window()

# Go to the link
Browser.get(FxBlueLink)

# Find the Username and Password inputs
Browser.find_element(By.XPATH,
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[1]/td/input").send_keys(Username)

Browser.find_element(By.XPATH,
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[2]/td/input").send_keys(Password)

sleep(2)

# Click the Login button
Browser.find_element(By.XPATH,
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[3]/td/input").click()

sleep(2)

# Go to the statistics page
Browser.get("https://www.fxblue.com/users/rdrtrader")

sleep(2)

# Go to the Analysis tab
Browser.find_element(By.XPATH,
"/html/body/div[1]/div[5]/div/div[2]/div/div[1]/div/div[1]/div[1]/ul/li[2]").click()

sleep(2)

# Open the Symbol menu and choose the magic number
Browser.find_element(By.ID, "ext-gen24").click()

sleep(2)

Browser.find_element(By.ID, "ext-gen71").click()

sleep(2)

# Scrape the table
Table = Browser.find_element(By.CLASS_NAME, "x-grid3-body").text

TableValue = Table

print(TableValue)

sleep(1)

input()
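
For reference, one way to keep the rows separate instead of getting a single text blob is to read the grid row by row and write each row with csv.writer. This is only a sketch: the class names x-grid3-row and x-grid3-cell are assumptions (they are the usual ExtJS 3 grid classes, which the x-grid3-body class above suggests) and should be verified in the page's HTML first. It reuses the Browser, By and csv imports already in the script.

# Sketch only: x-grid3-row / x-grid3-cell are assumed ExtJS 3 class names --
# check them in the browser's inspector before relying on this.
rows = Browser.find_elements(By.CLASS_NAME, "x-grid3-row")

with open('results.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        # one cell element per grid column
        cells = row.find_elements(By.CLASS_NAME, "x-grid3-cell")
        writer.writerow([cell.text for cell in cells])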

CodePudding user response:

I had a look at that long html file. Supposing you are able to access it with Selenium, all you have to do is:

import pandas as pd
import time as t
[... rest of imports, and code to get to that page]

t.sleep(5)  # this is just for good measure; you could in theory skip it
dfs = pd.read_html(driver.page_source)
for df in dfs:
    print(df)

Pandas recognizes 36 tables in the html you provided. Locate the tables you need, and then do df.to_csv('relevant_name.csv') for each of them. Or filter them based on keywords/content.

Relevant documentation for pandas: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html
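
For example, a rough sketch of that filtering idea (the 'Profit' keyword and the file names here are only placeholders, adjust them to the tables you actually need):

# Keep only the DataFrames whose column headers contain a keyword of interest;
# 'Profit' is a placeholder -- swap in whatever identifies the tables you need.
wanted = [df for df in dfs if any('Profit' in str(col) for col in df.columns)]

# Write each matching table to its own file so none of them get overwritten
for i, df in enumerate(wanted):
    df.to_csv(f'table_{i}.csv', index=False)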

CodePudding user response:

@platipus_on_fire thanks to your help I was able to find the tables. With match = '#' I get only the ones I need, and with a print they show up correctly in the console. But when I go to write them to the CSV, it only writes the last table and puts everything on a single line.

Sorry if these are noob questions, but this is only my second Python program.

from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep
import pandas as pd
import csv

# Variables
Username = 'rdrtrader'
Password = '86tsF0&E!cU!'

# Website URL to start scraping from
FxBlueLink = 'https://www.fxblue.com/live/'

# Install the Chrome driver via the webdriver_manager.chrome module
ChromeDriver = ChromeDriverManager().install()

# Start the browser
Browser = Chrome(service=Service(ChromeDriver))

# Maximize the Chrome window
Browser.maximize_window()

# Go to the link
Browser.get(FxBlueLink)

# Find the Username and Password inputs
Browser.find_element(By.XPATH, 
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[1]/td/input").send_keys(Username)

Browser.find_element(By.XPATH, 
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[2]/td/input").send_keys(Password)

sleep(2)

# Click the Login button
Browser.find_element(By.XPATH, 
"/html/body/div[1]/div[5]/div/div[2]/div/div[2]/div[1]/div[3]/form/table/tbody/tr[3]/td/input").click()

sleep(2)

# Go to the statistics page
Browser.get("https://www.fxblue.com/users/rdrtrader")

sleep(2)

# Go to the Analysis tab
Browser.find_element(By.XPATH, 
"/html/body/div[1]/div[5]/div/div[2]/div/div[1]/div/div[1]/div[1]/ul/li[2]").click()

sleep(2)

# Open the Symbol menu and choose the magic number
Browser.find_element(By.ID, "ext-gen24").click()

sleep(2)

Browser.find_element(By.ID, "ext-gen71").click()

sleep(2)

# Find the tables in the page

TableValue = pd.read_html(Browser.page_source, match = '#')
for table in TableValue:
    print(table)
    table.to_csv('table.csv')
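
In case it helps, a minimal sketch of one way around the overwrite: every pass through the loop writes to the same 'table.csv', so only the last table survives. Giving each matched table its own file name (or, assuming they all share the same columns, concatenating them first) avoids that; the file names below are just placeholders.

TableValue = pd.read_html(Browser.page_source, match='#')

# Give each matched table its own numbered file instead of overwriting 'table.csv';
# index=False stops pandas from writing an extra index column.
for i, table in enumerate(TableValue):
    table.to_csv(f'table_{i}.csv', index=False)

# Or, assuming the matched tables share the same columns, stack them into one CSV:
pd.concat(TableValue, ignore_index=True).to_csv('all_tables.csv', index=False)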