Pandas - How To Clean Up Scrape

Time:02-21

My goal is to access a clinical trials page, and pull the last row of a given table.

My current code pulls the entire last row, which is more information than I need (see attached).

I would like to pull only the date (highlighted in green).

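import time
from io import StringIO

import pandas as pd
from selenium import webdriver

driver = webdriver.Chrome()

url = 'https://clinicaltrials.gov/ct2/show/NCT03328858?cond=brain tumor&draw=2&rank=4'
driver.get(url)
time.sleep(1)  # give the page a moment to render

# Parse the HTML that Selenium loaded rather than fetching the URL a second time
df = pd.read_html(StringIO(driver.page_source))[3]
df3 = df.iloc[-1]  # last row of the fourth table on the page
print(df3)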


CodePudding user response:

To get the last value of the last column, you can use .iloc with a row and a column position:

df.iloc[-1,-1]

Or select the column by name, if you know it or are sure it will be 'Unnamed: 1':

df['Unnamed: 1'].iloc[-1]

This will give you:

January 31, 2020
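
For reference, here is a minimal sketch of the difference between selecting the whole last row and selecting a single cell. The DataFrame is a hypothetical stand-in for the scraped table: the column names and the first row are placeholders I made up, and only the final date comes from the output above.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table (assumption: two columns,
# with the second one auto-named 'Unnamed: 1' by pandas).
df = pd.DataFrame({
    "Unnamed: 0": ["placeholder label", "another placeholder label"],
    "Unnamed: 1": ["placeholder value", "January 31, 2020"],
})

last_row = df.iloc[-1]               # whole last row as a Series (label and date)
by_position = df.iloc[-1, -1]        # last row, last column: a single value
by_name = df["Unnamed: 1"].iloc[-1]  # same cell, selected by column name

print(by_position)
```

Selecting by position keeps working if the column is renamed, while selecting by name keeps working if more columns are appended to the right, so pick whichever matches how the table is likely to change.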