s = Service(executable_path=r'D:\Python3104\chromedriver.exe')
driver = webdriver.Chrome(service=s)
driver.maximize_window()
url = '''http://racing.hkjc.com/racing/information/Chinese/Reports/CORunning.aspx?
Date=20220701&RaceNo=2'''
driver.get(url)
time.sleep(3)
#Got the RaceNo from URL
soup = BeautifulSoup(driver.page_source, "html.parser")
RACENo = ((driver.current_url.split("RaceNo=")[1]))
#Change the RaceNO into Series
RACENo = pd.Series(RACENo)
#Get a DataFrame From HTML
df = pd.read_html(
str(soup.find("table", class_="table_bd f_fs13")))
#Add a column to DataFrame using RACEno
df = df[:,"RaceNo": RACENo]
But it told me: DataFrame constructor not properly called! Can anyone tell me what I did wrong?
CodePudding user response:
The code below should work. You just need to assign RACENo as a str (not a pandas.Series) to the new column, i.e. there is no need to convert RACENo to a pandas.Series. Note also that pd.read_html returns a list of DataFrames, so the code below takes the first element with [0].
There are several ways to do it:
- df['RaceNo'] = RACENo - insert/replace a column with the given value in all rows;
- df = df.assign(RaceNo=RACENo) - assign a new column to the DataFrame;
- df.insert(0, 'RaceNo', RACENo) - insert a column at a specified position;
- df.loc[:, 'RaceNo'] = RACENo - create a new column and assign the new value.
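As a minimal sketch of the four options listed above, the snippet below applies each one to a small stand-in DataFrame. The column names and values (Horse, Place) are made up for illustration and are not taken from the scraped table; race_no plays the role of RACENo as a plain str.

```python
import pandas as pd

# Toy stand-in for the scraped table (hypothetical columns and values)
df = pd.DataFrame({"Horse": ["A", "B", "C"], "Place": [1, 2, 3]})
race_no = "2"  # plain str, as obtained from the URL

# 1. Direct assignment: the scalar is broadcast to every row
df1 = df.copy()
df1["RaceNo"] = race_no

# 2. assign() returns a new DataFrame and leaves df untouched
df2 = df.assign(RaceNo=race_no)

# 3. insert() adds the column in place at position 0
df3 = df.copy()
df3.insert(0, "RaceNo", race_no)

# 4. .loc with a full row slice also creates the column
df4 = df.copy()
df4.loc[:, "RaceNo"] = race_no

print(df3.columns.tolist())
```

Only insert() lets you choose where the column goes; the other three append it as the last column.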
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
s = Service(executable_path=r'D:\Python3104\chromedriver.exe')
driver = webdriver.Chrome(service=s)
driver.maximize_window()
url = ('http://racing.hkjc.com/racing/information/Chinese/Reports/CORunning.aspx'
       '?Date=20220701&RaceNo=2')
driver.get(url)
time.sleep(10)
# Parse the rendered page and get the RaceNo from the URL (a str such as "2")
soup = BeautifulSoup(driver.page_source, "html.parser")
RACENo = driver.current_url.split("RaceNo=")[1]
# pd.read_html returns a list of DataFrames; take the first one
df = pd.read_html(str(soup.find("table", class_="table_bd f_fs13")))[0]
# Add a RaceNo column; the scalar str is broadcast to every row
df['RaceNo'] = RACENo
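Splitting on "RaceNo=" works here because RaceNo is the last parameter in the URL. If the parameter order could change, one possible alternative is to parse the query string with the standard library. This is only a sketch: get_race_no is a hypothetical helper name, not part of the code above.

```python
from urllib.parse import urlparse, parse_qs


def get_race_no(url):
    """Return the RaceNo query parameter of url as a str."""
    query = parse_qs(urlparse(url).query)
    return query["RaceNo"][0]


url = ("http://racing.hkjc.com/racing/information/Chinese/Reports/CORunning.aspx"
       "?Date=20220701&RaceNo=2")
race_no = get_race_no(url)
print(race_no)
```

With Selenium you would pass driver.current_url instead of the literal url.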
CodePudding user response:
To insert a new column into the existing DataFrame, you can assign the values of your Series like this (note that this requires the Series to hold as many values as the DataFrame has rows):
df['RaceNo'] = RACENo.values
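To illustrate how a one-element Series behaves on assignment, here is a small sketch on a made-up three-row DataFrame (the Horse column is hypothetical). It assumes the Series was built from a single string, as in the question.

```python
import pandas as pd

df = pd.DataFrame({"Horse": ["A", "B", "C"]})  # hypothetical 3-row table
race_no = pd.Series(["2"])  # one-element Series, as in the question

# Assigning the Series aligns on the index: only row 0 gets a value
aligned = df.copy()
aligned["RaceNo"] = race_no

# Assigning .values of the wrong length raises a ValueError
try:
    bad = df.copy()
    bad["RaceNo"] = race_no.values
    length_error = False
except ValueError:
    length_error = True

# Taking the scalar out of the Series broadcasts it to every row
fixed = df.copy()
fixed["RaceNo"] = race_no.iloc[0]
```

So .values is a good fit when the Series already has one value per row; for a single race number, pulling out the scalar (or never building the Series) is simpler.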