I'm reading a table from a website using:
df = pd.read_html('website link')
df[0]
It reads the table successfully, but I want to use the first row as the header. I'm using this code:
df.columns = df.iloc[0]
df = df[1:]
df.head()
but it gave me an error that said:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-f9b2cba2eb0b> in <module>
----> 1 df.columns = df.iloc[0] #grab the first row for the header
2 df = df[1:] #take the data less the header row
3 df
AttributeError: 'list' object has no attribute 'iloc'
CodePudding user response:
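pd.read_html returns a list of DataFrames, one per table it finds on the page, which is why the list has no .iloc. If the page has several tables that you want in one DataFrame, concatenate them first with pd.concat: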
dfs = pd.read_html(url)
df = pd.concat(dfs)
If the page has only one table, the list holds a single DataFrame, so index the list instead:
dfs = pd.read_html(url)
df = dfs[0]
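To make the list-versus-DataFrame distinction concrete, here is a minimal sketch. The `dfs` list below is a hand-built stand-in for what `pd.read_html(url)` would return (the table contents are made up for illustration), so it runs without a network connection or an HTML parser.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by pd.read_html(url):
# two small tables whose first row still holds the header text.
dfs = [
    pd.DataFrame([["Age", "Rank"], [36, 9], [42, 4]]),
    pd.DataFrame([["Age", "Rank"], [23, 4]]),
]

# The container is a plain Python list, so it has no .iloc attribute.
print(type(dfs).__name__)
print(hasattr(dfs, "iloc"))

# Option 1: pick a single table out of the list.
first = dfs[0]

# Option 2: stack all tables into one DataFrame with a fresh 0..n-1 index.
# Note that each table's header row is still present as an ordinary data row.
combined = pd.concat(dfs, ignore_index=True)
print(combined)
```

Passing `ignore_index=True` is a choice made here so that the combined frame does not carry duplicate index labels; whether concatenation makes sense at all depends on the tables sharing the same layout.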
Finally, replace the headers with the first row and drop that row from the data:
df = df.rename(columns=df.iloc[0]).drop(df.index[0])
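The same step can be wrapped in a small helper. This is only a sketch: `promote_first_row` is a hypothetical name, not a pandas function, and the sample data is invented for illustration.

```python
import pandas as pd

def promote_first_row(df):
    """Return a copy of df whose first row has become the column labels."""
    out = df.iloc[1:].copy()            # every row except the first
    out.columns = df.iloc[0].tolist()   # first row's values become the labels
    return out.reset_index(drop=True)   # renumber the remaining rows from 0

# Made-up table whose header text ended up in row 0.
raw = pd.DataFrame([["Age", "Rank"], ["36", "9"], ["42", "4"]])
clean = promote_first_row(raw)
print(clean)
```

The helper leaves the original DataFrame untouched. The remaining values keep whatever dtype they had when the header text was mixed in with them, so a call such as `pd.to_numeric` on individual columns may be needed afterwards.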
CodePudding user response:
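Try this, which takes the first table out of the list before touching its columns:
import pandas as pd
dfs = pd.read_html('https://www.w3schools.com/python/python_ml_decision_tree.asp')
df = dfs[0]  # read_html returns a list, so select the first table
df.columns = df.iloc[0]  # use the first row as the header
df = df[1:]  # keep the data without the header row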