How do I make my dataframe single-index from a MultiIndex?

I would like to make my data frame more aesthetically appealing and drop what I believe are the unnecessary first row and column from the multi-index. I would like the column headers to be: 'Rk', 'Team','Conf','G','Rec','ADJOE',.....,'WAB'

Any help is much appreciated.

import pandas as pd
url = 'https://www.barttorvik.com/#'
df = pd.read_html(url)
df = df[0]
df

CodePudding user response:

Each column label in the MultiIndex is a tuple of (top level, second level), so you only have to iterate over the existing columns and take the second element of each. Then you can assign that list as the new column labels:

import pandas as pd

url = 'https://www.barttorvik.com/#'
df = pd.read_html(url)[0]  # read_html returns a list of DataFrames; take the first table
df.columns = [x[1] for x in df.columns]
df.head()

Output:

    Rk  Team    Conf    G   Rec AdjOE   AdjDE   Barthag EFG%    EFGD%   ... ORB DRB FTR FTRD    2P% 2P%D    3P% 3P%D    Adj T.  WAB
0   1   Gonzaga WCC 24  22-211–0    122.42  89.05   .97491  60.21   421 ... 30.2120 2318    30.4165 21.710  62.21   41.23   37.821  29.111  73.72   4.611
1   2   Houston Amer    25  21-410–2    117.39  89.06   .95982  53.835  42.93   ... 37.26   27.6141 28.2242 33.3247 54.827  424 34.8108 29.418  65.2303 3.416
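
The same flattening can also be written with pandas' level helpers instead of a list comprehension. The sketch below is only illustrative: the small DataFrame and its "Unnamed: ..." top-level labels are made-up stand-ins for the scraped table, so it runs without network access.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two header levels, two rows.
columns = pd.MultiIndex.from_tuples([
    ("Unnamed: 0_level_0", "Rk"),
    ("Unnamed: 1_level_0", "Team"),
    ("Unnamed: 2_level_0", "Conf"),
])
df = pd.DataFrame(
    [[1, "Gonzaga", "WCC"], [2, "Houston", "Amer"]],
    columns=columns,
)

# Option 1: return a copy with the outer column level (level 0) removed.
flat = df.droplevel(0, axis=1)

# Option 2: overwrite the columns in place with the inner level's labels.
df.columns = df.columns.get_level_values(1)

print(list(flat.columns))
print(list(df.columns))
```

Both options leave a plain single-level Index; droplevel is convenient when you want to keep the original frame untouched.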

CodePudding user response:

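When you read from HTML, specify which header row you want as the column names. Rows are zero-indexed, so header=1 selects the second header row and you never get a MultiIndex in the first place: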

df = pd.read_html(url, header=1)[0]
print(df.head())

output:

  Rk      Team  Conf   G       Rec  ...     2P%D      3P%    3P%D   Adj T.    WAB
0  1   Gonzaga   WCC  24  22-211–0  ...    41.23   37.821  29.111    73.72  4.611
1  2   Houston  Amer  25  21-410–2  ...      424  34.8108  29.418  65.2303  3.416
2  3  Kentucky   SEC  26  21-510–3  ...   46.342   35.478  29.519   68.997   4.89
3  4   Arizona   P12  25  23-213–1  ...    39.91  33.7172  31.471    72.99   6.24
4  5    Baylor   B12  26   21-59–4  ...  49.2165   35.966  30.440  68.3130   6.15
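
To see what a zero-indexed header row does without hitting the website, here is a rough offline sketch. It uses pd.read_csv as a stand-in, since its header argument is also a zero-indexed row number; the CSV text is made up for illustration, and read_html's handling of the real page may differ in details.

```python
import io

import pandas as pd

# Made-up data with two header rows, loosely mimicking the scraped table.
raw = (
    "x,x,y\n"           # row 0: the unwanted top header level
    "Rk,Team,Conf\n"    # row 1: the labels we actually want
    "1,Gonzaga,WCC\n"
    "2,Houston,Amer\n"
)

# header=[0, 1] keeps both rows and produces MultiIndex columns.
both = pd.read_csv(io.StringIO(raw), header=[0, 1])

# header=1 uses only the second row as labels and skips the row above it.
single = pd.read_csv(io.StringIO(raw), header=1)

print(type(both.columns).__name__)
print(list(single.columns))
```

Choosing the header row at read time avoids the separate flattening step altogether.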