Not standart excel to pandas data frame-CodePudding

I ve got not standart excel table with the help of openpyxl. I've done some part on the way to convert it to pandas dataframe. But now I'm stucked with problem. I want to select just range of columns rows and get data from them. Like take cells from 4 to 12 row, and column from j to x. I hope you understand me. Sorry for my english.

CodePudding user response：

You can try something like that:

df = pd.read_excel('data.xlsx', skiprows=4, usecols=['J:X'], nrows=9)

If the number of rows is not fixed, you can use your second column as delimiter.

df = pd.read_excel('data.xlsx', skiprows=4, usecols=['J:X'])
df = df[df.iloc[:, 1].notna()]

CodePudding user response：

you could skip the rows as you read the excel file to a Dataframe and initially drop the first 4 rows and then manipulate the Dataframe as follows.

first line is reading the file by skipping the first 4 rows
second line is dropping a range of rows from the dataframe (startRow and endRow being the integer values of the row index)
third line is dropping 2 columns from the dataframe

df = pd.read_excel('fileName.xlsx', skiprows=4)
df.drop([startRow, endRow], inplace=True)
df.drop(['column1', 'column2'], axis=1)