Home > Enterprise >  reading excel dataframe starting from specific location based on the cell value
reading excel dataframe starting from specific location based on the cell value

Time:10-19

Say I have the following Excel file:

A B C
1 11 111
2 22 start
3 33 333
4 44 444

I want to read the file in a dataframe making sure that I start to read it above the row where the Start value is. I tried the following:

df=df[(df.iloc[:, 2] == "start").shift(-1,fill_value=False).cummax()]

However, not only the row where start is located changes but also the col between the sheet in the workbook. Is there a way where I can look up the value I need to start with by using the col and row location?

that is what I want to have:

C
111
start
333
444

Thanks!

CodePudding user response:

Try:

#index of first row above "start"
row_index = df[df.shift(-1).eq("start").any(axis=1)].index.min()

#name of first column before the column that contains "start"
col_index = df.loc[:,df.shift(-1, axis=1).eq("start").any(0)].columns[0]

#select all rows and columns per the above indices.
>>> df.loc[row_index:, col_index:]

    B      C
0  11    111
1  22  start
2  33    333
3  44    444
  • Related