I can't understand why I get the error "IndexError: index 159 is out of bounds for axis 0 with size 159" while dropping a list of rows from a dataframe.
#Import file Excel
xls = pd.ExcelFile(file_path)
#Parse away the first 5 rows
df = xls.parse('Daten', skiprows=5, index_col=None, na_values=['NA'])
# Select row where value in column "Punktrolle_SO" is not 'UK_Schwelle_Wehr_Blockrampe'
row_numbers = [x + 1 for x in df[df['Punktrolle_SO'] != 'UK_Schwelle_Wehr_Blockrampe'].index]
#Changing the index to skip the index 0
df.index = df.index + 1
#Dropping the rows where the data are not 'UK_Schwelle_Wehr_Blockrampe'
dataframe = df.drop(df.index[row_numbers], inplace=True)
The list row_numbers contains the correct 156 values and the dataframe index goes from 1 to 159 so why do I get an IndexError?
runfile('O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses/ReadMultileFilesInOne.py', wdir='O:/GIS/GEP/Risikomanagement/Flussvermessung/ALD/Analyses')
Traceback (most recent call last):
File "O:\GIS\GEP\Risikomanagement\Flussvermessung\ALD\Analyses\ReadMultileFilesInOne.py", line 73, in <module>
dataframe = df.drop(df.index[row_numbers], inplace=True)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\range.py", line 708, in __getitem__
return super().__getitem__(key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3941, in __getitem__
result = getitem(key)
IndexError: index 159 is out of bounds for axis 0 with size 159
Can anyone help me see what I am doing wrong?
Thank you,
Davide
I expect a dataframe containing the rows of the Excel file where the value in the column "Punktrolle_SO" is equal to 'UK_Schwelle_Wehr_Blockrampe'.
CodePudding user response:
Wouldn't it be simpler to just keep the rows that contain UK_Schwelle_Wehr_Blockrampe, instead of dropping all the others? For example:
df[df["Punktrolle_SO"].str.contains("UK_Schwelle_Wehr_Blockrampe")]
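A minimal sketch of that filtering approach, assuming a small made-up dataframe in place of the parsed Excel sheet (the "value" column and the sample rows are hypothetical). Note that str.contains matches substrings and does not return a clean True/False for missing cells unless na=False is passed, so a plain == comparison may be closer to what the question asks for:

```python
import pandas as pd

# Hypothetical stand-in for the sheet parsed from Excel
df = pd.DataFrame({
    "Punktrolle_SO": ["UK_Schwelle_Wehr_Blockrampe", "Other", None,
                      "UK_Schwelle_Wehr_Blockrampe"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

target = "UK_Schwelle_Wehr_Blockrampe"

# Exact match: keeps only rows whose cell equals the target string
exact = df[df["Punktrolle_SO"] == target]

# Substring match: na=False treats missing cells as "no match"
partial = df[df["Punktrolle_SO"].str.contains(target, na=False)]

print(exact)
print(partial)
```

Both selections return a new dataframe and leave df unchanged, so no index arithmetic is needed.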
CodePudding user response:
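If the index has a size of 159, then the highest valid position is 158, because positions start at 0 instead of 1. You are trying to access a position one higher than the maximum.
After you shift the index by 1, the labels go from 1 to 159, but df.index[row_numbers] looks the values up by position, and the positions still go from 0 to 158. Thus position 159 is out of bounds. You need to offset your accesses by 1, or pass the labels to df.drop directly instead of going through df.index[...].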
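The difference between labels and positions can be reproduced on a tiny example. This is only a sketch with a made-up three-row dataframe (the rows "A" and "B" are hypothetical), not the original file:

```python
import pandas as pd

# Hypothetical three-row stand-in for the Excel data
df = pd.DataFrame({"Punktrolle_SO": ["A", "UK_Schwelle_Wehr_Blockrampe", "B"]})

# Shift the labels so they run 1..3; the positions are still 0..2
df.index = df.index + 1

# Labels of the rows that should be removed
labels = df[df["Punktrolle_SO"] != "UK_Schwelle_Wehr_Blockrampe"].index

# Using a label as a position fails for the last row: position 3 does not exist
try:
    df.index[[3]]
    raised = False
except IndexError:
    raised = True

# drop() expects labels, so they can be passed directly
kept = df.drop(labels)
print(kept)
```

Also note that drop(..., inplace=True) returns None, so assigning its result to a variable such as dataframe leaves that variable empty. Either use inplace=True without assignment, or assign the result of a plain drop as above.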