I have a function that searches for a term within a DataFrame, and I would like to return the integer-based index of the row where the term is found. I checked the docs and think that Index.get_loc() should do the trick, but my code raises an error:
df = pd.read_excel(os.path.join(MEDIA_ROOT, "files", file), sheet_name=sheet)
for col in df.columns:
    rowfilter = df[col].map(lambda i: str(i)).str.contains("quantity", case=False, regex=False)
    search_row = df[rowfilter].index.get_loc()  # This is a 'slice' of the rows containing the search term
    print(search_row)
    # The output should be 6
However, I am getting the following error:
Index.get_loc() missing 1 required positional argument: 'key'
I have tried the following:
search_row = df[rowfilter].index()
print(pd.Index.get_loc(search_row))
But I get the same error. So the question is: what is the correct key?
CodePudding user response:
As an aside, you can simplify your search:
mask = df[col].astype(str).str.contains('quantity', case=False, regex=False)
In any case, once you obtain your mask (which is a bool Series), you can get the numerical indices of all the matches:
ix = df.reset_index().index[mask]
This will be an Int64Index (in pandas 2.0 and later, a plain Index with dtype int64). You can also get it as a list:
ixs = ix.tolist()
In any case, you can use ix or ixs for .iloc (df.iloc[ix]).
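To tie this back to the original question about the key: Index.get_loc() expects an index label as its key and returns the integer position of that label. Below is a minimal sketch of both routes, assuming a small made-up DataFrame (the column name `item` and the index labels 10 to 40 are hypothetical, chosen so that labels and positions differ):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the index labels deliberately differ from the row positions.
df = pd.DataFrame(
    {"item": ["widget", "Quantity", "price", "total quantity"]},
    index=[10, 20, 30, 40],
)

mask = df["item"].astype(str).str.contains("quantity", case=False, regex=False)

# Integer positions of every matching row, as in the answer above.
ix = df.reset_index().index[mask]
ixs = ix.tolist()

# The same positions taken directly from the boolean mask.
positions = np.flatnonzero(mask.to_numpy()).tolist()

# Index.get_loc() needs a label as its key; here, the label of the first match.
first_label = df[mask].index[0]
first_pos = df.index.get_loc(first_label)

print(ixs, positions, first_label, first_pos)
```

With a unique index, get_loc() returns a single integer; with duplicate labels it can return a slice or a boolean mask instead, so the mask-based routes are the safer choice when several rows may match.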
CodePudding user response:
Since you haven't shared sample data, I am creating a test DataFrame:
df = pd.DataFrame({
"col" : [1,2,3,4]
})
You can get the index value with:
df[df.col == 1].index.values
It will return an array:
array([0])
If multiple rows match:
df[df.col.isin([1,2,])].index.values
It returns:
array([0, 1])
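One caveat worth illustrating: .index.values returns index labels, which match integer positions only when the DataFrame has the default 0..n-1 index. The sketch below reuses the test frame with hypothetical string labels ("a" to "d" are assumptions for illustration) and converts the matched labels to positions with Index.get_indexer(), which assumes the index is unique:

```python
import pandas as pd

# Same values as the test frame, but with hypothetical string labels.
df = pd.DataFrame({"col": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# Labels of the matching rows (not integer positions).
labels = df[df.col.isin([1, 2])].index.values

# Convert the labels to integer positions; requires a unique index.
positions = df.index.get_indexer(labels)

print(labels, positions)
```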
Hope it helps