Returning an integer-based index of a DataFrame

Time:08-08

I have a function that searches for a term within a DataFrame, and I would like to return the integer-based index of the row where the term is found. I checked the docs and think that Index.get_loc() should do the trick. This is my code:

df = pd.read_excel(os.path.join(MEDIA_ROOT, "files", file), sheet_name=sheet)
for col in df.columns:
    rowfilter = df[col].map(lambda i: str(i)).str.contains("quantity", case=False, regex=False)
    search_row = df[rowfilter].index.get_loc() #This is a 'slice' of the rows containing the search term
    print(search_row)
    #The output should be 6

However, I get the following error:

Index.get_loc() missing 1 required positional argument: 'key'

I have tried the following:

search_row = df[rowfilter].index()
print(pd.Index.get_loc(search_row))

But I get the same error. So the question is: what is the correct key?

CodePudding user response:

As an aside, you can simplify your search:

mask = df[col].astype(str).str.contains('quantity', case=False, regex=False)

In any case, once you obtain your mask (which is a bool Series), you can get the numerical indices of all the matches:

ix = df.reset_index().index[mask]

This will be an Int64Index (in pandas 2.0 and later, where Int64Index was removed, a plain Index with dtype int64). You can also get it as a list:

ixs = ix.tolist()

Either ix or ixs can then be passed to .iloc, for example df.iloc[ix].
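
To make this concrete, here is a minimal sketch on a made-up frame. The column name `item`, the string labels, and the data are assumptions for illustration, not the asker's spreadsheet. It also shows what Index.get_loc() expects: a single label to look up, which is why calling it with no argument fails with "missing 1 required positional argument: 'key'".

```python
import numpy as np
import pandas as pd

# Hypothetical data; the real frame comes from pd.read_excel in the question.
df = pd.DataFrame(
    {"item": ["bolt", "nut", "Quantity ordered", "washer"]},
    index=["a", "b", "c", "d"],
)

mask = df["item"].astype(str).str.contains("quantity", case=False, regex=False)
mask_arr = mask.to_numpy()  # plain boolean array, no index alignment involved

# Integer positions of every matching row, independent of the index labels.
positions = np.flatnonzero(mask_arr)

# The reset_index() approach from this answer gives the same positions.
ix = df.reset_index().index[mask_arr]

# get_loc maps ONE label to its position, so it needs that label as `key`.
first_label = df.index[mask_arr][0]
first_pos = df.index.get_loc(first_label)

print(positions.tolist(), ix.tolist(), first_label, first_pos)
```

If only the first match is needed, `positions[0]` avoids the round trip through a label, but it raises IndexError when nothing matches, so checking `mask.any()` first may be worthwhile.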

CodePudding user response:

Since you haven't shared sample data, I'm creating a test DataFrame:

df = pd.DataFrame({
    "col" : [1,2,3,4]
})

You can get the index value with:

df[df.col == 1].index.values

It returns an array:

array([0])

If multiple rows match:

df[df.col.isin([1, 2])].index.values

It returns:

array([0, 1])
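
One caveat: .index.values returns index labels, which coincide with integer positions only when the frame has the default 0..n-1 index, as the test frame here does. A small sketch with a hypothetical non-default index (the labels 10 to 40 are made up) shows the difference, using np.flatnonzero as one possible way to get positions:

```python
import numpy as np
import pandas as pd

# Same test column, but with made-up non-default labels.
df = pd.DataFrame({"col": [1, 2, 3, 4]}, index=[10, 20, 30, 40])

mask = df["col"].isin([1, 2])

# Index labels of the matching rows (what .index.values gives).
labels = df[mask].index.values

# Integer positions of the matching rows, suitable for .iloc.
positions = np.flatnonzero(mask.to_numpy())

print(labels, positions)
```

Here the labels are 10 and 20 while the positions are 0 and 1, so only `positions` is safe to pass to df.iloc.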

Hope it helps
