I have a Flask API in which the user can make a GET request with a name they input. I want to search for that name in two different columns, but I am not sure how to do that. The following does not work; it fails with the error 'cannot index with multidimensional key':
data = self.data.loc[self.data[['name-english','name_greek']] == name_cap].to_dict()
This is the part I'm talking about:
class Search(Resource):
    def __init__(self):
        self.data = pd.read_csv('datacsv')
    def get(self, name):
        name_cap = name.capitalize()
        data = self.data.loc[self.data['name-english'] == name_cap].to_dict()
        # return data found in csv
        return jsonify({'message': data})
So I want to search in both those columns instead of just one.
CodePudding user response:
It seems the problem is in your pandas DataFrame syntax, not in Flask itself. You are probably getting this error from pandas:
ValueError: cannot index with multidimensional key
According to pandas documentation:
.loc[] is primarily label based, but may also be used with a boolean array.
Allowed inputs are:
A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
A list or array of labels, e.g. ['a', 'b', 'c'].
A slice object with labels, e.g. 'a':'f'.
A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
An alignable boolean Series. The index of the key will be aligned before masking.
An alignable Index. The Index of the returned selection will be the input.
A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above)
In your example you are passing self.data[['name-english','name_greek']] == name_cap to .loc. Comparing a two-column selection with a value returns another DataFrame (of booleans), not a one-dimensional array of True and False or a boolean Series, so .loc rejects it.
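To see the difference concretely, here is a small sketch with a made-up DataFrame (the columns A and B and their values are assumptions for illustration, not data from the question). It compares the type of a two-column comparison with that of a single-column comparison and checks which one .loc accepts:

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "B": [3, 1, 2]})

two_col_mask = df[["A", "B"]] == 1  # comparison over two columns: boolean DataFrame
one_col_mask = df["A"] == 1         # comparison over one column: boolean Series

mask_types = (type(two_col_mask).__name__, type(one_col_mask).__name__)

# .loc refuses the two-dimensional mask ...
try:
    df.loc[two_col_mask]
    raised = False
except ValueError:
    raised = True

# ... but accepts the one-dimensional boolean Series.
filtered = df.loc[one_col_mask]
```

So the fix is to reduce the two-column comparison to a single boolean Series before handing it to .loc.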
To filter your dataframe based on multiple columns you can use bitwise operators (& and | for example):
df.loc[(df["A"] == 1) | (df["B"] == 1)]
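Applied to the two columns from the question, the | approach could look like the sketch below. The sample rows and the helper name search_either are assumptions made up for this example; only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV from the question.
df = pd.DataFrame({
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

def search_either(frame, name):
    """Return the rows where either name column equals `name`."""
    # Each comparison yields a boolean Series; the parentheses are required
    # because | binds more tightly than ==.
    mask = (frame["name-english"] == name) | (frame["name_greek"] == name)
    return frame.loc[mask]

result = search_either(df, "Sparti")
# orient="records" gives one dict per matching row, which is convenient for JSON.
records = result.to_dict(orient="records")
```

Inside the get method the same mask would be built from self.data and name_cap.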
Or use the built-in isin() method:
Whether each element in the DataFrame is contained in values.
Returns: DataFrame DataFrame of booleans showing whether each element in the DataFrame is contained in values.
Together with any():
Return whether any element is True, potentially over an axis.
Returns: Series or DataFrame If level is specified, then, DataFrame is returned; otherwise, Series is returned.
This way you get a boolean Series to pass to .loc, for example:
df.loc[df.isin([1]).any(axis=1)]
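Restricting isin() to just the columns of interest avoids matching the value in unrelated columns. A minimal sketch, assuming made-up sample rows, an extra id column, and a hypothetical helper named search_columns:

```python
import pandas as pd

# Hypothetical sample rows; only the two name columns come from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name-english": ["Athens", "Sparta", "Corinth"],
    "name_greek": ["Athina", "Sparti", "Korinthos"],
})

def search_columns(frame, name, columns):
    """Return the rows where any of the given columns equals `name`."""
    # isin() gives a boolean DataFrame; any(axis=1) collapses it to one
    # boolean per row, which .loc accepts.
    mask = frame[columns].isin([name]).any(axis=1)
    return frame.loc[mask]

name_columns = ["name-english", "name_greek"]
by_english = search_columns(df, "Corinth", name_columns)
by_greek = search_columns(df, "Athina", name_columns)
```

Passing the column list as a parameter makes it easy to add a third column later without changing the mask logic.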
Also, something that always helps me a lot when dealing with DataFrames is using Jupyter to test things first. I think it's faster, and you can experiment more freely with the DataFrame to discover new ways to do what you need.