I'm trying to use loc to get a subset of rows in a DataFrame based on a condition, but I want to take the condition from user input and then feed it into the loc statement to create the subset of rows.
I've tried many ways, but I don't think loc will accept its condition as a string in this format. Is there a way around this?
See my attempts below:
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
user_input_test.append(col_one)
one_condition_input = self.df.loc[self.df[user_input_test],:]
# I also tried to use slice but no luck:
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
period = slice(col_one)
self.one_condition_input = self.df.loc[period,:]
# And I tried to use format, taking two user inputs, one with the column name and one with the condition, but again no luck:
col_one = input('Please enter the column you would like to set. E.g. State: ')
col_two = input('Please enter the condition you would like to set. E.g. == "New York": ')
one_condition_input = self.df.loc[self.df["{}".format(col_one)]"{}".format(col_two),:]
I want to be able to take user input of the whole condition and paste it like this:
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
self.one_condition_input = self.df.loc[df.col_one,:]
But obviously col_one is not an attribute of df here, so that doesn't work.
CodePudding user response:
Try pandas.DataFrame.query(), which accepts a boolean expression as a string. You can ask the user to enter the expression and then pass it straight to the method.
expr = input()
df.query(expr, inplace = True)
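For illustration, here is a minimal sketch of the same idea on a small made-up DataFrame. The column names, the sample data and the filter_by_expression helper are assumptions for the example, not part of the question. It returns a new DataFrame instead of using inplace=True, so the original data stays intact if the user wants to try another condition.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame comes from the asker's code.
df = pd.DataFrame(
    {
        "State": ["New York", "Texas", "New York"],
        "Sales": [10, 20, 30],
    }
)


def filter_by_expression(frame, expr):
    """Return the rows of `frame` for which the query expression is true."""
    # query() parses the string and evaluates it against the column names.
    return frame.query(expr)


# In the real program this string would come from input().
expr = 'State == "New York"'
subset = filter_by_expression(df, expr)
print(subset)
```

Note that the user has to quote string values inside the expression, exactly as in the prompt text from the question. Since query() evaluates whatever expression it is given, this is probably best kept to input you trust.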
CodePudding user response:
DataFrame.loc (property): Access a group of rows and columns by label(s) or a boolean array.
DataFrame.iloc (property): Purely integer-location based indexing for selection by position.
A text string passed to loc is treated as a label to look up, not evaluated as a condition. I would advise you to keep the user input, but split it into a column name and a value and build the condition from those. So, rather than this:
user_input_test.append(col_one)
one_condition_input = df.loc[df[user_input_test],:]
Instead:
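import re

user_input_test.append(col_one)
cond = re.findall(r'\w+', user_input)  # split the input into words, dropping the operator
col = cond[0]                          # first word is the column name
col_element = " ".join(cond[1:])       # remaining words form the value to match
one_condition_input = df.loc[df[col] == col_element, :]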
.
.
.
>>> user_input = "State == New York" # User input value
>>> cond = re.findall(r'\w+', user_input) # Split into words
>>> cond
['State', 'New', 'York']
>>> # This is equivalent to df.loc[df["State"] == "New York"]
>>> one_condition_input = df.loc[df[col] == col_element, :] # Rows whose "State" column equals "New York".
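The snippet above only handles equality, because the regular expression throws the operator away. As a rough sketch of how the same parsing idea could keep the operator, the example below maps a few comparison symbols to functions from the standard operator module and builds the boolean mask that loc expects. The names OPS, PATTERN, parse_condition and select_rows, as well as the sample data, are hypothetical and only there for illustration.

```python
import operator
import re

import pandas as pd

# Supported comparison symbols mapped to the functions that implement them.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

# column name, operator, then everything else as the value
PATTERN = re.compile(r"^\s*(\w+)\s*(==|!=|>=|<=|>|<)\s*(.+?)\s*$")


def parse_condition(text):
    """Split 'State == "New York"' into ('State', '==', 'New York')."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"Cannot parse condition: {text!r}")
    col, op, value = match.groups()
    # Drop surrounding quotes so both New York and "New York" are accepted.
    return col, op, value.strip("\"'")


def select_rows(frame, text):
    """Return the rows of `frame` matching a simple 'column op value' condition."""
    col, op, value = parse_condition(text)
    if col not in frame.columns:
        raise KeyError(f"Unknown column: {col!r}")
    series = frame[col]
    # Compare numbers with numbers when the column is numeric.
    if pd.api.types.is_numeric_dtype(series):
        value = float(value)
    mask = OPS[op](series, value)  # boolean Series, one entry per row
    return frame.loc[mask, :]


# Hypothetical sample data for the example.
df = pd.DataFrame(
    {"State": ["New York", "Texas", "New York"], "Sales": [10, 20, 30]}
)
by_state = select_rows(df, 'State == "New York"')
by_sales = select_rows(df, "Sales >= 20")
print(by_state)
print(by_sales)
```

Compared with query(), this only accepts a single "column operator value" condition, but nothing from the user's text is ever evaluated as code.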