In a loop, I have built a string to use as a filter condition on the columns of a pandas DataFrame:
conditions = pd.DataFrame()
for index, fold in df.iterrows():
    first_row = True
    for elem in fold['include_months']:
        if first_row:
            condition = f"('MonthNumber_{elem}' == 1)"
            first_row = False
        else:
            condition += f" | ('MonthNumber_{elem}' == 1)"
But I get an error when I apply the condition to the pandas DataFrame:
X_train = X_train[condition]
KeyError: "(X_train['MonthNumber_1'] == 1)| (X_train['MonthNumber_2'] == 1)"
How can I fix it, please?
CodePudding user response:
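Make condition a boolean mask, not a string. Indexing with a string makes pandas look for a column with that exact name, which is why you get the KeyError: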
condition = (X_train['MonthNumber_1'] == 1) | (X_train['MonthNumber_2'] == 1)
# since you're overwriting your variable, it's best to make a copy
X_train = X_train[condition].copy()
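Since the list of months comes from a loop, the mask can be combined programmatically rather than typed out. Here is a minimal sketch, assuming the columns follow the MonthNumber_<n> naming from the question. The sample X_train and the include_months values are made up for illustration:

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample data standing in for the real X_train
X_train = pd.DataFrame({
    "MonthNumber_1": [1, 0, 0, 0],
    "MonthNumber_2": [0, 1, 0, 0],
    "MonthNumber_3": [0, 0, 1, 0],
    "value": [10, 20, 30, 40],
})
include_months = [1, 2]  # assumed contents of fold['include_months']

# One boolean Series per month, then OR them all together
masks = [X_train[f"MonthNumber_{m}"] == 1 for m in include_months]
condition = reduce(operator.or_, masks)

# Keep only the rows where at least one of the month flags is set
filtered = X_train[condition].copy()
print(filtered["value"].tolist())
```

Note that reduce raises a TypeError on an empty list, so if include_months can be empty you would need to pass an initial value such as pd.Series(False, index=X_train.index).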
You can also use query if you want to keep the condition as a string:
condition = 'MonthNumber_1 == 1 | MonthNumber_2 == 1'
X_train = X_train.query(condition).copy()
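If the string has to be built in a loop as in the question, joining the pieces with " | " avoids the first_row bookkeeping. This is only a sketch under the same assumptions as before: the sample X_train and include_months are hypothetical, and the column names follow the MonthNumber_<n> pattern:

```python
import pandas as pd

# Hypothetical sample data standing in for the real X_train
X_train = pd.DataFrame({
    "MonthNumber_1": [1, 0, 0, 0],
    "MonthNumber_2": [0, 1, 0, 0],
    "MonthNumber_3": [0, 0, 1, 0],
    "value": [10, 20, 30, 40],
})
include_months = [1, 2]  # assumed contents of fold['include_months']

# Column names are written bare (no quotes) inside a query string
condition = " | ".join(f"(MonthNumber_{m} == 1)" for m in include_months)

# query evaluates the string against the DataFrame's columns
filtered = X_train.query(condition).copy()
print(condition)
print(filtered["value"].tolist())
```

The parentheses around each comparison are not strictly required by query, but they keep the generated string easy to read.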