How to assign string condition to a dataframe column condition?

Time:06-12

In a loop, I have built a string to be used as a row filter on a pandas DataFrame:

conditions = pd.DataFrame()

for index, fold in df.iterrows():
    first_row = True
    for elem in fold['include_months']:
        if first_row: 
            condition = f"(X_train['MonthNumber_{elem}'] == 1)"
            first_row = False
        else:
            # append to the existing string instead of overwriting it
            condition += f"| (X_train['MonthNumber_{elem}'] == 1)"
        

But I get an error when I apply the condition to the pandas DataFrame:

X_train = X_train[condition]

KeyError: "(X_train['MonthNumber_1'] == 1)| (X_train['MonthNumber_2'] == 1)"

How can I fix this?

Answer:

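Make the condition a boolean mask rather than a string. Indexing a DataFrame with a string looks up a column with that name, which is why you get a KeyError.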

condition = (X_train['MonthNumber_1'] == 1) | (X_train['MonthNumber_2'] == 1)

# since you're overwriting your variable, it's best to make a copy
X_train = X_train[condition].copy()
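
If the list of months is only known at run time, as in the question's loop, the mask can be built dynamically. The following is a minimal sketch, assuming hypothetical sample data in place of X_train and a plain list `include_months` standing in for fold['include_months']:

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample data standing in for X_train
X_train = pd.DataFrame({
    "MonthNumber_1": [1, 0, 0, 0],
    "MonthNumber_2": [0, 1, 0, 0],
    "MonthNumber_3": [0, 0, 1, 0],
    "value": [10, 20, 30, 40],
})
include_months = [1, 2]  # assumed stand-in for fold['include_months']

# One boolean Series per month, combined with | into a single mask
masks = [X_train[f"MonthNumber_{m}"] == 1 for m in include_months]
condition = reduce(operator.or_, masks)

filtered = X_train[condition].copy()
print(filtered)
```

Note that reduce raises a TypeError if `include_months` is empty, so that case would need its own handling.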

You can also use DataFrame.query if you want to keep the condition as a string:

condition = 'MonthNumber_1 == 1 | MonthNumber_2 == 1'
X_train = X_train.query(condition).copy()
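
The query string can likewise be assembled from a list of months. This is a sketch under the same assumptions as the question (columns named MonthNumber_<n>), using hypothetical sample data and a hypothetical `include_months` list; the parentheses around each comparison are added only for readability:

```python
import pandas as pd

# Hypothetical sample data standing in for X_train
X_train = pd.DataFrame({
    "MonthNumber_1": [1, 0, 0, 0],
    "MonthNumber_2": [0, 1, 0, 0],
    "MonthNumber_3": [0, 0, 1, 0],
    "value": [10, 20, 30, 40],
})
include_months = [1, 2]  # assumed stand-in for fold['include_months']

# Join one comparison per month with | to form the query expression
condition = " | ".join(f"(MonthNumber_{m} == 1)" for m in include_months)

filtered = X_train.query(condition).copy()
print(condition)
print(filtered)
```

Inside query, column names are written bare, without the X_train[...] prefix used in the boolean-mask form.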