Algorithm for computing all boolean combination DataFrame, with

I have a DataFrame which contains a few thousand columns with tens of thousand of rows (the amount of rows I would like to increase later). All the values in the DataFrame are Boolean.

import numpy as np
import pandas as pd

column_amount = 6 # ambigue example value
row_amount = 10 # ambigue example value
df = pd.DataFrame([np.random.choice([True, False], column_amount) for i in range(row_amount)])
df

    0       1       2       3       4       5
0   False   False   True    False   False   False
1   True    True    True    True    False   False
2   True    False   True    False   False   False
3   True    True    True    True    True    True
4   True    False   True    False   False   False
5   True    False   False   True    False   False
6   True    True    True    True    False   True
7   False   True    True    True    True    True
8   True    False   True    True    False   False
9   True    True    False   False   True    True

All columns stand for signals at all available timestamps, every column therefore has a reward value ranging from 0 to 1000 (for example 453). The reward value is calculated by a function which does not matter for this part of the problem.

def get_reward(column):
    reward = ...(column)
    return reward

That reward value is what I would like to optimise by a combination of columns. These columns can be merged by using AND or OR expressions and a specific order of operations, obtaining a new reward value.

Lets say the reward values for columns 0 1 2 are 100 200 and 300 respectively. Combining column 0 and 1 with 'OR', would give an ambigue reward value of for example 250. But combining column 0 'OR' column 2 makes for a worse combined reward value of 80. The columns don't have any relation with eachother and are completely independent (the columns for the sake of argument could be shuffled).

# An example solution of combinations
combination_column = (df[5] | df[1]) & ((df[4] | df[2]) & df[3])
reward = get_reward(combination_column) # example reward value of 3000

I need to find the most optimal combination_column (highest reward metric) with an efficient algorithm, otherwise the search space could get very big very fast. The solution will probably be some form of combination of tens of different columns using & or |. The problem is that a greedy algorithm only exploring the columns with the very highest reward metrics could very much not be the best solution. Because there could be a combination of let's say two columns with all 100 reward value, which together get a reward value of thousands. But I would think the higher the reward value, the higher the chance of getting a good combination reward value. So taken this into account I would prefer if all possible combinations would be calculated but with an efficient order. That way I could implement an early stop function myself to abort the search space if it goes out of hand and doesn't look that promising anymore (it hopefully found the best solution by then).

from itertools import permutations

Permutations could be helpful to make all possible combinations of all the columns. But I am stuck how to efficiently go through the search space, combined with the possibility of 'AND' 'OR' and brackets for order of operations. So I am hoping a genius on the internet sees this and would help me with this problem. If anything is not clear I would greatly elaborate further, but I wanted to keep the problem as consise as possible.

CodePudding user response：

I have come up with a greedy solution that is acceptable for my use case. Just in case someone comes across a problem alike, maybe this might help you:

"""
Sort all columns by reward_values
Make all combinations of 2 columns and add to dataframe, if satisfies given rules
Repeat until a full trial doesn't add a new column

RULES: Don't try combinations in which a starting column exists on both sides
RULES: Don't include columns with a negative training reward_value
RULES: Don't include equal columns
RULES: Don't include a new column if the new training reward_value is not better than the max of the two originals
"""