I have this loop that takes every combination of two or more columns in a dataset and creates sub-datasets.
df
NAME VALUE1 VALUE2
0 Alpha 100 A1
1 Alpha 100 A1
2 Alpha 200 A2
import itertools
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        print(df[list(cols)])
output:
NAME VALUE 1
0 Alpha 100
1 Alpha 100
2 Alpha 200
NAME VALUE 2
0 Alpha A1
1 Alpha A1
2 Alpha A2
VALUE 1 VALUE 2
0 100 A1
1 100 A1
2 200 A2
NAME VALUE 1 VALUE 2
0 Alpha 100 A1
1 Alpha 100 A1
2 Alpha 200 A2
I am trying to convert each row to a list or array like this:
[
['Alpha', 100],['Alpha', 100],['Alpha', 200],
['Alpha', 'A1'],['Alpha', 'A1'],['Alpha', 'A2'],
[100, 'A1'],[100, 'A1'],[200, 'A2'],
['Alpha', 100, 'A1'], ['Alpha', 200, 'A2']
]
I'm thinking:
I could first convert my df to a NumPy array and write a better loop using bracket indexing, which would eliminate the need to flatten anything. I just don't know how to do that.
Or I could convert each row to a list or array.
How can I do this?
CodePudding user response:
You can convert each row into a list like this:
import itertools
res = []
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        # Transpose so each original row becomes a column, then collect each column as a list
        res += df[list(cols)].T.to_dict('list').values()
print(res)
Output:
[['Alpha', 100], ['Alpha', 100], ['Alpha', 200], ['Alpha', 'A1'], ['Alpha', 'A1'], ['Alpha', 'A2'], [100, 'A1'], [100, 'A1'], [200, 'A2'], ['Alpha', 100, 'A1'], ['Alpha', 100, 'A1'], ['Alpha', 200, 'A2']]
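As a possible alternative along the lines of the NumPy idea in the question, the rows of each column subset can be pulled out with `to_numpy().tolist()`, which avoids the transpose and the intermediate dict. This is only a sketch: the DataFrame below is rebuilt by hand from the sample shown in the question, and the column names `VALUE1` and `VALUE2` are assumed from its header line.

```python
import itertools

import pandas as pd

# Sample data rebuilt from the question (column names are an assumption)
df = pd.DataFrame({
    "NAME": ["Alpha", "Alpha", "Alpha"],
    "VALUE1": [100, 100, 200],
    "VALUE2": ["A1", "A1", "A2"],
})

res = []
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        # to_numpy() gives a 2-D array of the selected columns;
        # tolist() turns it into one plain list per row
        res.extend(df[list(cols)].to_numpy().tolist())

print(res)
```

Because every row is kept, duplicate rows such as the two `['Alpha', 100]` entries appear twice, just as in the output above.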