Attempting to make a Complex For Loop in Python-CodePudding

I have this dataset:

     NAME       VALUE1  VALUE2  
0    Alpha      100     A1
1    Alpha      100     A1
2    Alpha      200     A2

I would like a for loop that can loop through each individual value and make every possible solution, without repeating, and then push each pair into an array, like so:

[
[ALPHA, 100]
[ALHA, 100]
[ALPHA, 200]

[ALPHA, A1] 
[ALPHA, A1]
[ALPHA, A2]

[100, A1]
[100, A1]
[200, A2]

[ALPHA, 100, A1]
[ALPHA, 100, A1]
[ALPHA, 200, A2]
]

I currently have this which gives me my desired result, but if I was to consume a larger dataset, I'm thinking there must be an easier way to automate the process.

A = []
for index, row in df.iterrows():
    A.append([df.iloc[index,0], df.iloc[index,1]])
    
for index, row in df.iterrows():
    A.append([df.iloc[index,0], df.iloc[index,2]])
    
for index, row in df.iterrows():
    A.append([df.iloc[index,1], df.iloc[index,2]])
    
for index, row in df.iterrows():
    A.append([df.iloc[index,0], df.iloc[index, 1], df.iloc[index,2]])

CodePudding user response：

This is easier without converting to pandas first, but this will do it:

import pandas as pd
import itertools

data = [
    ['Alpha', 100, 'A1'],
    ['Alpha', 100, 'A1'],
    ['Alpha', 200, 'A2']
]
df = pd.DataFrame(data, columns=['NAME','VALUE1','VALUE2'])
print(df)

# Extract the columns.
c1 = set(df['NAME'].tolist())
c2 = set(df['VALUE1'].tolist())
c3 = set(df['VALUE2'].tolist())

for z in itertools.product(c1,c2):
    print(z)
for z in itertools.product(c1,c3):
    print(z)
for z in itertools.product(c2,c3):
    print(z)
for z in itertools.product(c1,c2,c3):
    print(z)

Output:

    NAME  VALUE1 VALUE2
0  Alpha     100     A1
1  Alpha     100     A1
2  Alpha     200     A2
('Alpha', 200)
('Alpha', 100)
('Alpha', 'A1')
('Alpha', 'A2')
(200, 'A1')
(200, 'A2')
(100, 'A1')
(100, 'A2')
('Alpha', 200, 'A1')
('Alpha', 200, 'A2')
('Alpha', 100, 'A1')
('Alpha', 100, 'A2')

itertools.product is relatively recent. You can actually do the two-level loops using itertools.permutations instead of three different loops.

FOLLOWUP

Here's how you generalize it to more than 3 columns. Basically, in a loop, you do 2 at a time, then 3 at a time, etc. itertools.combinations can return all the combinations we need.

import itertools
import pandas as pd

data = [
    ['Alpha', 100, 'A1', 999],
    ['Alpha', 100, 'A1', 888],
    ['Alpha', 200, 'A2', 777]
]

columns=['NAME','VALUE1','VALUE2','VALUE3']
df = pd.DataFrame(data, columns=columns)
print(df)

for r in range(2,len(columns) 1):
    for cols in itertools.combinations( columns, r ):
        cdata = [set(df[c].tolist()) for c in cols]
        for z in itertools.product(*cdata):
            print(z)

Output:

    NAME  VALUE1 VALUE2  VALUE3
0  Alpha     100     A1     999
1  Alpha     100     A1     888
2  Alpha     200     A2     777
('Alpha', 200)
('Alpha', 100)
('Alpha', 'A1')
('Alpha', 'A2')
('Alpha', 888)
('Alpha', 777)
('Alpha', 999)
(200, 'A1')
(200, 'A2')
(100, 'A1')
(100, 'A2')
(200, 888)
(200, 777)
(200, 999)
(100, 888)
(100, 777)
(100, 999)
('A1', 888)
('A1', 777)
('A1', 999)
('A2', 888)
('A2', 777)
('A2', 999)
('Alpha', 200, 'A1')
('Alpha', 200, 'A2')
('Alpha', 100, 'A1')
('Alpha', 100, 'A2')
('Alpha', 200, 888)
('Alpha', 200, 777)
('Alpha', 200, 999)
('Alpha', 100, 888)
('Alpha', 100, 777)
('Alpha', 100, 999)
('Alpha', 'A1', 888)
('Alpha', 'A1', 777)
('Alpha', 'A1', 999)
('Alpha', 'A2', 888)
('Alpha', 'A2', 777)
('Alpha', 'A2', 999)
(200, 'A1', 888)
(200, 'A1', 777)
(200, 'A1', 999)
(200, 'A2', 888)
(200, 'A2', 777)
(200, 'A2', 999)
(100, 'A1', 888)
(100, 'A1', 777)
(100, 'A1', 999)
(100, 'A2', 888)
(100, 'A2', 777)
(100, 'A2', 999)
('Alpha', 200, 'A1', 888)
('Alpha', 200, 'A1', 777)
('Alpha', 200, 'A1', 999)
('Alpha', 200, 'A2', 888)
('Alpha', 200, 'A2', 777)
('Alpha', 200, 'A2', 999)
('Alpha', 100, 'A1', 888)
('Alpha', 100, 'A1', 777)
('Alpha', 100, 'A1', 999)
('Alpha', 100, 'A2', 888)
('Alpha', 100, 'A2', 777)
('Alpha', 100, 'A2', 999)

FOLLOWUP #2

This creates "sub dataframes" with every combination of columns. I'll leave it as an exercise for the reader to convert those to lists of rows.

import itertools
import pandas as pd

data = [
    ['Alpha', 100, 'A1',999],
    ['Alpha', 100, 'A1',888],
    ['Alpha', 200, 'A2',777]
]

columns=['NAME','VALUE1','VALUE2','VALUE3']
df = pd.DataFrame(data, columns=columns)
print(df)

for r in range(2,len(columns) 1):
    for cols in itertools.combinations( columns, r ):
        print(df[list(cols)])

Output:

    NAME  VALUE1 VALUE2  VALUE3
0  Alpha     100     A1     999
1  Alpha     100     A1     888
2  Alpha     200     A2     777
    NAME  VALUE1
0  Alpha     100
1  Alpha     100
2  Alpha     200
    NAME VALUE2
0  Alpha     A1
1  Alpha     A1
2  Alpha     A2
    NAME  VALUE3
0  Alpha     999
1  Alpha     888
2  Alpha     777
   VALUE1 VALUE2
0     100     A1
1     100     A1
2     200     A2
   VALUE1  VALUE3
0     100     999
1     100     888
2     200     777
  VALUE2  VALUE3
0     A1     999
1     A1     888
2     A2     777
    NAME  VALUE1 VALUE2
0  Alpha     100     A1
1  Alpha     100     A1
2  Alpha     200     A2
    NAME  VALUE1  VALUE3
0  Alpha     100     999
1  Alpha     100     888
2  Alpha     200     777
    NAME VALUE2  VALUE3
0  Alpha     A1     999
1  Alpha     A1     888
2  Alpha     A2     777
   VALUE1 VALUE2  VALUE3
0     100     A1     999
1     100     A1     888
2     200     A2     777
    NAME  VALUE1 VALUE2  VALUE3
0  Alpha     100     A1     999
1  Alpha     100     A1     888
2  Alpha     200     A2     777