Listing Column Names with Itertools

Time:07-01

I have this Dataframe:

    STATE CITY      TAX_C   MATERIAL    IG  LIMIT
0   TX    DALLAS    1       CARP        0   5
1   TX    DALLAS    1       BLAY        0   10

And I've created a loop using itertools that takes the combinations of every column from each row:

import itertools

res = []
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        res += df[list(cols)].T.to_dict('list').values()
res

And it gives me this output:

[[TX, DALLAS], [TX, DALLAS], [DALLAS, 1], [DALLAS, 1], [1, CARP], [1, BLAY], [CARP, 0], [0, 5], [TX, 1],...]

I am trying to get an output that prints out the Column name before each value like so:

[[STATE: 'TX', CITY: 'DALLAS'], [STATE: 'TX', CITY: 'DALLAS'], [CITY: 'DALLAS', TAX_C: '1'], [CITY: 'DALLAS', TAX_C: '1'], [TAX_C: '1', MATERIAL: 'CARP']...]

CodePudding user response:

I reproduced your data like so:

import pandas as pd

data = [["TX", "TX"], ["DALLAS", "DALLAS"], [1, 1], ["CARP", "BLAY"], [0, 0], [5, 15]]
df = pd.DataFrame(data).T
df.columns = ["STATE", "CITY", "TAX_C", "MATERIAL", "IG", "LIMIT"]

I think the first step is to look more closely at how you can get a dictionary out of the DataFrame:

for key, value in df.to_dict(orient="index").items():
    print(value)

Which yields

{'STATE': 'TX', 'CITY': 'DALLAS', 'TAX_C': 1, 'MATERIAL': 'CARP', 'IG': 0, 'LIMIT': 5}
{'STATE': 'TX', 'CITY': 'DALLAS', 'TAX_C': 1, 'MATERIAL': 'BLAY', 'IG': 0, 'LIMIT': 15}

Going one step further, you can loop over each row's items and append every pair of adjacent (column, value) entries to a list:

results = []
for key, value in df.to_dict(orient="index").items():
    row = list(value.items())  # [(column, value), ...] for this row
    for nr in range(len(value) - 1):
        results.append([list(row[nr]), list(row[nr + 1])])

yielding

[[['STATE', 'TX'], ['CITY', 'DALLAS']],
 [['CITY', 'DALLAS'], ['TAX_C', 1]],
 [['TAX_C', 1], ['MATERIAL', 'CARP']],
 [['MATERIAL', 'CARP'], ['IG', 0]],
 [['IG', 0], ['LIMIT', 5]],
 [['STATE', 'TX'], ['CITY', 'DALLAS']],
 [['CITY', 'DALLAS'], ['TAX_C', 1]],
 [['TAX_C', 1], ['MATERIAL', 'BLAY']],
 [['MATERIAL', 'BLAY'], ['IG', 0]],
 [['IG', 0], ['LIMIT', 15]]]

Please note that the output you describe is not valid Python. An object is either a list or a dictionary, and the elements of a list are separated by commas only, without keys.
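
If you do want the column name attached to each value, each list of [column, value] pairs can be turned into a dictionary with the built-in dict(). A minimal sketch, using two hand-written entries in the same shape as the results list above (the names sample_results and named are only illustrative):

```python
# Two entries shaped like the `results` list above:
# each entry is a list of [column, value] pairs.
sample_results = [
    [["STATE", "TX"], ["CITY", "DALLAS"]],
    [["CITY", "DALLAS"], ["TAX_C", 1]],
]

# dict() accepts any iterable of 2-element sequences,
# so each entry becomes {column: value, ...}.
named = [dict(pairs) for pairs in sample_results]

print(named)
# [{'STATE': 'TX', 'CITY': 'DALLAS'}, {'CITY': 'DALLAS', 'TAX_C': 1}]
```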

I hope this helps :)

CodePudding user response:

Try the following code:

res = []
for r in range(2, df.columns.size + 1):
    for cols in itertools.combinations(df.columns, r):
        res += df[list(cols)].T.to_dict().values()
res

The difference is that I dropped the argument from to_dict, so it uses the default orientation, 'dict', which keeps the column names as keys instead of returning bare lists of values.
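
To make the effect of the orientation concrete, here is a small sketch on a two-column frame (the frame and the variable names as_list and as_dict are made up for illustration) comparing the two calls on the transposed data:

```python
import pandas as pd

df_small = pd.DataFrame({"STATE": ["TX", "TX"], "CITY": ["DALLAS", "DALLAS"]})

# After transposing, the column names become the index
# and the original row labels (0, 1) become the columns.
t = df_small.T

# orient="list": {row_label: [values]}, so the column names are lost.
as_list = t.to_dict("list")

# Default orient="dict": {row_label: {column_name: value}}.
as_dict = t.to_dict()

print(as_list)  # {0: ['TX', 'DALLAS'], 1: ['TX', 'DALLAS']}
print(as_dict)  # {0: {'STATE': 'TX', 'CITY': 'DALLAS'}, 1: {'STATE': 'TX', 'CITY': 'DALLAS'}}
```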

The initial part of the result is:

[{'STATE': 'TX', 'CITY': 'DALLAS'},
 {'STATE': 'TX', 'CITY': 'DALLAS'},
 {'STATE': 'TX', 'TAX_C': 1},
 {'STATE': 'TX', 'TAX_C': 1},
 {'STATE': 'TX', 'MATERIAL': 'CARP'},
 {'STATE': 'TX', 'MATERIAL': 'BLAY'},
 {'STATE': 'TX', 'IG': 0},
 {'STATE': 'TX', 'IG': 0},
 {'STATE': 'TX', 'LIMIT': 5},
 {'STATE': 'TX', 'LIMIT': 10},

so it is a list of dictionaries, quite similar to your desired result.
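
As a possible variation (not part of the answer above), the transpose can be avoided by asking to_dict for one dictionary per row with orient="records". This sketch rebuilds the question's data by hand, so the frame construction is an assumption about how the original df was created:

```python
import itertools

import pandas as pd

# Data copied from the question; the construction itself is assumed.
df = pd.DataFrame({
    "STATE": ["TX", "TX"],
    "CITY": ["DALLAS", "DALLAS"],
    "TAX_C": [1, 1],
    "MATERIAL": ["CARP", "BLAY"],
    "IG": [0, 0],
    "LIMIT": [5, 10],
})

res = []
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        # orient="records" yields one {column: value} dict per row.
        res.extend(df[list(cols)].to_dict("records"))

print(res[:2])   # the two rows for the (STATE, CITY) combination
print(len(res))  # 57 column combinations x 2 rows
```

The order of the result should match the loop above: all rows for one column combination, then the next combination.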
