I have this Dataframe:
  STATE    CITY  TAX_C MATERIAL  IG  LIMIT
0    TX  DALLAS      1     CARP   0      5
1    TX  DALLAS      1     BLAY   0     10
And I've created a loop using itertools that takes the combinations of every column from each row:
import itertools

res = []
for r in range(2, len(df.columns) + 1):
    for cols in itertools.combinations(df.columns, r):
        res += df[list(cols)].T.to_dict('list').values()
res
And it gives me this output:
[[TX, DALLAS], [TX, DALLAS], [DALLAS, 1], [DALLAS, 1], [1, CARP], [1, BLAY], [CARP, 0], [0, 5], [TX, 1],...]
I am trying to get output that shows the column name before each value, like so:
[[STATE: 'TX', CITY: 'DALLAS'], [STATE: 'TX', CITY: 'DALLAS'], [CITY: 'DALLAS', TAX_C: '1'], [CITY: 'DALLAS', TAX_C: '1'], [TAX_C: '1', MATERIAL: 'CARP']...]
CodePudding user response:
So I reproduced your data like so:
import pandas as pd

data = [["TX", "TX"], ["DALLAS", "DALLAS"], [1, 1], ["CARP", "BLAY"], [0, 0], [5, 15]]
df = pd.DataFrame(data).T
df.columns = ["STATE", "CITY", "TAX_C", "MATERIAL", "IG", "LIMIT"]
I think the first step is to look more closely at how you can get a dictionary out of the DataFrame:
for key, value in df.to_dict(orient="index").items():
    print(value)
which yields:
{'STATE': 'TX', 'CITY': 'DALLAS', 'TAX_C': 1, 'MATERIAL': 'CARP', 'IG': 0, 'LIMIT': 5}
{'STATE': 'TX', 'CITY': 'DALLAS', 'TAX_C': 1, 'MATERIAL': 'BLAY', 'IG': 0, 'LIMIT': 15}
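To see why the column names disappear in the question's output, it may help to compare two orientations side by side. This is a minimal sketch on a two-column slice of the data; the names `as_lists` and `as_dicts` are made up here for illustration:

```python
import pandas as pd

# Two-column slice of the sample data, enough to show the difference.
df = pd.DataFrame({"STATE": ["TX", "TX"], "CITY": ["DALLAS", "DALLAS"]})

# orient="list" on the transposed frame keeps only the values of each row,
# so the column names are lost (this is what the question's loop does).
as_lists = list(df.T.to_dict(orient="list").values())

# orient="index" keeps a {column: value} mapping for each row.
as_dicts = list(df.to_dict(orient="index").values())

print(as_lists)
print(as_dicts)
```

The first print shows bare value lists, the second shows one dictionary per row with the column names as keys.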
Going one step further, you can loop over each row's items and append pairs of adjacent columns to a list, like so:
results = []
for key, value in df.to_dict(orient="index").items():
    row = list(value.items())
    # pair each (column, value) item with its right-hand neighbour
    for nr in range(len(value) - 1):
        results.append([list(row[nr]), list(row[nr + 1])])
yielding
[[['STATE', 'TX'], ['CITY', 'DALLAS']],
[['CITY', 'DALLAS'], ['TAX_C', 1]],
[['TAX_C', 1], ['MATERIAL', 'CARP']],
[['MATERIAL', 'CARP'], ['IG', 0]],
[['IG', 0], ['LIMIT', 5]],
[['STATE', 'TX'], ['CITY', 'DALLAS']],
[['CITY', 'DALLAS'], ['TAX_C', 1]],
[['TAX_C', 1], ['MATERIAL', 'BLAY']],
[['MATERIAL', 'BLAY'], ['IG', 0]],
[['IG', 0], ['LIMIT', 15]]]
Please note that the format you describe is not valid Python: a value is either a list or a dictionary. A list holds comma-separated items only, while key: value pairs belong in a dictionary.
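If you want something closer to the format in the question, the nearest valid structure is probably a list of dictionaries. Here is a rough sketch that reuses the adjacent-column pairing from above; `pairs` is just a name I picked, and the data uses the LIMIT values from my reproduction (5 and 15):

```python
import pandas as pd

# Same sample data as reproduced in this answer.
df = pd.DataFrame({
    "STATE": ["TX", "TX"],
    "CITY": ["DALLAS", "DALLAS"],
    "TAX_C": [1, 1],
    "MATERIAL": ["CARP", "BLAY"],
    "IG": [0, 0],
    "LIMIT": [5, 15],
})

pairs = []
for row in df.to_dict(orient="index").values():
    items = list(row.items())
    # Pair each (column, value) item with its right-hand neighbour
    # and store the pair as a small dictionary keyed by column name.
    for left, right in zip(items, items[1:]):
        pairs.append(dict([left, right]))

print(pairs[0])
```

Each element of `pairs` is a two-key dictionary such as {'STATE': 'TX', 'CITY': 'DALLAS'}, so the column name travels with its value.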
I hope this helps :)
CodePudding user response:
Try the following code:
import itertools

res = []
for r in range(2, df.columns.size + 1):
    for cols in itertools.combinations(df.columns, r):
        res += df[list(cols)].T.to_dict().values()
res
The difference is that I dropped the argument from to_dict, so it uses the default orientation ('dict'), which keeps the column names as keys.
The initial part of the result is:
[{'STATE': 'TX', 'CITY': 'DALLAS'},
{'STATE': 'TX', 'CITY': 'DALLAS'},
{'STATE': 'TX', 'TAX_C': 1},
{'STATE': 'TX', 'TAX_C': 1},
{'STATE': 'TX', 'MATERIAL': 'CARP'},
{'STATE': 'TX', 'MATERIAL': 'BLAY'},
{'STATE': 'TX', 'IG': 0},
{'STATE': 'TX', 'IG': 0},
{'STATE': 'TX', 'LIMIT': 5},
{'STATE': 'TX', 'LIMIT': 10},
so it is a list of dictionaries, quite similar to your desired result.
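For completeness, here is a self-contained sketch you can paste and run. It rebuilds the sample DataFrame from the question and, as a variation that is not part of the code above, uses to_dict(orient="records") instead of transposing; as far as I can tell it gives the same list of dictionaries in the same order:

```python
import itertools
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "STATE": ["TX", "TX"],
    "CITY": ["DALLAS", "DALLAS"],
    "TAX_C": [1, 1],
    "MATERIAL": ["CARP", "BLAY"],
    "IG": [0, 0],
    "LIMIT": [5, 10],
})

res = []
for r in range(2, df.columns.size + 1):
    for cols in itertools.combinations(df.columns, r):
        # orient="records" returns one {column: value} dict per row,
        # so no transpose is needed for this column subset.
        res += df[list(cols)].to_dict(orient="records")

print(len(res))
print(res[:2])
```

With 6 columns there are 57 column subsets of size 2 or more, so the two rows produce 114 dictionaries in total.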