I am working with a spreadsheet where I have to combine all possible combinations of comma-separated values into their own lines. Below is the main logic for the current implementation.
sheet_data = []
for sheetname, sheetdata in excel_content.items():
for irow, row in enumerate(sheetdata.values):
for a in str(row[5]).split(','):
for b in str(row[17]).split(','):
for c in str(row[21]).split(','):
for d in str(row[24]).split(','):
for e in str(row[27]).split(','):
for f in str(row[33]).split(','):
sheet_data.append(
[row[0], row[1], row[2], row[3], row[4], a, row[6], row[7], row[8], row[9], row[10],row[11], row[12], row[13], row[14], row[15], row[16], b, row[18], row[19], row[20],c, row[22], row[23], d, row[25], row[26], e, row[28], row[29], row[30], row[31],row[32], f, row[34], row[35], row[36], row[37], row[38], row[39], row[40], row[41],row[42], row[43], row[44], row[45], row[46], row[47], row[48], row[49], row[50],row[51], row[52], row[53], row[54], row[55], row[56], row[57], row[58]])
Is there any way I could re-write this to be more re-usable and dynamic? What if I need to add a new column to also combine? That'd mean that I would need to create a new for-loop.
CodePudding user response:
Simple example of traversing this way for understanding:
import itertools
a = [['1,2', '3', '4,5,6', '7,8']]
for val in a:
product_items = itertools.product(
val[0].split(','),
val[2].split(','),
val[3].split(','),
)
product = list(itertools.product(product_items))
for ss in product:
ss0 = ss[0]
print(ss0[0], val[1], ss0[1], ss0[2])
Returns:
1 3 4 7
1 3 4 8
1 3 5 7
1 3 5 8
1 3 6 7
1 3 6 8
2 3 4 7
2 3 4 8
2 3 5 7
2 3 5 8
2 3 6 7
2 3 6 8
So in your case you can do pretty much the same:
import itertools
sheet_data = []
for sheetname, sheetdata in excel_content.items():
for irow, row in enumerate(sheetdata.values):
product_items = itertools.product(
str(row[5]).split(','),
str(row[17]).split(','),
str(row[21]).split(','),
str(row[24]).split(','),
str(row[27]).split(','),
str(row[33]).split(','),
)
product = list(itertools.product(product_items))
for ss in product:
a, b, c, d, e, f = ss[0]
sheet_data.append([
row[0], row[1], row[2], row[3], row[4], a, row[6], row[7],
row[8], row[9], row[10], row[11], row[12], row[13], row[14],
row[15], row[16], b, row[18], row[19], row[20], c, row[22],
row[23], d, row[25], row[26], e, row[28], row[29], row[30],
row[31], row[32], f, row[34], row[35], row[36], row[37],
row[38], row[39], row[40], row[41], row[42], row[43],
row[44], row[45], row[46], row[47], row[48], row[49],
row[50], row[51], row[52], row[53], row[54], row[55],
row[56], row[57], row[58]]
)
And it also could be improved in terms of readability.
- Block
str(row[5]).split(','), str(row[17]).split(','), ...
could be replaced with[str(row[i]).split(',') for i in [5, 17, ...]]
- Block
row[0], row[1], row[2], row[3], row[4], a, row[6], ...
could be replaced with*[row[x] for x in range(0, 5)], a, [row[x] for x in range(6, ...)]
CodePudding user response:
you can setup a list of column numbers to expand and use product to obtain the combinations:
import itertools
comb = [5,17,21,24,27,33] # columns to expand / combine
cols = 59 # row width
for row in sheetdata.values:
cells = (cell.split(',') if c in comb else [cell] for cell in row[:cols])
for newRow in itertools.product(*cells):
sheet_data.append(list(newRow))