I have to parse an Excel sheet based on column data. For example, I have 4 comma-separated values in 'column 2', and I want to print 4 lines, each containing one value from column 2.
I am getting output like this:
1,2,abc,def,ghi,jkl,11,10,10
3,4,abc,def,ghi,jkl,12,12,11
The expected output would be like this:
1,2,abc,11,10,10
1,2,def,11,10,10
1,2,ghi,11,10,10
1,2,jkl,11,10,10
3,4,abc,12,12,11
3,4,def,12,12,11
3,4,ghi,12,12,11
3,4,jkl,12,12,11
I can read the whole data, but I am not getting output like this. Please guide me on how to achieve this.
CodePudding user response:
Suppose you have this kind of list after reading the Excel file:
file_content = [
['1,2', 'abc,def,ghi,jkl', 11, 10, 10],
['3,4', 'abc,def,ghi,jkl', 12, 12, 11]
]
One way to achieve your goal is to iterate over each value of column 2, split on commas, and rebuild an entire row for each one. Something like:
output = list()
for row in file_content:
    for elt in row[1].split(','):
        # handle values of column 1, which are also comma-separated
        tmp = [int(x) for x in row[0].split(',')]
        # append one value of column 2
        tmp.append(elt)
        # merge with the end of the row
        tmp += row[2:]
        output.append(tmp)
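To check the result against the expected output in the question, the loop can be wrapped in a function and each rebuilt row joined back into a comma-separated line. This is only a minimal sketch: `expand_rows` is a hypothetical name, and it assumes every row has the layout shown above (columns 1 and 2 as comma-separated strings, the remaining columns as plain values).

```python
def expand_rows(rows):
    """Return one row per comma-separated value found in column 2."""
    output = []
    for row in rows:
        # column 1 is also comma-separated; convert its parts to int
        head = [int(x) for x in row[0].split(',')]
        for elt in row[1].split(','):
            # head + one value of column 2 + the rest of the row
            output.append(head + [elt] + list(row[2:]))
    return output


file_content = [
    ['1,2', 'abc,def,ghi,jkl', 11, 10, 10],
    ['3,4', 'abc,def,ghi,jkl', 12, 12, 11],
]

# join every rebuilt row into one comma-separated line
lines = [','.join(str(v) for v in row) for row in expand_rows(file_content)]
print('\n'.join(lines))
```

With the sample data this prints 8 lines, starting with 1,2,abc,11,10,10 and ending with 3,4,jkl,12,12,11.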
EDIT:
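To read your Excel file, you can use the pandas library and adapt the code above as follows:
import pandas

# header=None: the sheet has no header row, so columns are labelled 0, 1, 2, ...
df = pandas.read_excel('yourfile.xlsx', header=None)
output = list()
for idx, row in df.iterrows():
    for elt in row[1].split(','):
        # values of column 1 are also comma-separated
        tmp = [int(x) for x in row[0].split(',')]
        # append one value of column 2
        tmp.append(elt)
        # merge with the end of the row
        tmp += list(row[2:])
        output.append(tmp)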
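One caveat with the pandas version: `read_excel` only gives you a string for cells that hold text. A cell containing a single number comes back as a number, and an empty cell as NaN, so calling `.split(',')` on it directly would raise an error. A small helper can guard against that. Here `split_cell` is a hypothetical name, and treating empty cells as "no values" is an assumption you may want to change.

```python
import math


def split_cell(value):
    """Split a cell into a list of stripped strings (hypothetical helper)."""
    # empty cells: None, or NaN as produced by pandas
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return []
    # str() covers numeric cells such as 7
    return [part.strip() for part in str(value).split(',')]


print(split_cell('abc, def,ghi'))   # ['abc', 'def', 'ghi']
print(split_cell(7))                # ['7']
print(split_cell(float('nan')))     # []
```

In the loop above, `row[1].split(',')` would then become `split_cell(row[1])`.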