Parse and print Excel cell data based on count using Python

Time:05-21

I have to parse an Excel sheet based on column data. For example, if 'column 2' holds 4 comma-separated values, I want to print 4 lines, each containing one value from column 2.

I am getting output like this:

1,2,abc,def,ghi,jkl,11,10,10
3,4,abc,def,ghi,jkl,12,12,11

The expected output would be like this:

1,2,abc,11,10,10
1,2,def,11,10,10
1,2,ghi,11,10,10
1,2,jkl,11,10,10
3,4,abc,12,12,11
3,4,def,12,12,11
3,4,ghi,12,12,11
3,4,jkl,12,12,11

I can read the whole data, but I am not getting output like this. Please guide me on how to achieve this.

CodePudding user response:

Suppose you have this kind of list after reading the excel file:

file_content = [
      ['1,2', 'abc,def,ghi,jkl', 11, 10, 10], 
      ['3,4', 'abc,def,ghi,jkl', 12, 12, 11]
]

One way to achieve your goal is to iterate over each value of column 2, split by comma, and rebuild an entire row for each one. Something like:

output = list()

for row in file_content:
    for elt in row[1].split(','):
        # handle values of column 1, which are also comma separated
        tmp = [int(x) for x in row[0].split(',')]
        # append one value of column 2
        tmp.append(elt)
        # merge with the end of the row (extend, do not overwrite)
        tmp += row[2:]

        output.append(tmp)
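
To get the comma-separated lines shown in the question, each rebuilt row can then be joined into text and printed. A minimal sketch, assuming the same sample data as file_content above; the helper name expand_rows is hypothetical and not part of any library:

```python
# Sample data in the same shape as file_content above (assumed, not read from a file).
file_content = [
    ['1,2', 'abc,def,ghi,jkl', 11, 10, 10],
    ['3,4', 'abc,def,ghi,jkl', 12, 12, 11],
]


def expand_rows(rows):
    """Return one output row per comma-separated value found in column 2."""
    expanded = []
    for row in rows:
        # Column 1 is also comma separated, e.g. '1,2' -> ['1', '2'].
        prefix = row[0].split(',')
        for elt in row[1].split(','):
            expanded.append(prefix + [elt] + list(row[2:]))
    return expanded


# Join every field as text so that numbers and strings print uniformly.
lines = [','.join(str(field) for field in row) for row in expand_rows(file_content)]
print('\n'.join(lines))
```

Here the values of column 1 are kept as strings, since they are only printed; convert them with int() if they are needed as numbers.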

EDIT:

To read your Excel file, you can use the pandas library and adapt the code above as follows:

import pandas

df = pandas.read_excel('yourfile.xlsx', header=None)
output = list()

for idx, row in df.iterrows():
    for elt in row[1].split(','):
        tmp = [int(x) for x in row[0].split(',')]
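        # append the value itself, not the literal string 'elt'
        tmp.append(elt)
        # extend with the remaining columns instead of overwriting tmp
        tmp += list(row[2:])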
        output.append(tmp)
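
Depending on the sheet, pandas may hand back a cell as a number rather than a string (for example when column 1 holds a single value such as 5), and calling .split on it would then fail. A minimal sketch of a defensive helper, assuming that situation can occur in your file; split_cell is a hypothetical name, and the sample rows stand in for what df.iterrows() would yield:

```python
import math


def split_cell(value):
    """Split a cell on commas, tolerating numbers and empty cells."""
    # Empty Excel cells typically arrive as None or float NaN.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return []
    return [part.strip() for part in str(value).split(',')]


# Stand-in rows; with pandas these would come from df.iterrows().
rows = [
    ['1,2', 'abc, def', 11, 10, 10],
    [5, 'ghi', 12, 12, 11],
]

output = []
for row in rows:
    for elt in split_cell(row[1]):
        # Rebuild the row: column 1 values, one column 2 value, then the rest.
        output.append(split_cell(row[0]) + [elt] + list(row[2:]))
```

With this helper a row whose column 2 is empty produces no output lines at all; whether that is the desired behaviour depends on your data.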