How to transform rows to two columns, where each new row begins with the first string of the original row

I have CSV data that looks like this:

non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry

This is just an example; real rows are much longer, up to around 200 strings each.

I want to transform all data to two columns, where each row of the transformed data begins with the first string of the original row. Like this:

non alc,cola
non alc,fanta
alc,vodka
alc,vine
alc,beer
juice,apple
juice,orange
juice,tomato
juice,cherry

What would be the way to do this with Python?

If this kind of transformation has a special name, I would be grateful to know it.

CodePudding user response：

You can do it with the pandas library:

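import pandas as pd

# Load each line as one string in column 0. read_fwf infers column breaks from
# whitespace, so this relies on the lines not sharing an aligned space.
df = pd.read_fwf('in.txt', header=None)
# Split off the first field only; n must be passed by keyword in pandas 2.x.
df = df[0].str.split(',', n=1, expand=True)
# Turn the remainder into a list, then emit one row per list element.
df[1] = df[1].str.split(',')
df = df.explode(1)

df.to_csv('out.txt', index=False, header=False)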

CodePudding user response：

I'm not sure whether this is the kind of solution you want, but here is my answer.

<beverage.csv>
non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry

Python code:

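import csv

# newline='' is what the csv module recommends for files passed to csv.reader
with open('./beverage.csv', newline='') as file:
    csv_reader = csv.reader(file)

    for row in csv_reader:
        # Pair the first field with each of the remaining fields
        for value in row[1:]:
            print(row[0] + ', ' + value)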

Result:

non alc, cola
non alc, fanta
alc, vodka
alc, vine
alc, beer
juice, apple
juice, orange
juice, tomato
juice, cherry
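
If you need an actual two-column CSV rather than printed text, the same loop can feed csv.writer instead of print. This is only a minimal sketch: explode_rows is a hypothetical helper name I made up, and the io.StringIO objects stand in for beverage.csv and the output file so the example runs without touching disk.

```python
import csv
import io


def explode_rows(rows):
    """Yield [first_field, other_field] pairs for every row with 2+ fields."""
    for row in rows:
        if not row:  # csv.reader returns [] for blank lines; skip them
            continue
        key, *values = row
        for value in values:
            yield [key, value]


# In-memory stand-in for beverage.csv (assumption: same sample data as above).
source = io.StringIO(
    "non alc,cola,fanta\n"
    "alc,vodka,vine,beer\n"
    "juice,apple,orange,tomato,cherry\n"
)
target = io.StringIO()

# csv.writer takes care of quoting if a value ever contains a comma.
writer = csv.writer(target, lineterminator="\n")
writer.writerows(explode_rows(csv.reader(source)))

print(target.getvalue(), end="")
```

With real files, you would pass file objects opened with newline='' in place of source and target. Because the generator handles one row at a time, rows with around 200 strings should not need any special treatment.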