How to transform rows to two columns, where each new row begins with the first string of the origina

Time:05-25

I have CSV data that looks like this:

non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry

This is just an example. The real rows are much longer, up to around 200 strings each.

I want to transform all data to two columns, where each row of the transformed data begins with the first string of the original row. Like this:

non alc,cola
non alc,fanta
alc,vodka
alc,vine
alc,beer
juice,apple
juice,orange
juice,tomato
juice,cherry

What would be the way to do this with Python?

If this kind of transformation has a special name, I would be grateful to know it.

CodePudding user response:

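You can do it with the pandas library:

import pandas as pd

# Read every non-empty line as one plain string
with open('in.txt') as f:
    lines = [line.strip() for line in f if line.strip()]

# Split once: column 0 is the first string, column 1 is the rest of the line
df = pd.Series(lines).str.split(',', n=1, expand=True)
# Turn the rest of the line into a list, then give each element its own row
df[1] = df[1].str.split(',')
df = df.explode(1)

df.to_csv('out.txt', index=False, header=False)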
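
To see what the split and explode steps produce without touching any files, here is a minimal sketch that runs on the sample rows from the question held in memory. The names lines, long_df and pairs are only illustrative, and it assumes a pandas version where explode accepts ignore_index (1.1 or later).

```python
import pandas as pd

# Sample rows from the question, one string per original line
lines = [
    "non alc,cola,fanta",
    "alc,vodka,vine,beer",
    "juice,apple,orange,tomato,cherry",
]

# Split once: column 0 holds the first string, column 1 the rest of the line
df = pd.Series(lines).str.split(",", n=1, expand=True)

# Convert the remainder to a list, then expand each list element into its own row
df[1] = df[1].str.split(",")
long_df = df.explode(1, ignore_index=True)

# Plain (first, other) tuples, convenient for a quick look
pairs = list(long_df.itertuples(index=False, name=None))
print(pairs[:3])
```

Going from one wide row per group to one row per value like this is usually described as reshaping from wide to long format (also called unpivoting).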

CodePudding user response:

I'm not sure whether you want to solve it this way, but here is an approach that uses only the standard csv module.

<beverage.csv>
non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry

Python code:

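import csv

# The first field of each row is the group name, the rest are its items
with open('./beverage.csv', newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        # Pair the first field with every remaining field in the row
        for item in row[1:]:
            print(row[0] + ', ' + item)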

Result:

non alc, cola
non alc, fanta
alc, vodka
alc, vine
alc, beer
juice, apple
juice, orange
juice, tomato
juice, cherry
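
The printed result has a space after each comma, while the output asked for in the question has none. If the goal is a real CSV file, a sketch along the following lines may be closer. The helper name rows_to_pairs is hypothetical, and io.StringIO objects stand in for the input and output files so the example runs on its own.

```python
import csv
import io


def rows_to_pairs(rows):
    """Yield (first, other) for every field after the first one in each row."""
    for row in rows:
        if not row:  # csv.reader returns an empty list for blank lines
            continue
        first, *rest = row
        for item in rest:
            yield first, item


# In-memory stand-in for beverage.csv (assumption: real code would open the file)
source = io.StringIO("non alc,cola,fanta\nalc,vodka,vine,beer\n")
out = io.StringIO()

# csv.writer handles the delimiter and any quoting that a field might need
writer = csv.writer(out, lineterminator="\n")
writer.writerows(rows_to_pairs(csv.reader(source)))

result = out.getvalue()
print(result)
```

With real files, source and out would be replaced by open('beverage.csv', newline='') and open('out.csv', 'w', newline=''). Because rows are processed one at a time, rows of around 200 strings are handled the same way as short ones.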