I have CSV data that looks like this:
non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry
This is just an example. The real rows are much longer, with up to around 200 strings per row.
I want to transform the data into two columns, where each output row starts with the first string of the original row, like this:
non alc,cola
non alc,fanta
alc,vodka
alc,vine
alc,beer
juice,apple
juice,orange
juice,tomato
juice,cherry
How can I do this with Python?
If this kind of transformation has a special name, I would be grateful to know it.
CodePudding user response:
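You can do it with the pandas library:
import pandas as pd
# Read every line as a single string, so rows of different lengths are fine
with open('in.txt') as f:
    lines = pd.Series(f.read().splitlines())
# Split off the first field, then turn the remainder into a list
df = lines.str.split(',', n=1, expand=True)
df[1] = df[1].str.split(',')
# explode() emits one row per list element, repeating the first field
df = df.explode(1)
df.to_csv('out.txt', index=False, header=False)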
CodePudding user response:
I'm not sure whether this is the kind of solution you are looking for, but here is one that uses only the standard csv module.
<beverage.csv>
non alc,cola,fanta
alc,vodka,vine,beer
juice,apple,orange,tomato,cherry
Python code:
import csv

# Print one "first field, item" pair for every item after the first field
with open('./beverage.csv', newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        for i in range(1, len(row)):
            print(row[0] + ', ' + row[i])
Result:
non alc, cola
non alc, fanta
alc, vodka
alc, vine
alc, beer
juice, apple
juice, orange
juice, tomato
juice, cherry
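If the pairs should go to a file in exactly the two-column format from the question (no space after the comma), the same loop can feed csv.writer instead of print. What follows is only a minimal sketch, not a definitive implementation: the names wide_to_long_rows and convert are made up for illustration, and it assumes the first field of every row is the key and that blank lines can be skipped. As for the name, this kind of reshaping is commonly described as going from "wide" to "long" format; in pandas the closest operations are explode and melt.

```python
import csv
import io


def wide_to_long_rows(rows):
    """Yield [key, value] pairs: the first field of each row with every later field."""
    for row in rows:
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        key, *values = row
        for value in values:
            yield [key, value]


def convert(src, dst):
    """Read ragged CSV rows from the open file src, write two-column rows to dst."""
    writer = csv.writer(dst, lineterminator='\n')
    writer.writerows(wide_to_long_rows(csv.reader(src)))


if __name__ == '__main__':
    # In-memory demo using part of the sample data from the question
    sample = io.StringIO('non alc,cola,fanta\nalc,vodka,vine,beer\n')
    out = io.StringIO()
    convert(sample, out)
    print(out.getvalue(), end='')
```

For real files, open both the input and the output with newline='' (as the csv module documentation recommends) and pass the two file objects to convert. Because rows are processed one at a time, rows with around 200 fields need no special handling.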