Rearranging a CSV file in Python

Time:10-26

I have a small project that is giving me a bit of a headache. I'm not really a newbie in Python; I have been programming in it on weekends for a few years. There seem to be many ways to convert data in a CSV file, though, and I'm not sure which one to choose or even where exactly to start, although I can do a lot of the research myself.

Now I have this dataset:

This is what I've got:

This is what I want:

You probably know which module is best for this problem; currently I have my eyes on pandas and openpyxl.

Thx in advance!

CodePudding user response:

It doesn't have to be complicated. Given this program:

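headings = []
with open('x.csv') as f:
    for line in f:
        parts = line.rstrip().split(',')
        if not headings:
            # The first line holds the column names.
            headings = parts
            continue
        # Pair each cell with its column name and keep only numeric cells,
        # which skips the row label, '*' and empty cells.
        for name, item in zip(headings, parts):
            if item.isdigit():
                print(','.join((parts[0], name, item)))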

This input:

x,joe,pete,david,pascal,jonathan,george
joe,*,6,5,4,3,2
pete,1,*,4,5,2,7
david,2,3,*,6,3,2
pascal,3,,2,*,,1
jonathan,,1,,,*,,
george,1,,,,,

Produces this output:

joe,pete,6
joe,david,5
joe,pascal,4
joe,jonathan,3
joe,george,2
pete,joe,1
pete,david,4
pete,pascal,5
pete,jonathan,2
pete,george,7
david,joe,2
david,pete,3
david,pascal,6
david,jonathan,3
david,george,2
pascal,joe,3
pascal,david,2
pascal,george,1
jonathan,pete,1
george,joe,1
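
If the file may contain quoted fields, or you want the result collected rather than printed, the same loop can be sketched with the standard csv module. This is only one possible variant: the function name matrix_to_pairs and the three-person sample are made up for illustration, and the in-memory io.StringIO objects stand in for real files.

```python
import csv
import io


def matrix_to_pairs(lines):
    """Yield (row_name, column_name, value) for every numeric cell."""
    reader = csv.reader(lines)
    headings = next(reader, [])      # first row: column names
    for parts in reader:
        if not parts:                # skip blank lines
            continue
        # Skip the first column (the row label) on both sides.
        for name, item in zip(headings[1:], parts[1:]):
            if item.isdigit():
                yield parts[0], name, item


# Hypothetical sample standing in for open('x.csv', newline='').
sample = io.StringIO("x,joe,pete,david\njoe,*,6,5\npete,1,*,4\ndavid,2,,*\n")
pairs = list(matrix_to_pairs(sample))

# Write the pairs back out as CSV (here into a string buffer).
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(pairs)
print(out.getvalue(), end="")
```

To work on real files, pass open('x.csv', newline='') to the function and give csv.writer a file opened for writing instead of the string buffer.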

CodePudding user response:

Depending on what you want to do afterwards, a pandas option may be useful. In that case, you could simply use melt. Here, I have renamed the first column to names1; if your first column already has a name, use that instead.

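import pandas as pd

path = 'D:/'
file = 'data.csv'

# Read the matrix; an empty first header cell is loaded as 'Unnamed: 0'.
data = pd.read_csv(path + file, delimiter=',')
data = data.rename(columns={'Unnamed: 0': 'names1'})

# Turn every other column into (names1, names2, value) rows.
df = pd.melt(data, id_vars=['names1'], var_name='names2')
# Drop the empty cells and the '*' entries on the diagonal.
df = df.dropna(subset=['value'])
df = df[df['value'] != '*']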
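
To see what melt produces without needing the original file, here is a small sketch on made-up in-memory data. The three-person sample and the names raw, long_df and csv_text are assumptions for illustration, not the asker's data. It also sorts by names1 so each person's rows sit together, and renders the result as CSV text.

```python
import io

import pandas as pd

# Hypothetical sample; the empty first header cell becomes 'Unnamed: 0'.
raw = """,joe,pete,david
joe,*,6,5
pete,1,*,4
david,2,,*
"""

data = pd.read_csv(io.StringIO(raw))
data = data.rename(columns={'Unnamed: 0': 'names1'})

# Wide matrix -> long (names1, names2, value) rows.
long_df = pd.melt(data, id_vars=['names1'], var_name='names2')
long_df = long_df.dropna(subset=['value'])      # drop empty cells
long_df = long_df[long_df['value'] != '*']      # drop the diagonal markers

# Group each person's rows together, then render as CSV without the index.
long_df = long_df.sort_values(['names1', 'names2'])
csv_text = long_df.to_csv(index=False)
print(csv_text, end="")
```

Passing a file name to to_csv instead of leaving it out would write the same rows to disk.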
