Sorting out data from two rows in a CSV file with Python

I have two rows of data, in rows 4 and 5 of the file. Row 4 has the titles for the data and row 5 holds the actual data. I want to rearrange them into a tidier format. I am completely new to Python, so I don't even know where to start. It's a CSV file and I want the output to be a CSV file as well. This is what the data looks like:

A	B	C	D	A	B	C	D	A	B	C	D
0	1	2	3	4	5	6	7	8	9	10	11

I would like the data to look something like this if possible:

A	B	C	D
0	1	2	3
4	5	6	7
8	9	10	11

So I want to sort it out by the titles, but since the row is not a header row I don't know what to do. Again, the titles "A", "B", "C", "D" are in row 4 and the data 0, 1, 2, 3, ... are in row 5. Any help would be appreciated.

CodePudding user response：

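You can use pandas to read the two rows and then regroup the values with pandas.DataFrame. Sorting alone will not work here: the titles repeat, so sort_values cannot pick a unique column "A". Instead, cut the data row into chunks as long as one cycle of titles. Here is some sample code, assuming the titles repeat every four columns:

import pandas as pd

# Skip the first three rows and read only rows 4 (titles) and 5 (data)
raw = pd.read_csv('file.csv', header=None, skiprows=3, nrows=2)
titles = raw.iloc[0].tolist()
values = raw.iloc[1].tolist()

# Assumed cycle length: the titles A, B, C, D repeat every 4 columns
n = 4
rows = [values[i:i + n] for i in range(0, len(values), n)]
df = pd.DataFrame(rows, columns=titles[:n])
df.to_csv('output.csv', index=False)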
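
If pandas is not available, the same regrouping can be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: the helper name regroup is hypothetical, the three filler rows in the sample are made up for illustration, and it assumes the titles repeat in a fixed cycle so that the cycle length can be taken from the second occurrence of the first title.

```python
import csv
import io


def regroup(titles, values):
    """Split one long row into records, assuming the titles repeat in a fixed cycle."""
    first = titles[0]
    # Cycle length = position of the second occurrence of the first title
    # (or the whole row if the first title never repeats).
    width = titles.index(first, 1) if first in titles[1:] else len(titles)
    header = titles[:width]
    rows = [values[i:i + width] for i in range(0, len(values), width)]
    return header, rows


# Sample input mirroring the question: three filler rows (assumed content),
# then the titles in row 4 and the data in row 5.
sample = "x\ny\nz\nA,B,C,D,A,B,C,D,A,B,C,D\n0,1,2,3,4,5,6,7,8,9,10,11\n"
all_rows = list(csv.reader(io.StringIO(sample)))
header, rows = regroup(all_rows[3], all_rows[4])  # rows 4 and 5 are indexes 3 and 4

# Write the regrouped table as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
result = out.getvalue()
print(result)
```

To work on real files, replace io.StringIO(sample) with open('file.csv', newline='') and the output buffer with open('output.csv', 'w', newline='').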

CodePudding user response：

You can use a dictionary to store the original data, using the first row as the dictionary keys. Then you can use pandas to create your final CSV file. Something like this:

from collections import defaultdict
import pandas

# read the two rows 
with open('data.txt') as ifile:
    headers = [name.strip() for name in ifile.readline().split(",")]
    values = [int(value.strip()) for value in ifile.readline().split(",")]

# use a dictionary to store the data, using the
# names in the first row as dictionary keys
dd = defaultdict(list)
for name, val in zip(headers, values):
    dd[name].append(val)

# use pandas package to create the csv 
data_frame = pandas.DataFrame.from_dict(dd)
data_frame.to_csv("final.csv", index=False)

I am assuming that your data.txt file contains only these two rows:

A,B,C,D,A,B,C,D,A,B,C,D
0,1,2,3,4,5,6,7,8,9,10,11
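
In the question the titles and data sit in rows 4 and 5 rather than at the top of the file, so the dictionary approach needs to skip the earlier rows first. The following is a rough sketch of one way to do that with itertools.islice. The function name columns_from_rows, its title_row parameter, and the filler rows in the sample are assumptions made for illustration, and the values are assumed to be integers.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def columns_from_rows(lines, title_row=4):
    """Group values under each title; title_row is 1-based, data is on the next row."""
    reader = csv.reader(lines)
    # Skip everything before the title row, then take exactly two rows.
    titles, values = islice(reader, title_row - 1, title_row + 1)
    grouped = defaultdict(list)
    for name, value in zip(titles, values):
        grouped[name.strip()].append(int(value))
    return dict(grouped)


# Sample input: three filler rows (assumed content), then titles and data.
sample = io.StringIO("x\ny\nz\nA,B,C,D,A,B,C,D,A,B,C,D\n0,1,2,3,4,5,6,7,8,9,10,11\n")
columns = columns_from_rows(sample)
print(columns)
```

The resulting dictionary can be passed to pandas.DataFrame and written with to_csv("final.csv", index=False) as above. For a real file, pass open('file.csv', newline='') instead of the in-memory sample.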