sorting out data from two rows in csv file python

Time:01-23

I have two rows of data, in rows 4 and 5. Row 4 has the titles for the data and row 5 holds the actual data. I want to sort them out in any sort of format. I am completely new to Python, so I don't even know where to start. It's a CSV file and I want a CSV file as output as well. This is what the data looks like:

A B C D A B C D A B C D
0 1 2 3 4 5 6 7 8 9 10 11

I would like data to look something like this if possible:

A B C D
0 1 2 3
4 5 6 7
8 9 10 11

So I want to sort it out by the titles, but since the row is not a header row I don't know what to do. Again, the titles "A", "B", "C", "D" are in row 4 and the data 0, 1, 2, 3, ... are in row 5. Any help would be appreciated.

CodePudding user response:

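You can use pandas to read the two rows and regroup them. Sorting with sort_values does not help here: the titles repeat, so the long data row has to be split into one output row per group of titles. Here is some sample code, assuming the titles always repeat in the same order:

import pandas as pd

# skip the three lines above the titles, then read rows 4 and 5
raw = pd.read_csv('file.csv', header=None, skiprows=3, nrows=2)
titles, values = raw.iloc[0].tolist(), raw.iloc[1].tolist()
names = list(dict.fromkeys(titles))  # unique titles, in order: A, B, C, D
# cut the long data row into chunks of len(names) values
rows = [values[i:i + len(names)] for i in range(0, len(values), len(names))]
pd.DataFrame(rows, columns=names).to_csv('output.csv', index=False)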
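
The regrouping the question asks for can also be done with only the standard library. The following is a minimal sketch, not a definitive solution: the names regroup and convert and the title_row parameter are made up for illustration, and it assumes the titles repeat in the same order (A B C D, A B C D, ...) with the data row directly below the title row.

```python
import csv


def regroup(titles, values):
    """Split one long row into rows holding one value per unique title."""
    names = list(dict.fromkeys(titles))  # unique titles, original order kept
    width = len(names)
    rows = [values[i:i + width] for i in range(0, len(values), width)]
    return names, rows


def convert(src_path, dst_path, title_row=4):
    """Read the title row and the row below it, then write the regrouped CSV."""
    with open(src_path, newline="") as src:
        lines = list(csv.reader(src))  # blank lines come back as empty lists
    titles, values = lines[title_row - 1], lines[title_row]
    names, rows = regroup(titles, values)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(names)
        writer.writerows(rows)
```

Calling convert('file.csv', 'output.csv') would then write the header A,B,C,D followed by one line per group of four values. The values are kept as strings, which is enough when the only goal is to write them back out.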

CodePudding user response:

You can use a dictionary to store the original data, using the first row as the dictionary keys. Then you can use pandas to create your final CSV file. Something like this:

from collections import defaultdict
import pandas

# read the two rows 
with open('data.txt') as ifile:
    headers = [name.strip() for name in ifile.readline().split(",")]
    values = [int(value.strip()) for value in ifile.readline().split(",")]

# use a dictionary to store the data, using the
# names in the first row as dictionary keys
dd = defaultdict(list)
for name, val in zip(headers, values):
    dd[name].append(val)

# use pandas package to create the csv 
data_frame = pandas.DataFrame.from_dict(dd)
data_frame.to_csv("final.csv", index=False)

I am assuming that your data.txt file contains:

A,B,C,D,A,B,C,D,A,B,C,D
0,1,2,3,4,5,6,7,8,9,10,11
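
If the titles and data really sit in rows 4 and 5 of a larger file, as the question describes, the dictionary idea can be adapted along these lines. This is only a sketch under assumptions: the function names and the title_row parameter are hypothetical, and it writes the output with the csv module instead of pandas so that a title appearing fewer times than the others is padded with an empty cell (pandas.DataFrame.from_dict raises a ValueError when the lists have different lengths).

```python
import csv
from collections import defaultdict
from itertools import islice, zip_longest


def group_by_title(titles, values):
    """Collect each value under its title, keeping first-seen title order."""
    groups = defaultdict(list)
    for name, val in zip(titles, values):
        groups[name.strip()].append(val.strip())
    return groups


def to_rows(groups, fill=""):
    """Turn the per-title lists into output rows, padding short lists."""
    return [list(row) for row in zip_longest(*groups.values(), fillvalue=fill)]


def convert_grouped(src_path, dst_path, title_row=4):
    with open(src_path, newline="") as src:
        # islice skips the lines above the title row and yields exactly two rows
        titles, values = islice(csv.reader(src), title_row - 1, title_row + 1)
    groups = group_by_title(titles, values)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(groups.keys())
        writer.writerows(to_rows(groups))
```

Unlike slicing the data row into fixed-size chunks, grouping by name does not depend on the titles repeating in the same order, only on each value sitting under its own title.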