Writing data from one CSV file to another CSV file using Python-CodePudding

I have a CSV file with two rows of data. The first row are the names like Red, Green, Orange, Purple and they repeat themselves like that. The second row is the data. The format is how I have written below, but in a CSV file. I want to take this and put them in separate columns like I have shown in table 2 but again in a CSV file. How do I combine the similar names and keep the data for it all? I understand I could write them out like this

lista1=["Red", "Green", "Orange", "Purple"] 
lista2=[3,56,23,12,34,65,98,7,9,45,33,15]

and call on them although I have 100's of files like this and I can't change the numbers and titles each time

Table 1:

Red	Green	Orange	Purple	Red	Green	Orange	Purple	Red	Green	Orange	Purple
3	56	23	12	34	65	98	7	9	45	33	15

Table 2 (output):

Red	Green	Orange	Purple
3	56	23	12
34	65	98	7
9	45	33	15

Again the table 1 data is from CSV file and I want the desired output in a CSV file as well.

CodePudding user response：

Since, you do not need pandas in your solution, here is one that only uses csv module.

I read the file using csv.reader() function. Converted the data into dictionary according to the sample input csv file you provided and then converted that dictionary into csv file.

Here is the sample csv input file :-

Red,Green,Orange,Purple,Red,Green,Orange,Purple,Red,Green,Orange,Purple
3,56,23,12,34,65,98,7,9,45,33,15

Now the code:-

import csv
with open('try.csv') as csvfile:
    mixedData = csv.reader(csvfile)
    column,data = mixedData
    
data_dict = {}
for i,name in enumerate(column) :
    if name in data_dict :
        data_dict[name].append(data[i])
    else :
        data_dict[name] = [data[i]]
        

with open("try_output.csv", "w",newline="") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(data_dict.keys())
    writer.writerows(zip(*data_dict.values()))

output file :-

Red,Green,Orange,Purple
3,56,23,12
34,65,98,7
9,45,33,15

CodePudding user response：

This is how I would do it. This assumes that the first row is always going to be words of some kind, and the bottom will always be numbers. As long as that's true, you don't need to know what the words are ahead of time.

First, read the data from the csv file. (I'm not reading it directly into a dataframe because column names need to be unique.)

>>> import pandas as pd
>>> import re

>>> infile = '/path/to/sample.csv'
>>> f = open(infile, 'r')
>>> text = f.read()
>>> print(text)

Red,Green,Orange,Purple,Red,Green,Orange,Purple,Red,Green,Orange,Purple
3,56,23,12,34,65,98,7,9,45,33,15

Then separate out your words and numbers, using regex:

>>> words = re.findall("[a-zA-Z] ", text)
>>> numbers = re.findall("[0-9] ", text)
>>> print(words)
>>> print(numbers)

Create your dataframe:

>>> df = pd.DataFrame({
...     "Words": words,
...     "Numbers": numbers
... })

>>> print(df)

     Words Numbers
0      Red       3
1    Green      56
2   Orange      23
3   Purple      12
4      Red      34
5    Green      65
6   Orange      98
7   Purple       7
8      Red       9
9    Green      45
10  Orange      33
11  Purple      15

Group the words together. (This seems like a convoluted way of doing it, but I couldn't figure out a simpler one.)

>>> words_no_repeats = list(set(words))
>>> new_df = pd.DataFrame()
>>> for w in words_no_repeats:
...     values = df[df['Words']==w]['Numbers'].to_list()
...     temp_df = pd.DataFrame({w: values}, index=range(len(values)))
...     new_df = pd.concat([new_df, temp_df], axis=1)    

print(new_df)

  Orange Green Red Purple
0     23    56   3     12
1     98    65  34      7
2     33    45   9     15

Then save your new dataframe as a csv:

>>> new_df.to_csv('/path/to/new_sample.csv', index=False)

This is what the csv file looks like:

Orange,Green,Red,Purple
23,56,3,12
98,65,34,7
33,45,9,15

I know you said in your comments that you were trying to avoid Pandas, but I don't know of any other way to do the grouping.