I'm using Python 3 and I want to merge a few CSV files by column. Is it possible to do this without pandas?
For example, if I have these two CSV files:
df1:
Name Surname PCName
Max Petrov wrs123
Ivan Ivanov wrs321
df2:
Surname Name PCName
Sidorov Vasily wrs223
Dmitriev Alex wrs331
With pandas I've got this solution:
import os
import pandas as pd # $ pip install pandas
import time

def cls():
    os.system('cls' if os.name=='nt' else 'clear')

cls()
today = time.strftime("%y%m%d")
fldpath = 'C:/tmp2/test/'
filepath = fldpath + today + "_merged.csv"
print(os.listdir(fldpath))
print("type beginning of file names")
fmask = input()
file_list = [fldpath + f for f in os.listdir(fldpath) if f.startswith(fmask)]
csv_list = []
for file in sorted(file_list):
    # Tag every row with the name of the file it came from
    csv_list.append(pd.read_csv(file).assign(File_Name = os.path.basename(file)))
csv_merged = pd.concat(csv_list, ignore_index=True)
csv_merged.to_csv(filepath, index=False)
CodePudding user response:
You could use Python's csv.DictReader() and csv.DictWriter() to do this as follows:
import csv
import os
import time

def cls():
    os.system('cls' if os.name=='nt' else 'clear')

cls()
today = time.strftime("%y%m%d")
fldpath = 'C:/tmp2/test/'
filepath = fldpath + today + "_merged.csv"
print(os.listdir(fldpath))
print("type beginning of file names")
fmask = input()
file_list = [fldpath + f for f in os.listdir(fldpath) if f.startswith(fmask)]

with open(filepath, 'w', newline='') as f_output:
    # DictWriter places each value under its column name, so input column order does not matter
    csv_output = csv.DictWriter(f_output, fieldnames=["Name", "Surname", "PCName"])
    csv_output.writeheader()
    for file in sorted(file_list):
        with open(file, newline='') as f_input:
            csv_input = csv.DictReader(f_input)
            csv_output.writerows(csv_input)
For your given example, this would produce an output of:
Name,Surname,PCName
Max,Petrov,wrs123
Ivan,Ivanov,wrs321
Vasily,Sidorov,wrs223
Alex,Dmitriev,wrs331
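This assumes each CSV file has the same field names (their order is not important, since rows are matched by column name). With the default settings, DictWriter raises a ValueError if a row contains a field name that is not listed in fieldnames, and writes an empty string for any listed field that a row is missing.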
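If the files might not all share the same header, or if you want to keep the File_Name column that your pandas version adds, one possible approach is to read the headers in a first pass and build the fieldnames list from them. The following is only a sketch under those assumptions: the function name merge_csv_files and its source_column parameter are made up for illustration, and it assumes every file has a header row and no row is longer than its header.

```python
import csv
import os


def merge_csv_files(paths, out_path, source_column="File_Name"):
    """Concatenate CSV files by column name (hypothetical helper)."""
    # First pass: collect the union of all header names in first-seen order.
    fieldnames = []
    for path in paths:
        with open(path, newline="") as f_input:
            header = csv.DictReader(f_input).fieldnames or []
            for name in header:
                if name not in fieldnames:
                    fieldnames.append(name)
    if source_column not in fieldnames:
        fieldnames.append(source_column)

    # Second pass: copy the rows. restval fills in columns a file lacks.
    with open(out_path, "w", newline="") as f_output:
        writer = csv.DictWriter(f_output, fieldnames=fieldnames, restval="")
        writer.writeheader()
        for path in paths:
            with open(path, newline="") as f_input:
                for row in csv.DictReader(f_input):
                    # Record which file the row came from.
                    row[source_column] = os.path.basename(path)
                    writer.writerow(row)
    return fieldnames
```

In your script you could call it as merge_csv_files(sorted(file_list), filepath). Reading each file twice keeps memory use low, at the cost of opening every file a second time.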