How to compare the date / date format column of two different CSVs, then write the differences to a 3rd CSV

I need to write a Python script that outputs the differences between two CSVs into a third CSV, based on a specific date format. The third CSV will hold the rows that differ between the two files.

# Column names
Id = "ID"
Date = "Date"

# Read both files into lists of lines
with open('example.csv', 'r') as t1, open('example2.csv', 'r') as t2:
    fileone = t1.readlines()
    filetwo = t2.readlines()

# Write a third file with the lines of the second file that are not in the first
with open('DIFF.csv', 'w') as outfile:
    for line in filetwo:
        if line not in fileone:
            #wr = csv.writer(outfile, dialect='csv')
            #wr.writerow([line.rstrip('\n')])
            outfile.write(line)

print("csv is ready")

CodePudding user response:

If I got this question right, you have 2 files with dates in a particular format, listed like this (I'll use my local format, but you can specify the format in the code):

example.csv
20/07/2022 15:01
20/07/2022 15:02
20/07/2022 15:03

And:

example2.csv
20/07/2022 14:02
20/07/2022 15:01
20/07/2022 15:08

You want to retrieve the symmetric difference of these files in terms of dates (the dates that are in one file but not in the other):

output
20/07/2022 15:03
20/07/2022 15:08
20/07/2022 14:02
20/07/2022 15:02
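
To make that result concrete, here is a minimal in-memory sketch of the same idea using the sample values above. The names `fmt`, `first`, `second` and `diff` are just for illustration, and I sort the result only so the order is predictable (sets themselves have no order, which is why the output above is not sorted).

```python
from datetime import datetime

fmt = "%d/%m/%Y %H:%M"  # assumed format, matching the sample data

first = {datetime.strptime(s, fmt) for s in
         ["20/07/2022 15:01", "20/07/2022 15:02", "20/07/2022 15:03"]}
second = {datetime.strptime(s, fmt) for s in
          ["20/07/2022 14:02", "20/07/2022 15:01", "20/07/2022 15:08"]}

# ^ keeps the values that appear in exactly one of the two sets
diff = sorted(first ^ second)

for d in diff:
    print(d.strftime(fmt))
```

Only 20/07/2022 15:01 is in both sets, so it is the one value that gets dropped.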

To do so, here's the code:

from datetime import datetime

# Write the format you want (no trailing newline; lines are stripped before parsing)
my_format = "%d/%m/%Y %H:%M"

# Convert one line of text to a datetime object
def str_to_datetime(line):
    return datetime.strptime(line.strip(), my_format)

with open('example.csv', 'r') as t1, open('example2.csv', 'r') as t2, open('DIFF.csv', 'w') as outfile:
    # Skip blank lines so an empty line at the end of a file does not raise ValueError
    first_set = {str_to_datetime(line) for line in t1 if line.strip()}
    second_set = {str_to_datetime(line) for line in t2 if line.strip()}
    # ^ is the symmetric difference: dates present in exactly one of the two sets
    for date in first_set ^ second_set:
        outfile.write(date.strftime(my_format) + "\n")
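
The question's snippet mentions `ID` and `Date` columns, so if your files really have several columns, a variant along these lines might be closer to what you need. It's only a sketch under assumptions I can't verify from the question: both files are comma-separated, have a header row `ID,Date`, and use the same date format as above. The function names `read_rows_by_date` and `write_date_diff` are made up for this example. It uses the standard `csv` module and compares rows on the parsed `Date` value only; if a date occurs more than once in a file, only the last row with that date is kept.

```python
import csv
from datetime import datetime

DATE_FORMAT = "%d/%m/%Y %H:%M"  # assumption: same format as in the example above


def read_rows_by_date(path, date_column="Date"):
    """Map each parsed date to its row (assumes a header row containing date_column)."""
    with open(path, newline="") as f:
        return {
            datetime.strptime(row[date_column].strip(), DATE_FORMAT): row
            for row in csv.DictReader(f)
        }


def write_date_diff(path_one, path_two, out_path, fieldnames=("ID", "Date")):
    """Write rows whose date appears in only one of the two files; return those dates."""
    rows_one = read_rows_by_date(path_one)
    rows_two = read_rows_by_date(path_two)
    # Symmetric difference of the dict keys: dates present in exactly one file
    only_in_one = sorted(rows_one.keys() ^ rows_two.keys())
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for date in only_in_one:
            # Take the row from whichever file contains this date
            writer.writerow(rows_one.get(date) or rows_two[date])
    return only_in_one
```

You would call it as `write_date_diff('example.csv', 'example2.csv', 'DIFF.csv')`. Sorting the dates is my own choice here, so that DIFF.csv comes out in chronological order instead of arbitrary set order.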