Sorry, I have updated my question. I have two files, file1.txt and file2.txt, and their data is as follows:
file1.txt:
admin:admin
admin:meunsm
admin:12345
The format of each line in file1.txt is:
username:password
file2.txt:
192.168.0.114:1137 > 192.168.0.193:21 csanders:echo
The format of each line in file2.txt is:
source ip:source port > destination ip:destination port username:password
Now, I want Python to compare these files and extract only the usernames. If a username in file1.txt doesn't exist in file2.txt, that username must be stored in a new text file. I have updated my question with the .txt file data. There can be hundreds of thousands of rows in both files, and a for loop should be used because I want to save each username in my database table.
I picked this code sample from Stack Overflow. It compares both files at once and writes any lines common to both files into a new file:
sample:
    with open('file1.txt') as file1:
        with open('file2.txt') as file2:
            newfile = open('newfile.txt','w')
            common_lines = set(file1.readlines()) & set(file2.readlines())
            for line in common_lines:
                newfile.write(line)
            newfile.close()
But my scenario is quite different. I want to compare the two files at the same time, and any data in file1.txt that doesn't exist in file2.txt must be stored in newfile.txt.
CodePudding user response:
It is quite easy with a for loop.
    with open('file1.txt') as file1:
        with open('file2.txt') as file2:
            newfile = open('newfile.txt','w')
            # Read file2 only once: a second readlines() call on the same
            # file object returns an empty list, so it must not sit in the loop.
            lines2 = set(file2.readlines())
            different_lines = []
            for line1 in file1:
                if line1 not in lines2:
                    different_lines.append(line1)
            for line in different_lines:
                newfile.write(line)
            newfile.close()
You can also make it more concise with a Python list comprehension.
    with open('file1.txt') as file1:
        with open('file2.txt') as file2:
            newfile = open('newfile.txt','w')
            # Read file2 once, outside the comprehension
            lines2 = set(file2.readlines())
            different_lines = [l1 for l1 in file1 if l1 not in lines2]
            for line in different_lines:
                newfile.write(line)
            newfile.close()
CodePudding user response:
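You can use symmetric_difference. Note that it returns the lines that appear in exactly one of the two files; if you only want the lines that are in file1.txt but not in file2.txt, use difference instead.
Something like: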
    with open('file1.txt') as file1, \
            open('file2.txt') as file2, \
            open('newfile.txt', 'w') as newfile:
        diff = set(file1.readlines()).symmetric_difference(file2.readlines())
        for line in diff:
            newfile.write(f"{line.strip().split(':')[0]}\n")
Note: using set does not guarantee the order of lines.
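The snippet above compares whole lines, but the two files in the question use different formats, so a file1.txt line can never equal a file2.txt line. Here is a rough sketch that compares usernames instead. The helper names extract_username and missing_usernames are made up for illustration, and it assumes that the username:password pair is always the last whitespace-separated field on a line and that usernames contain no colon or whitespace.

```python
def extract_username(line):
    """Return the username from a 'user:pass' line or a file2.txt-style line.

    Assumption: the credentials are the last whitespace-separated field.
    Returns None for blank lines or lines without a 'user:pass' field.
    """
    fields = line.split()
    if not fields:
        return None
    username, sep, _password = fields[-1].partition(":")
    return username if sep else None


def missing_usernames(lines1, lines2):
    """Usernames that occur in lines1 but not in lines2, in first-seen order."""
    known = {extract_username(line) for line in lines2}
    seen = set()
    result = []
    for line in lines1:
        user = extract_username(line)
        # Skip unparsable lines, usernames present in lines2, and repeats.
        if user is None or user in known or user in seen:
            continue
        seen.add(user)
        result.append(user)
    return result


# Sample data taken from the question.
file1_lines = ["admin:admin\n", "admin:meunsm\n", "admin:12345\n"]
file2_lines = ["192.168.0.114:1137 > 192.168.0.193:21 csanders:echo\n"]
print(missing_usernames(file1_lines, file2_lines))  # ['admin']
```

Both arguments only need to be iterables of lines, so open file objects can be passed directly instead of lists. Only the usernames from the second file are held in memory as a set, and the first file is read line by line in a plain for loop, so each resulting username can be written to newfile.txt or handed to your database code as it comes.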
CodePudding user response:
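Here is some simple code that maintains the order and is memory efficient. It will also work on huge files because it uses iterators. Note that it walks through both files in step, so it expects matching lines to appear in the same order in both files.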
    with open("file1.txt") as fp1, open("file2.txt") as fp2, open("newfile.txt", "w") as fp3:
        # at first get a line from both files (None means the file is exhausted)
        l1 = next(fp1, None)
        l2 = next(fp2, None)
        while l1 is not None:
            # if both lines are equal, get another line from each file
            if l1 == l2:
                l1 = next(fp1, None)
                l2 = next(fp2, None)
            # if the lines are not equal (or file2 has run out),
            # put l1 in the new file and advance file1 only
            else:
                fp3.write(l1)
                l1 = next(fp1, None)