How to use difflib independently from position?

I have this code:

import difflib

list1 = ["ameixa","bolo","guarana","caju","pizza","maracuja","forro", "coco"]
list2 = ["ameixa","guarana","caju","pizza","maracuja","forro","bolo"]

for line in difflib.unified_diff(list1, list2, fromfile='file1', tofile="file2", lineterm=""):
    print(line)

The problem is that it returns this:

--- file1
+++ file2
@@ -1,8 +1,7 @@
 ameixa
-bolo
 guarana
 caju
 pizza
 maracuja
 forro
-coco
+bolo

As you can see, "bolo" is in both lists, but because it sits at a different position in each, the diff reports it as removed from one and added to the other. How can I compare the two lists without taking position into account?

CodePudding user response：

I suggest converting the two lists to sets and then back to lists. Keep in mind that this also removes any duplicate elements:

list1 = list(set(list1))
list2 = list(set(list2))

Output 1

>>> print(list1)
['caju', 'forro', 'coco', 'ameixa', 'pizza', 'maracuja', 'bolo', 'guarana']
>>> print(list2)
['caju', 'forro', 'ameixa', 'pizza', 'maracuja', 'bolo', 'guarana']

And when we apply the script we get:

for line in difflib.unified_diff(list1, list2, fromfile='file1', tofile="file2", lineterm=""):
    print(line)

Output 2

--- file1
+++ file2
@@ -1,6 +1,5 @@
 caju
 forro
-coco
 ameixa
 pizza
 maracuja
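
One caveat with this approach: a set has no defined order, and for strings the iteration order can change from one interpreter run to the next, so the two lists are not guaranteed to line up the way they do in the output above. Here is a minimal sketch, assuming duplicates can be discarded, that sorts the deduplicated elements so the result is reproducible. The names unique1, unique2, and diff_lines are only illustrative:

```python
import difflib

list1 = ["ameixa", "bolo", "guarana", "caju", "pizza", "maracuja", "forro", "coco"]
list2 = ["ameixa", "guarana", "caju", "pizza", "maracuja", "forro", "bolo"]

# set() drops duplicates; sorted() then puts both sides in the same,
# deterministic order so matching elements line up in the diff.
unique1 = sorted(set(list1))
unique2 = sorted(set(list2))

diff_lines = list(
    difflib.unified_diff(unique1, unique2, fromfile="file1", tofile="file2", lineterm="")
)
for line in diff_lines:
    print(line)
```

With the lists from the question, the only changed line in the diff is the removal of "coco".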

CodePudding user response：

Sort the lists before running difflib.unified_diff.

for line in difflib.unified_diff(sorted(list1), sorted(list2), fromfile='file1', tofile="file2", lineterm=""):
    print(line)

Output:

--- file1
+++ file2
@@ -1,7 +1,6 @@
 ameixa
 bolo
 caju
-coco
 forro
 guarana
 maracuja
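
If only the added and removed elements are of interest, the sorted diff can be trimmed further. The sketch below is one possible way to do it, not the only one: it passes n=0 to difflib.unified_diff to suppress context lines and then drops the headers. The helper name order_insensitive_changes is hypothetical. Unlike the set-based approach, sorting keeps duplicate elements.

```python
import difflib


def order_insensitive_changes(a, b):
    """Return only the '+'/'-' lines of a diff between the sorted inputs."""
    # n=0 asks unified_diff for zero lines of context around each change
    lines = list(difflib.unified_diff(sorted(a), sorted(b), lineterm="", n=0))
    # The first two lines (if any) are the '---' and '+++' file headers;
    # the remaining non-change lines are '@@' hunk headers.
    return [line for line in lines[2:] if line.startswith(("+", "-"))]


list1 = ["ameixa", "bolo", "guarana", "caju", "pizza", "maracuja", "forro", "coco"]
list2 = ["ameixa", "guarana", "caju", "pizza", "maracuja", "forro", "bolo"]

changes = order_insensitive_changes(list1, list2)
print(changes)  # ['-coco']
```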

EDIT:

Here is a solution that does not use difflib.unified_diff and may be simpler. It is not entirely clear from the question what output you want, though:

# Find elements in list1 that are missing in list2
one_not_two = set(list1).difference(list2)
# Find elements in list2 that are missing in list1
two_not_one = set(list2).difference(list1)
# Find elements in common
one_and_two = set(list1).intersection(list2)
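
The set operations above ignore how many times an element occurs. If duplicates matter, collections.Counter can do the same comparison while keeping counts. This is a sketch under that assumption, since the question does not say whether duplicates are expected; the variable names are illustrative:

```python
from collections import Counter

list1 = ["ameixa", "bolo", "guarana", "caju", "pizza", "maracuja", "forro", "coco"]
list2 = ["ameixa", "guarana", "caju", "pizza", "maracuja", "forro", "bolo"]

count1 = Counter(list1)
count2 = Counter(list2)

# Counter subtraction keeps only elements with a positive remaining count
only_in_1 = count1 - count2
only_in_2 = count2 - count1
# '&' takes the minimum count of each element present in both
in_both = count1 & count2

print(sorted(only_in_1.elements()))  # ['coco']
print(sorted(only_in_2.elements()))  # []
print(sorted(in_both.elements()))
```

For example, if "bolo" appeared twice in list1 and once in list2, only_in_1 would also report one extra "bolo", which a plain set difference would miss.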