data1 = open("a.txt", 'r').readline().strip().split(',')
data2 = open("b.txt", 'r').readline().strip().split(',')
res = list((set(data1) - set(data2)).union(set(data2) - set(data1)))
How can I do the above with multiple txt files? They are comma-separated and arranged in lines. I want to find the words that match those in data1.
data1 == data2
data1 == data3
data1 == data4
...
From: Compare two files and find matching words in Python
with open("file1") as f1, open("file2") as f2:
    words = set(line.strip() for line in f1)  # create a set of words from the dictionary file
    # why sets? sets provide an O(1) lookup, so overall complexity is O(N)
    # now loop over each line of the other file (word, freq file)
    for line in f2:
        word, freq = line.split()  # fetch word, freq
        if word in words:  # if the word is found in the words set, print it
            print(word)
This outputs the words found in both files:
apple
The output in my case should be the words found in those files that are also found in data1. Any ideas are much appreciated!
CodePudding user response:
The code you already have does pretty much everything you want. I couldn't understand exactly what you wanted, but I guessed it is one of two things:
You want the words present in data1 & data2, data1 & data3, data1 & data4
For this, you can store the matching words in a list instead of printing them, and run the same loop for each of the other files.
You want the words present in data1 & data2 & data3 & data4
For this, you can store the words for each file in separate lists and take the intersection of those lists (a list of lists is convenient here).
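Both interpretations can be sketched roughly as follows. This is only a minimal sketch under a few assumptions: it uses sets rather than the lists mentioned above (so duplicates and order are lost), it reads every line of each file rather than just the first, and the helper names read_words, matches_per_file and matches_in_all are made up for illustration.

```python
def read_words(path):
    """Return the set of comma-separated words found on all lines of a file."""
    words = set()
    with open(path) as f:
        for line in f:
            # Split the line on commas; strip whitespace and skip empty fields.
            words.update(w.strip() for w in line.split(",") if w.strip())
    return words


def matches_per_file(base_path, other_paths):
    """Interpretation 1: words shared by the base file and each other file, kept separately."""
    base = read_words(base_path)
    # One entry per file: the intersection of its words with the base words.
    return {p: base & read_words(p) for p in other_paths}


def matches_in_all(base_path, other_paths):
    """Interpretation 2: words present in the base file and in every other file."""
    common = read_words(base_path)
    for p in other_paths:
        common &= read_words(p)  # keep only words also present in this file
    return common
```

For example, matches_per_file("a.txt", ["b.txt", "c.txt", "d.txt"]) would give one set of matches per file, while matches_in_all with the same arguments would give a single set (c.txt and d.txt are placeholder names here).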
Let me know if this works!