I have N files that is delineated by a new-line break. Each file has it's rows/lines sorted in lexicographic order (the rows themselves don't have to be sorted). For example:
Include any error messages \n
Include details about your goal \n
Describe expected and actual results \n
How do I merge all multiple files so that the output file is sorted without loading all files in memory?
While this is not an algorithm problem per-se, it does remind me of the leetcode problem of Merging K Sorted Linked Lists. In this case, a node would be one line in a file.
CodePudding user response:
Try heapq.merge
:
If you have two files:
file1.txt
:
aaa
aab
bbb
ooo
file2.txt
:
ccc
ddd
zzz
Then:
from heapq import merge
files = ["file1.txt", "file2.txt"]
for m in merge(*map(open, files)):
print(m.strip())
Prints:
aaa
aab
bbb
ccc
ddd
ooo
zzz