Loop through files in a folder and create a new merged text file

Time:10-04

I am working on merging a number of text files together into a single text document. I am able to read all the file names and create a new output document.

However, when I open the output document, it only contains the data from one file and not the rest. Overall it should be close to 1 million lines of text, but I am only getting the first 10k.

import os

projpath1 = 'PATH1'
projpath2 = 'PATH2'

for root, dirs, files in os.walk(f"{projpath1}", topdown=False):
    for name in files:
        if not name.startswith('.DS_Store'):
            split = name.split("/")
            title = split[0]
            filename = (os.path.join(root, name))
            inputf = os.path.expanduser(f'{projpath1}/{title}')
            updatedf = os.path.expanduser(f'{projpath2}/ENC_merged.txt')

            with open(inputf, "r") as text_file, open(updatedf, 'w') as outfile:
                for info in text_file:
                        for lines in info:
                            outfile.write(lines)

I really am stuck and can't figure it out :/

CodePudding user response:

You need to open the output file once, before the loop, and write every input file into it. Opening it with 'w' inside the loop truncates it on every pass, so only the last file read survives. Something like this should work for you.

import os

projpath1 = 'PATH1'
projpath2 = 'PATH2'

# Build the output path first, then open it once, outside the loop.
updatedf = os.path.expanduser(f'{projpath2}/ENC_merged.txt')
with open(updatedf, 'w') as outfile:
    for root, dirs, files in os.walk(os.path.expanduser(projpath1), topdown=False):
        for name in files:
            if not name.startswith('.DS_Store'):
                # Join with root so files in subfolders are found too.
                filename = os.path.join(root, name)
                with open(filename, "r") as text_file:
                    # Iterating a file yields whole lines; write each one as-is.
                    for line in text_file:
                        outfile.write(line)
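
If you want this as a reusable function, here is a minimal sketch of the same idea. The name merge_text_files and its skip_names parameter are made up for illustration, and it assumes the inputs are UTF-8 text. It opens the output once, walks the folder in sorted order, skips the output file itself in case it lives inside the input folder, and streams each file with shutil.copyfileobj instead of looping over lines.

```python
import os
import shutil


def merge_text_files(src_dir, out_path, skip_names=(".DS_Store",)):
    """Concatenate every file under src_dir into out_path; return the file count."""
    out_abs = os.path.abspath(out_path)
    merged = 0
    # Open the output once, before the loop, so "w" truncates it only one time.
    with open(out_abs, "w", encoding="utf-8") as outfile:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # visit subfolders in a predictable order
            for name in sorted(files):
                path = os.path.abspath(os.path.join(root, name))
                # Skip Finder metadata and the output file itself.
                if name in skip_names or path == out_abs:
                    continue
                with open(path, "r", encoding="utf-8") as infile:
                    shutil.copyfileobj(infile, outfile)  # stream in chunks
                merged += 1
    return merged
```

You would call it as merge_text_files('PATH1', 'PATH2/ENC_merged.txt'). Note that it does not add a newline between files, so an input that lacks a trailing newline will run straight into the next one.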

CodePudding user response:

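What about doing it with bash? Run this from inside the folder of input files (it assumes file names without spaces or quotes and does not descend into subfolders):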

ls | xargs cat > merged_file