Concatenate text files into one string

Time:02-17

I have several text files that I want to concatenate into one big string. The text files are in the same folder as my Jupyter notebook file. I found this solution online, but I have trouble understanding it. It goes like this:

def read_files_into_string(filenames):
    strings = []
    for filename in filenames:
        with open(f'textfile_{filename}.txt') as f:
            strings.append(f.read())
    return '\n'.join(strings)

I understand that we define the function first, then we create an empty string file. After this we open our text files, say, textfile_1 to textfile_10, as f (a variable?), one after another with a loop.

Now to the part I don't get: Do we read the files (f.read) one after another and append them to the string file?

But the part that puzzles me the most is the return statement at the end: What's the use of the newline expression '\n'? My text files end with newlines. Are we gluing the separate "strings" together with this character?

CodePudding user response:

In this code, filenames is the list of filenames, and strings is the list of file contents (so the name is not the best choice).

When you iterate over filenames, you pick each filename from the filenames list and open a file named textfile_<filename>.txt using

with open(..) as f:

That construct is called a (synchronous) context manager. The point of using it is that the file is closed safely when the block ends, even if an exception occurs. So you iterate over the files and, each time, add the content of the file to the strings list.

The last thing is '<delimiter>'.join(iterable_str: typing.Iterable[str]).

That means: join an iterable of strings (e.g. a list of strings) with <delimiter> between them, so it returns a single string.
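To illustrate the point about safe closing, here is a minimal sketch. The temporary folder, the file content, and the simulated ValueError are assumptions made up for the demonstration; they are not part of the original code.

```python
import os
import tempfile

# Hypothetical scratch folder and file, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "textfile_1.txt")
    with open(path, "w") as f:
        f.write("hello\n")

    try:
        with open(path) as f:
            content = f.read()
            # Simulate something going wrong while the file is still open.
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass

    # The with block closed the file on the way out, despite the exception.
    closed_after_error = f.closed

print(closed_after_error)
```

Without the with statement, the file would stay open after the exception unless f.close() were called by hand, for example in a try/finally block.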

So

print('\n'.join(strings))

gets all strings from strings and concatenates them with '\n' between them, and then prints the resulting string.

This can be written out like this:

print(strings[0] + '\n' + strings[1] + '\n' + ... + '\n' + strings[len(strings) - 1])
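As a quick check of that equivalence, the following sketch compares join with the written-out concatenation. The three sample strings are made up for illustration.

```python
# Sample data, assumed for illustration only.
strings = ["first", "second", "third"]

# Join the list with a newline between neighbouring elements.
joined = "\n".join(strings)

# The same result written out by hand with the + operator.
manual = strings[0] + "\n" + strings[1] + "\n" + strings[2]

print(joined == manual)
print(repr(joined))
```

Note that the separator only goes between elements: nothing is added before the first string or after the last one.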

CodePudding user response:

I cannot comment on your post, so I am writing here. We define an empty list and then add the content of each file to it. The elements of a list are separated by ',', like

 list_test = ['dsag','asdg','sdaga']

So by using the join function, we join the contents of all the text files into one output, with each file's content separated from the next by a newline. You can change the separator if you want. For example, you can use the command below if you want the contents to stick together with nothing in between.

 ''.join(list_test)  # 'dsagasdgsdaga'

CodePudding user response:

I'll try to answer each of your questions:

  1. I understand that we define the function first, then we create an empty string file. After this we open our text files, say, textfile_1 to textfile_10, as f (a variable?), one after another with a loop.

Yes, f is a variable referencing the file. This is a good answer about how a context manager works.

  2. Now to the part I don't get: Do we read the files (f.read) one after another and append them to the string file?

Exactly, you're iterating over the files. For each file, you read its contents (f.read()) as a str and append that string to your strings list.

  3. But the part that puzzles me the most is the return statement at the end: What's the use of the newline expression '\n'? My text files end with newlines. Are we gluing the separate "strings" together with this character?

That line of code simply takes every string in your list and joins them using the "\n" string (see the Python docs on str.join(iterable)). "\n" is the newline character, which means "move to the next line". So yes, the separate strings are glued together with this character. Check out this question and its answers.
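Since the text files in the question already end with newlines, it may help to see what the join actually produces in that case. The following is a sketch under stated assumptions: the folder parameter, the temporary directory, and the contents "alpha" and "beta" are hypothetical additions so the example can run anywhere; the original function reads from the current folder.

```python
import os
import tempfile


def read_files_into_string(filenames, folder="."):
    """Read textfile_<name>.txt for each name and join the contents with newlines."""
    strings = []
    for filename in filenames:
        # folder is a hypothetical parameter added for this demonstration.
        path = os.path.join(folder, f"textfile_{filename}.txt")
        with open(path) as f:
            strings.append(f.read())
    return "\n".join(strings)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Two made-up files, each ending with a newline as in the question.
    for number, text in [(1, "alpha\n"), (2, "beta\n")]:
        with open(os.path.join(tmp_dir, f"textfile_{number}.txt"), "w") as f:
            f.write(text)
    combined = read_files_into_string([1, 2], folder=tmp_dir)

# Each file's own trailing newline plus the separator gives an empty line.
print(repr(combined))  # 'alpha\n\nbeta\n'

# One option if the empty lines are unwanted: strip trailing newlines first.
stripped = "\n".join(part.rstrip("\n") for part in ["alpha\n", "beta\n"])
print(repr(stripped))  # 'alpha\nbeta'
```

Whether the extra empty line between files is wanted depends on the use case; stripping before joining, or joining with '' instead of '\n', are two ways to avoid it.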
