I would like to replace multiple strings in a text file and generate multiple text files from it with a for loop.
The following code is what I have so far, but each line of the original file is written twice (once with Column_A replaced and once with Column_B replaced) instead of being written once with both strings replaced. Any suggestions on how to improve my code?
for n in range(1,7):
    input = open("C:\Users\...\file.txt", "rt")
    output = open("file" + str(n) + "new.txt", "wt")
    for line in input:
        output.write(line.replace("Column_A", "Data_A" + str(n) + "_new"))
        output.write(line.replace("Column_B", "Data_B" + str(n) + "_new"))
    output.close()
CodePudding user response:
Each time you call output.write, it writes a line to the file. What you want is to call the second replace on the result of the first replace, not on the original line, which is what your current code does.
for line in input:
    output.write(line.replace("Column_A", "Data_A" + str(n) + "_new").replace("Column_B", "Data_B" + str(n) + "_new"))
line.replace(...) returns a new str with the substring replaced, and since replace is a method of str, you can chain the calls.
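To see the chaining in isolation, here is a minimal sketch that works on a string in memory instead of a file. The helper name rename_columns and the sample line are made up for illustration and are not from the question.

```python
def rename_columns(line: str, n: int) -> str:
    """Apply both replacements to one line and return the result."""
    # str.replace returns a new string, so the second call operates
    # on the output of the first one rather than on the original line.
    return (line.replace("Column_A", f"Data_A{n}_new")
                .replace("Column_B", f"Data_B{n}_new"))


sample = "Column_A,Column_B,Column_C\n"  # hypothetical sample line
result = rename_columns(sample, 3)
print(result, end="")
```

Inside the loop from the question, a single output.write(rename_columns(line, n)) would then write each line exactly once.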
CodePudding user response:
The .write() method appends whatever you pass to it to the end of the file.
line.replace() does not modify line in place. It returns a new string with the content replaced the way you told it to. So you are not modifying the line in the file, only working with strings in memory inside the for loop, and each one you pass to .write() ends up as an extra line in the output.
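A small sketch of both points, using io.StringIO as an in-memory stand-in for the output file. The sample line and the value 1 for n are assumptions for illustration only.

```python
import io

line = "Column_A Column_B\n"  # hypothetical sample line

# replace() returns a new string; the original is left untouched.
replaced = line.replace("Column_A", "Data_A1_new")
unchanged = (line == "Column_A Column_B\n")

# Two write() calls append two separate lines, each with only one
# of the replacements applied, which is the duplication in the question.
buffer = io.StringIO()
buffer.write(line.replace("Column_A", "Data_A1_new"))
buffer.write(line.replace("Column_B", "Data_B1_new"))
written = buffer.getvalue()

print(unchanged)
print(written, end="")
```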
This question already has the solution for replacing a line in a file properly.
CodePudding user response:
Double .replace() is fine. Nevertheless, here's an alternative with regex:
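import re

pattern = re.compile(r"Column_(A|B)")
for n in range(1, 7):
    # Raw f-string so that \g<1> reaches re.sub() unchanged
    repl = rf"Data_\g<1>{n}_new"
    # Raw string so the backslashes in the Windows path are not treated as escapes
    with open(r"C:\Users\...\file.txt", "rt") as fin, \
            open(f"file{n}new.txt", "wt") as fout:
        fout.writelines(pattern.sub(repl, line) for line in fin)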
The re module is part of the standard library. The pattern matches Column_A or Column_B and captures the A or B in a group. The repl string used in .sub() refers to that capture group (\g<1>) to make sure the right letter ends up in the replacement.
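To check the substitution without touching any files, here is a minimal sketch on a single string. The function name rename and the sample line are hypothetical.

```python
import re

pattern = re.compile(r"Column_(A|B)")


def rename(line: str, n: int) -> str:
    # \g<1> inserts whatever the first group captured ("A" or "B").
    # The explicit \g<1> form keeps the group number separate from
    # the digits of n that follow it.
    return pattern.sub(rf"Data_\g<1>{n}_new", line)


sample = "Column_A, Column_B, Column_C\n"  # hypothetical sample line
renamed = rename(sample, 2)
print(renamed, end="")
```

Column_C is left alone because the pattern only allows A or B after the underscore.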