I have a txt file with lines of text like the ones below, and I want to swap the word in quotation marks with the last word, which is separated from the sentence by a tab.
It looks like this:
This "is" a person	are
She was not "here"	right
"The" pencil is not sharpened	a
Desired output:
This "are" a person	is
She was not "right"	here
Some ideas:
#1: Use Numpy
- Separate all the words by whitespace with numpy ->
['This', '"is"', 'a', 'person', '\t', 'are']
Problems:
- How do I tell Python the position of the quoted word?
- How do I convert the list back to normal text? Concatenate everything?
#2: Use Regex
- Use regex to find the word in double quotes ("")
import re

with open('readme.txt', 'r') as x:
    x = x.readlines()
swap = x[-1]
re.findall(r'"(\w+)"', swap)
Problems:
- I don't know how to read the txt file with regex. Most examples I see here assign the entire sentence to a variable. Is it something like this?
with open('readme.txt') as f:
    lines = f.readlines()
lines.findall(....)
Thanks guys
CodePudding user response:
You don't really need re for something this trivial.
Assuming you want to rewrite the file:
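# 'r+' opens the file for both reading and writing
with open('foo.txt', 'r+') as txt:
    lines = txt.readlines()
    for k, line in enumerate(lines):
        words = line.split()
        # Look for the quoted word among everything except the last word
        for i, word in enumerate(words[:-1]):
            if word[0] == '"' and word[-1] == '"':
                words[i] = f'"{words[-1]}"'
                words[-1] = word[1:-1]
                break
        lines[k] = ' '.join(words[:-1]) + f'\t{words[-1]}'
    # Rewind, overwrite the old contents, and cut off any leftover bytes
    txt.seek(0)
    print(*lines, sep='\n', file=txt)
    txt.truncate()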
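The same per-line logic can also be pulled out into a small function, which makes it easy to try on plain strings before letting anything overwrite a file. This is only a minimal sketch, assuming each line ends with a tab followed by the replacement word; the name swap_quoted is made up here for illustration:

```python
def swap_quoted(line):
    """Swap the first double-quoted word with the word after the last tab."""
    # rpartition splits on the last tab; sep is '' when the line has no tab
    sentence, sep, last = line.rstrip('\n').rpartition('\t')
    if not sep:
        return line.rstrip('\n')  # no tab: leave the line unchanged
    words = sentence.split(' ')
    for i, word in enumerate(words):
        # len(word) > 2 skips a lone '"' or an empty '""'
        if len(word) > 2 and word[0] == '"' and word[-1] == '"':
            words[i] = f'"{last}"'
            last = word[1:-1]
            break
    return ' '.join(words) + '\t' + last


print(repr(swap_quoted('This "is" a person\tare')))
```

Lines without a tab or without a quoted word come back unchanged, so the function is safe to map over every line of the file.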
CodePudding user response:
This is my solution:
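import re

# Non-greedy, so only the first quoted word is matched
regex = r'"(.*?)"'

with open('test.txt', 'r') as file1:
    for line in file1:
        line = line.strip()
        if '\t' not in line:
            continue  # skip blank lines and lines without a tab
        sentence, get_tab = line.rsplit('\t', 1)
        match = re.search(regex, sentence)
        if not match:
            continue  # nothing in quotes to swap
        # Put the tab word inside the quotes and move the quoted word to the end
        swapped = re.sub(regex, lambda m: f'"{get_tab}"', sentence, count=1) + '\t' + match.group(1)
        print("original: {} mod ----> {}".format(line, swapped))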
CodePudding user response:
Try:
import re

pat = re.compile(r'"([^"]*)"(.*\t)(.*)')

with open("your_file.txt", "r") as f_in:
    for line in f_in:
        print(pat.sub(r'"\3"\2\1', line.rstrip()))
Prints:
This "are" a person	is
She was not "right"	here
"a" pencil is not sharpened	The
CodePudding user response:
I guess this is also a way to solve it:
Input readme.txt contents:
This "is" a person	are
She was not "here"	right
"The" pencil is not sharpened	a
Code:
import re

changed_text = []
with open('readme.txt') as x:
    for line in x:
        splitted_text = line.strip().split("\t")  # ['This "is" a person', 'are'] etc.
        match = re.search(r'"(.*)"', splitted_text[0])
        if match:  # If a quote is found
            quoted_text = match.group(1)
            # Replace the word together with its quotes, so the same letters
            # elsewhere in the sentence (the "is" in "This") are left alone
            changed_text.append(splitted_text[0].replace(f'"{quoted_text}"', f'"{splitted_text[1]}"', 1) + "\t" + quoted_text)

with open('readme.txt.modified', 'w') as x:
    for line in changed_text:
        print(line)
        x.write(line + "\n")
Result (readme.txt.modified):
This "are" a person	is
She was not "right"	here
"a" pencil is not sharpened	The