Replace change of line with condition in multiple lines

Time:11-28

I have a text file, like this:

3/11/21, 6:13 PM - Gil: X1000
3/11/21, 6:15 PM - Sergio: <Media omitted>
3/11/21, 6:19 PM - Sergio: X400
3/11/21, 6:20 PM - Sergio: Los amigos de vonzo en Francia:

1. La Tóxica
2. El brujo vodoo
3. El/La Zoofilic@
3/11/21, 6:20 PM - Sergio: :V
3/11/21, 6:21 PM - Joan :V: JAJAJAJAJA

Most of the lines start with a date/time that is easy to catch with a regular expression.

I would like to remove the line break whenever the following line does not start with a date/time. I expect something like this (in a new file):

3/11/21, 6:13 PM - Gil: X1000
3/11/21, 6:15 PM - Sergio: <Media omitted>
3/11/21, 6:19 PM - Sergio: X400
3/11/21, 6:20 PM - Sergio: Los amigos de vonzo en Francia: 1. La Tóxica 2. El brujo vodoo 3. El/La Zoofilic@
3/11/21, 6:20 PM - Sergio: :V
3/11/21, 6:21 PM - Joan :V: JAJAJAJAJA

The problem I have is that I'm reading the file as:

        input = open(self.fileName, encoding="utf8" , errors='replace')
        for line in input:
            output.write(re.sub(#SOMETHING))

The trouble is that I can only read one line at a time, and I don't see how to change line n based on a condition in line n+1.

How can I change line n based on a condition in line n+1?

CodePudding user response:

Only write \n when the line starts with a date/time:

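import re

# Raw string so the backslashes reach the regex engine unchanged
datetime_pattern = r'\d{1,2}/\d{1,2}/\d{1,2},\s\d{1,2}:\d{1,2}\s[AP]M'

# input and output are the open file objects from the question
for line in input:
    have_datetime = bool(re.match(datetime_pattern, line))
    if have_datetime:
        # A new message starts here, so end the previous output line first
        # (note: this also writes a newline before the very first message)
        output.write('\n')
    output.write(line.strip('\n'))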
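Note that the loop above concatenates continuation lines directly, whereas the expected output in the question separates them with a space. A minimal sketch of one way to get that spacing follows, assuming blank lines inside a message can be dropped. The function name `merge_continuation_lines` is hypothetical and not part of the original answer.

```python
import re

# Same timestamp pattern as in the answer, precompiled.
DATETIME_PATTERN = re.compile(r'\d{1,2}/\d{1,2}/\d{1,2},\s\d{1,2}:\d{1,2}\s[AP]M')


def merge_continuation_lines(lines):
    """Yield one string per message, joining continuation lines with spaces.

    Hypothetical helper: a line that starts with a timestamp opens a new
    message; any other non-blank line is appended to the current message.
    """
    current = None
    for line in lines:
        text = line.strip()
        if DATETIME_PATTERN.match(text):
            # A new message begins, so emit the one collected so far.
            if current is not None:
                yield current
            current = text
        elif text:
            # Continuation line: glue it on with a single space.
            current = text if current is None else current + ' ' + text
    if current is not None:
        yield current
```

Because it accepts any iterable of lines, it can be fed the open file object directly and each yielded message written out followed by '\n'.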

CodePudding user response:

The with statement is the recommended way to read and write files in Python. We can then read each line, match it against the pattern, and write a newline only before lines that start a new message.

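import re

# Raw string so the backslashes reach the regex engine unchanged
datetime_pattern = r'\d{1,2}/\d{1,2}/\d{1,2},\s\d{1,2}:\d{1,2}\s[AP]M'
# encoding matches the one used in the question; the sample text is not plain ASCII
with open(input_file_path, 'r', encoding='utf8') as infile:
    with open(output_file_path, 'w', encoding='utf8') as outfile:
        for (line_number, line) in enumerate(infile):
            # We don't need a newline character before the first line
            if line_number > 0 and re.match(datetime_pattern, line):
                outfile.write('\n')
            outfile.write(line.strip('\n'))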
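Since the question started from re.sub, here is a sketch of a whole-file alternative, assuming the file is small enough to read into memory at once. It replaces a run of newlines with a single space unless the next line starts with a timestamp. The names `CONTINUATION_BREAK` and `unwrap_messages` are hypothetical.

```python
import re

# Match a run of newlines that is NOT followed by another newline, the end of
# the text, or a timestamp. Only such line breaks sit inside a message.
CONTINUATION_BREAK = re.compile(
    r'\n+(?!\n|\Z|\d{1,2}/\d{1,2}/\d{1,2},\s\d{1,2}:\d{1,2}\s[AP]M)'
)


def unwrap_messages(text):
    """Return text with line breaks inside a message replaced by one space."""
    return CONTINUATION_BREAK.sub(' ', text)
```

This operates on the result of infile.read() rather than on individual lines, which sidesteps the "line n depends on line n+1" problem because the negative lookahead inspects the start of the next line directly.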



            