Regex deselect anything in the line before match

Time:01-23

I need to deselect anything in the line before a certain character. My example looks like this.

28.01.2023 11:00
GERMANY
OWVA2
07.01.2023 06:00
JAPAN
940-750
Laden
28.01.2023 12:00
FRANCE
ANDY
-
07.01.2023 09:30
NORWAY and SWEDEN
-
07.01.2023 09:30
SPAIN

I already have a regex that selects the line following a date (dd.MM.yyyy followed by HH:mm). Now I want to select that line ONLY if the NEXT line is not - or NUMBER-NUMBER. In other words, deselect any line that comes directly before a - or NUMBER-NUMBER line. So in this case I want to select GERMANY, FRANCE and SPAIN. I don't want to select JAPAN and NORWAY and SWEDEN, because they are followed by NUMBER-NUMBER and - respectively.

My regex looks like this:

(?<=\d+.\d+.\d+ \d+:\d+[\r\n]+)([ a-zA-ZäöüÄÖÜßé0-9'-]{3,})+(?![\r\n](\d+)?-(\d+)?)

Sorry for this weird example but that is the best I have. Thanks in advance

Answer:

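Currently your negative lookahead only prevents matching the last character of the line. When the full line is rejected, the engine backtracks and drops that last character; the shortened match is then no longer directly followed by a line break and NUMBER-NUMBER, so the lookahead passes and you get the line minus its final character.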
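The backtracking effect can be reproduced with a small sketch. It is written for Python, which is an assumption (the question does not name a regex flavor). Python's `re` module only accepts fixed-width lookbehinds, so this sketch consumes the date line and reads the name from a capture group instead of using the lookbehind; it also escapes the dots and drops the outer `+`. The names `text` and `trailing` are made up for the illustration.

```python
import re

# Two entries from the question: JAPAN should be rejected, GERMANY accepted.
text = "07.01.2023 06:00\nJAPAN\n940-750\n28.01.2023 11:00\nGERMANY\nOWVA2"

# Negative lookahead placed AFTER the captured line, as in the question.
trailing = re.compile(
    r"^\d+\.\d+\.\d+ \d+:\d+\r?\n"       # date line (consumed, not a lookbehind)
    r"([ a-zA-ZäöüÄÖÜßé0-9'-]{3,})"      # the line we want to select
    r"(?![\r\n](\d+)?-(\d+)?)",          # checked only at the end of the match
    re.MULTILINE,
)

names = [m.group(1) for m in trailing.finditer(text)]
print(names)  # ['JAPA', 'GERMANY']
```

JAPAN is not skipped; it is merely shortened to JAPA, because after giving up the final N the match is followed by a letter rather than a line break.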

You can fix that by placing the lookahead before the content you want to capture, with a leading .* so that it skips to the end of the current line before checking the next one:

(?<=\d+.\d+.\d+ \d+:\d+[\r\n]+)(?!.*[\r\n](\d+)?-(\d+)?)([ a-zA-ZäöüÄÖÜßé0-9'-]{3,})+
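For checking the idea against the sample data from the question, here is a minimal Python sketch. Python is an assumption, since the question does not say which engine is used. Because Python's `re` module requires fixed-width lookbehinds, the sketch consumes the date line and uses a capture group instead of the variable-length lookbehind; the dots are escaped as well. `SAMPLE`, `PATTERN` and `selected` are illustrative names.

```python
import re

SAMPLE = """28.01.2023 11:00
GERMANY
OWVA2
07.01.2023 06:00
JAPAN
940-750
Laden
28.01.2023 12:00
FRANCE
ANDY
-
07.01.2023 09:30
NORWAY and SWEDEN
-
07.01.2023 09:30
SPAIN"""

PATTERN = re.compile(
    r"^\d+\.\d+\.\d+ \d+:\d+\r?\n"      # date line (consumed, not a lookbehind)
    r"(?!.*\r?\n(\d+)?-(\d+)?)"         # reject if the line AFTER this one starts with - or N-N
    r"([ a-zA-ZäöüÄÖÜßé0-9'-]{3,})",    # the line to select
    re.MULTILINE,
)

# Group 3 holds the selected line; groups 1 and 2 belong to the lookahead.
selected = [m.group(3) for m in PATTERN.finditer(SAMPLE)]
print(selected)  # ['GERMANY', 'FRANCE', 'SPAIN']
```

Since the lookahead is evaluated at the start of the line, backtracking inside the character class can no longer rescue a partial match: the lookahead verdict is the same however many characters the group keeps.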
