Python Regex for Negative Lookaround for multiline content

Time:05-30

I am new to Python. I am working on a LaTeX file that contains a lot of math, programming code, etc. I replace every run of multiple spaces with a single space, but I need to skip certain parts of the document. For example:

Normal text such as "Hai, I am    New to       Python" becomes "Hai, I am New to Python" once multiple spaces are replaced by a single space. This regex is applied to the whole document, but I need it to leave multiple spaces alone inside certain LaTeX environments. For example:

Hai, I am    New to       Python
\begin{lstlisting}[title=Sample]
      print("Hai, I am    New to       Python")
      def Code(a):
          print(a)
      Code("Hai, i am new to Perl")
\end{lstlisting}

After running my code, multiple spaces were also collapsed to a single space between \begin{lstlisting} and \end{lstlisting}:

Hai, I am New to Python
\begin{lstlisting}[title=Sample]
 print("Hai, I am New to Python")
 def Code(a):
 print(a)
 Code("Hai, i am new to Perl")
\end{lstlisting}

How can I make the regex skip everything between \begin{lstlisting} and \end{lstlisting}?

CodePudding user response:

A proper LaTeX parser is the way to go but this may be a 'good enough' solution. See what you think.

import re

text = '''
Hai, I am    New to       Python
\\begin{lstlisting}[title=Sample]
      print("Hai, I am    New to       Python")
      def Code(a):
          print(a)
      Code("Hai, i am new to Perl")
\\end{lstlisting}
'''
  
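# Collapse each run of two or more spaces into one space, unless the run is
# inside a listing. Matching only a pair of spaces would halve longer runs
# instead of reducing them to a single space, so " {2,}" is used here.
text = re.sub(r' {2,}(?!(?:(?!\\begin\{lstlisting\}).)*\\end\{lstlisting\})', ' ', text, flags=re.DOTALL)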

print(text)

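It works by refusing to replace spaces when \end{lstlisting} appears later in the string with no \begin{lstlisting} before it, because that means the spaces are inside a listing. The re.DOTALL flag lets . match newlines, so the lookahead can scan across lines.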
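If the nested lookahead is hard to read or slow on a large file, a different technique (not the one used above) is to split the document on whole lstlisting blocks and collapse spaces only in the text between them. The following is a minimal sketch under that assumption; collapse_spaces and LISTING are hypothetical names chosen for illustration, and it assumes listings are not nested and every \begin{lstlisting} has a matching \end{lstlisting}.

```python
import re

# Matches one whole lstlisting environment, non-greedily and across newlines.
# The capturing group makes re.split keep the listings in its result.
LISTING = re.compile(r'(\\begin\{lstlisting\}.*?\\end\{lstlisting\})', re.DOTALL)

def collapse_spaces(text):
    """Collapse runs of spaces outside lstlisting blocks (hypothetical helper)."""
    parts = LISTING.split(text)
    # With a single capturing group, even-indexed parts are ordinary text
    # and odd-indexed parts are the listings, which are left untouched.
    for i in range(0, len(parts), 2):
        parts[i] = re.sub(r' {2,}', ' ', parts[i])
    return ''.join(parts)

sample = (
    'Hai, I am    New to       Python\n'
    '\\begin{lstlisting}[title=Sample]\n'
    '      print("Hai, I am    New to       Python")\n'
    '\\end{lstlisting}\n'
    'More    text\n'
)
print(collapse_spaces(sample))
```

Each piece of text is scanned once, and other verbatim-style environments could be handled by extending the LISTING pattern.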
