Problem executing regex with a shell command inside Python program

Time:02-11

I am trying to write a Python script to automate some tasks in the event of a high number of inbound connections to the server.

One part of it collects all the IPv4 addresses on the server.

The following command lists them:

# ip a s eth0 | egrep -o 'inet [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | cut -d' ' -f2
198.168.1.2
198.168.1.3
198.168.1.4

The problem appears when I run the shell regex part from inside the Python script.

#!/usr/bin/env python3
import os
ipv4_regex='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
os_cmd = 'ip a s eth0 | egrep -o 'inet ipv4_regex' | cut -d' ' -f2'
os.system(os_cmd)

But running it produces an error:

# ./dos_fix.py
  File "./dos_fix.py", line 4
    os_cmd = 'ip a s eth0 | egrep -o 'inet ipv4_regex' | cut -d' ' -f2'
                                     ^

Thinking the quotes or whitespace between egrep and the pipe might be the cause, I tried escaping the inner quotes with backslashes, but had no luck.

#!/usr/bin/env python3
import os
ipv4_regex='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
os_cmd = 'ip a s eth0 | egrep -o \'inet ipv4_regex\' | cut -d\' \' -f2'
os.system(os_cmd)

What am I missing here?

CodePudding user response:

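Use double quotes for the outer Python string so the single quotes meant for the shell need no escaping, and make it an f-string so that ipv4_regex is actually substituted into the command.

For the regular expression, use a raw string: r'contents'.

os.system only returns the exit status and cannot capture the command's output. subprocess is much better.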

#!/usr/bin/env python3
import os

ipv4_regex = r'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
os_cmd = f"ip a s eth0 | egrep -o 'inet {ipv4_regex}' | cut -d' ' -f2"
os.system(os_cmd)
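
As a rough sketch of the subprocess route mentioned above, the same pipeline can be handed to subprocess.run with shell=True so that its output comes back as a string instead of going straight to the console. The helper names build_cmd and list_ipv4 are made up for illustration, grep -E stands in for egrep, and shlex.quote is applied to the interface name on the assumption that it might come from outside the script.

```python
import shlex
import subprocess

IPV4_REGEX = r'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'


def build_cmd(iface="eth0"):
    """Build the shell pipeline; hypothetical helper, not part of any library."""
    # Double quotes on the Python side, so the shell's single quotes
    # can be written without escaping.
    return (
        f"ip a s {shlex.quote(iface)} "
        f"| grep -Eo 'inet {IPV4_REGEX}' "
        f"| cut -d' ' -f2"
    )


def list_ipv4(iface="eth0"):
    """Run the pipeline through a shell and return the addresses as a list."""
    result = subprocess.run(
        build_cmd(iface), shell=True, capture_output=True, text=True
    )
    return result.stdout.split()
```

Calling list_ipv4() returns whatever addresses the pipeline printed, one list item per line of output.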

CodePudding user response:

The subprocess you are trying to execute has no idea that ipv4_regex is a Python variable, or what its value might be. You are simply searching for the literal text ipv4_regex.

A much better solution anyway is to do as much of the processing as possible in Python.

import subprocess
import re

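# The group captures only the address, without the /prefix length after it
ipv4_regex = re.compile(r'inet ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})')

# Run ip directly (no shell) and scan its output line by line
for line in subprocess.run(
    ['ip', 'a', 's', 'eth0'],
    capture_output=True, text=True,
    check=True
).stdout.splitlines():
    m = ipv4_regex.search(line)
    if m:
        print(m.group(1))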

(I had to guess a bit what your script is supposed to do, since the syntax you used is invalid.)

Like the os.system documentation already tells you, subprocess is much more versatile, and generally recommended over the bare legacy os.system. In particular, if you want the output of the subprocess to be made available to Python, and not just displayed on the console, you need to use subprocess. It also gives you more control over the subprocess, including the ability to not use a shell when you don't need one. (See also Actual meaning of shell=True in subprocess)
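
To illustrate the point about making the output available to Python, here is a minimal sketch that returns the addresses as a list rather than printing them. The function names parse_ipv4 and interface_ipv4 are hypothetical; keeping the parsing separate from the subprocess call is just one way to make it checkable against sample text without running ip.

```python
import re
import subprocess

# The capturing group keeps only the address, dropping the "inet " prefix.
INET_RE = re.compile(r'inet ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})')


def parse_ipv4(ip_output):
    """Return every IPv4 address found in `ip a s` style text."""
    return INET_RE.findall(ip_output)


def interface_ipv4(iface="eth0"):
    """Run `ip` without a shell and return the parsed addresses."""
    result = subprocess.run(
        ["ip", "a", "s", iface],
        capture_output=True, text=True, check=True,
    )
    return parse_ipv4(result.stdout)
```

Because the command is passed as a list, no shell is involved and no quoting of the interface name is needed.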

Notice the use of a "raw" string to get the actual backslashes into the regex; Python does its own backslash processing on strings in source code, so you have to double the backslashes to put a literal backslash in a string if you don't use the r'...' syntax.
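
A small check of the raw-string point, using only a fragment of the pattern. The variable names are purely illustrative.

```python
import re

raw = r'[0-9]{1,3}\.'        # raw string: the backslash is kept as written
doubled = '[0-9]{1,3}\\.'    # normal string: \\ in source is one backslash

# Both spellings produce the identical string.
same = raw == doubled

# An escaped dot matches only a literal dot; a bare dot matches any character.
escaped_hits = re.findall(r'1\.2', '1.2 1x2')
bare_hits = re.findall(r'1.2', '1.2 1x2')
```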

If your target platform is Linux, probably a better solution still is to extract the information directly in machine-readable format from the /proc filesystem.
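
One possible reading of the /proc suggestion, offered as an assumption rather than the answer's intended method: on Linux, /proc/net/fib_trie lists the kernel's IPv4 routing trie, where local addresses appear as a "|-- address" line followed by "/32 host LOCAL". The sketch below parses text in that assumed layout; the function names are hypothetical. Note that it reports every local address, including 127.0.0.1, and does not filter by interface the way ip a s eth0 does.

```python
def local_ipv4_addresses(fib_trie_text):
    """Return addresses marked '/32 host LOCAL' in fib_trie-style text."""
    addresses = []
    candidate = None
    for raw_line in fib_trie_text.splitlines():
        line = raw_line.strip()
        if line.startswith('|-- '):
            # Leaf entry: remember the address until its type line is seen.
            candidate = line[4:].strip()
        elif line.startswith('/32 host LOCAL') and candidate:
            # The file repeats entries per routing table, so skip duplicates.
            if candidate not in addresses:
                addresses.append(candidate)
    return addresses


def read_local_ipv4(path='/proc/net/fib_trie'):
    """Read the file (Linux only, path assumed) and parse it."""
    with open(path) as handle:
        return local_ipv4_addresses(handle.read())
```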
