I have two files that I want to check against each other: for each entry in one file, I want to pull out the line of the other file that contains it.
I was trying to do it with regex but I keep getting this error. I presume it's because I'm reading from a file rather than passing a string directly:
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2544.0_x64__qbz5n2kfra8p0\lib\re.py", line 248, in finditer
return _compile(pattern, flags).finditer(string)
TypeError: expected string or bytes-like object
This is what I'm using for the regex search
import re

regex = r"\d(.*(10.0.0.0).*)"
with open('test1.txt', 'r') as file1:
    test = file1.read().splitlines()
    matches = re.finditer(regex, test)
    for matchNum, match in enumerate(matches, start=1):
        print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
I have also tried this, but it still comes up with an error:
with open("test1.txt") as file1, open("test2") as file2:
    st = set(map(str.rstrip,file1))
    for line in file2:
        spl = line.split(None, 1)[0]
        if spl in st:
            print(line.rstrip())
The error is
IndexError: list index out of range
I am trying to match a list of IPs to the output from a router, so the test2 file will look like this
10.0.0.0/8
11.0.0.0/8
12.0.0.0/8
13.0.0.0/8
With the router output looking like
1 X test 10.0.0.0/8 nov/19/2021 13:03:08
2 X test 11.0.0.0/8 nov/19/2021 13:03:08
3 X test 12.0.0.0/8 nov/19/2021 13:03:08
4 X test 13.0.0.0/8 nov/19/2021 13:03:08
I want the entire line from the router to be printed when it matches on just the IP, rather than having to spell out the entire expected line.
Hope this is enough to go on. I'm still pretty new to all this, cheers.
CodePudding user response:
If you have a file with the actual IPs, and you don't need regexes, then you can do something like this
my_str = """
1 X test 10.0.0.0/8 nov/19/2021 13:03:08
2 X test 11.0.0.0/9 nov/19/2021 13:03:08
3 X test 12.0.0.0/12 nov/19/2021 13:03:08
4 X test 13.0.0.0/2 nov/19/2021 13:03:08
5 X test 555.0.0.0/2 nov/19/2021 13:03:08 #expecting this to not be printed
"""
keep_ips = [' 10.0.0.0/8 ', ' 11.0.0.0/9 ', ' 12.0.0.0/12 ', ' 13.0.0.0/2 ']
for line in my_str.split('\n'):
    if any(ip in line for ip in keep_ips):
        print(line)
I added the padding of spaces in keep_ips because otherwise you could match on something like 113.0.0.0/25, which contains the substring 13.0.0.0/2.
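To see why the padding matters, here is a small check. The sample line is made up for illustration and is not real router output.

```python
# Illustrative values only; this line is not from a real router.
line = "7 X test 113.0.0.0/25 nov/19/2021 13:03:08"

unpadded = "13.0.0.0/2"
padded = " 13.0.0.0/2 "

# The bare IP is found inside 113.0.0.0/25, which is a false match.
print(unpadded in line)
# The padded IP needs a space directly before "13", so it is not found.
print(padded in line)
```

One caveat: a padded IP will not match if it sits at the very start or very end of a line, because there is no surrounding space there.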
You could refactor this code to read in the lines from the router, then the lines with the IPs, pad each IP with a space on either side, and then apply the same any() check.
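If you would still rather use a regex, the same per-line idea applies. re.finditer expects a single string, while splitlines() returns a list, and that is what raises the TypeError in the question. Below is a rough sketch that searches each line separately; find_lines_with_ip and the sample lines are hypothetical names and data, not something from your files.

```python
import re

def find_lines_with_ip(lines, ip):
    """Return the lines that contain ip as a whole whitespace-delimited field."""
    # re.escape keeps the dots literal. The lookarounds require whitespace
    # or the start/end of the line on either side of the IP.
    pattern = re.compile(r"(?<!\S)" + re.escape(ip) + r"(?!\S)")
    return [line for line in lines if pattern.search(line)]

# Made-up sample data for illustration.
router_lines = [
    "1 X test 10.0.0.0/8 nov/19/2021 13:03:08",
    "2 X test 110.0.0.0/8 nov/19/2021 13:03:08",
]
for line in find_lines_with_ip(router_lines, "10.0.0.0/8"):
    print(line)
```

Escaping the IP matters because an unescaped dot in a pattern matches any character, so 10.0.0.0 would also match text such as 10a0b0c0.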
I hope this works for you now
CodePudding user response:
The full answer is:
with open("router_output.txt") as file1, open("list_of_ips.txt") as file2:
    ips_to_keep = file2.read().splitlines()
    router_lines = file1.read().splitlines()

# Pad each IP with spaces so it only matches as a whole field
ips_to_keep = [" " + ip + " " for ip in ips_to_keep]
for line in router_lines:
    if any(ip in line for ip in ips_to_keep):
        print(line)
assuming that your file has spaces and not tabs :)
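If the router output might use tabs, or contain blank lines (a blank line is what makes line.split(None, 1)[0] raise the IndexError in the question), you could compare whole whitespace-separated fields against a set instead of padding. This is only a sketch; matching_lines and the sample data are made up for illustration.

```python
def matching_lines(router_lines, ips):
    """Return router lines where at least one whitespace-separated field is in ips."""
    # Drop blank entries so an empty line in the IP file never matches.
    wanted = {ip.strip() for ip in ips if ip.strip()}
    # str.split() with no arguments splits on any run of whitespace
    # (spaces or tabs) and returns [] for a blank line.
    return [line for line in router_lines if wanted.intersection(line.split())]

# Made-up sample data for illustration.
router_lines = [
    "1 X test 10.0.0.0/8 nov/19/2021 13:03:08",
    "2\tX\ttest\t11.0.0.0/8\tnov/19/2021\t13:03:08",
    "",
    "3 X test 113.0.0.0/8 nov/19/2021 13:03:08",
]
ips = ["10.0.0.0/8", "11.0.0.0/8", "13.0.0.0/8"]
for line in matching_lines(router_lines, ips):
    print(line)
```

Because whole fields are compared, 113.0.0.0/8 is not mistaken for 13.0.0.0/8, and an IP at the start or end of a line still matches.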