Trying to extract all IP addresses from a file using Python

Time:09-06

noob here.

I'm trying to extract all IP addresses from a text file where the addresses appear in random positions, sometimes with more than one on a line. The contents look like this:

hello9.9.9.9                     hdi3ohdoi3hoi3oid2         10.3.2.3            2.3.4.5
ddjeijfdeio
eifhoehjwiehfiowe
uiewhduihewiudhue
8.8.8.8, 20.20.20.20
de2hd9j39ud9829d8 192.168.10.24

My code only returns the first IP address on each line:

['9.9.9.9', '8.8.8.8', '192.168.10.24']

My code is below:


import re

with open('C:/testfile.txt') as fh:
    file = fh.readlines()

pattern = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})')

lst = []

for line in file:
    match = pattern.search(line)
    if match is not None:
        lst.append(match[0])

print(lst)

Any ideas how I can list all the IPs?

Thanks in advance

J

CodePudding user response:

Use pattern.findall() to find all occurrences. pattern.search() stops at the first occurrence on each line.

import re

with open('testfile.txt') as fh:
    file = fh.readlines()

pattern = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})')

lst = []
for line in file:
    match = pattern.findall(line)
    if len(match) != 0:
        lst.extend(match)
print(lst)

output:

['9.9.9.9', '10.3.2.3', '2.3.4.5', '8.8.8.8', '20.20.20.20', '192.168.10.24']
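
To see the difference side by side, here is a minimal sketch that runs both methods on one of the sample lines from the question. The variable names `first` and `every` are only illustrative.

```python
import re

pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
line = 'hello9.9.9.9   hdi3ohdoi3hoi3oid2   10.3.2.3   2.3.4.5'

# search() returns a single match object for the first hit (or None)
first = pattern.search(line)

# findall() returns a list with every non-overlapping match
every = pattern.findall(line)

print(first.group(0))  # 9.9.9.9
print(every)           # ['9.9.9.9', '10.3.2.3', '2.3.4.5']
```

Because findall() returns an empty list when nothing matches, extending a result list with it is safe even for lines without any address.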

CodePudding user response:

You can use re.findall for that:

import re

data = """
hello9.9.9.9                     hdi3ohdoi3hoi3oid2         10.3.2.3            2.3.4.5
ddjeijfdeio
eifhoehjwiehfiowe
uiewhduihewiudhue
8.8.8.8, 20.20.20.20
de2hd9j39ud9829d8 192.168.10.24
"""

result = re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', data)
print(result)

Result:

['9.9.9.9', '10.3.2.3', '2.3.4.5', '8.8.8.8', '20.20.20.20', '192.168.10.24']
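
One caveat: this pattern only checks the shape of the text, so it would also accept something like 999.1.1.1. If that matters for your data, one possible approach (a sketch, not part of the original answer) is to filter the candidates through the standard library's ipaddress module. The function name `extract_ipv4` is made up for this example.

```python
import ipaddress
import re

# Same shape as the pattern above, written with a repeated non-capturing group
CANDIDATE = re.compile(r'\d{1,3}(?:\.\d{1,3}){3}')

def extract_ipv4(text):
    """Return the regex candidates that ipaddress accepts as IPv4 addresses."""
    found = []
    for candidate in CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            # Not a valid address (for example an octet above 255); skip it
            continue
        found.append(candidate)
    return found

print(extract_ipv4('8.8.8.8, 999.1.1.1 and 192.168.10.24'))
# ['8.8.8.8', '192.168.10.24']
```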

CodePudding user response:

The pattern.search method only returns the first occurrence on each line,

so I suggest using the findall method instead. Your code could look like this:


import re

with open('C:/testfile.txt') as fh:
    file = fh.readlines()

pattern = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})')

lst = []

for line in file:
    # findall returns a list of every match on the line (empty if none)
    match = pattern.findall(line)
    # equivalent: match = re.findall(pattern, line)
    for x in match:  # iterate and add each address
        lst.append(x)
    # or: lst.extend(match)  # adds all items of the list at once

print(lst)
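
If you want to reuse this, the loop can be wrapped in a small helper. This is just a sketch: the name `extract_ips` is hypothetical, and it accepts any iterable of lines so it can be tried on a list without needing the file.

```python
import re

IP_PATTERN = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def extract_ips(lines):
    """Collect every pattern match from an iterable of text lines."""
    found = []
    for line in lines:
        # findall gives an empty list for lines without a match,
        # so extend() is safe without an extra check
        found.extend(IP_PATTERN.findall(line))
    return found

sample = [
    '8.8.8.8, 20.20.20.20\n',
    'ddjeijfdeio\n',
    'de2hd9j39ud9829d8 192.168.10.24\n',
]
print(extract_ips(sample))
# ['8.8.8.8', '20.20.20.20', '192.168.10.24']
```

An open file object is also an iterable of lines, so inside a `with open(...) as fh:` block you can call `extract_ips(fh)` directly instead of using readlines().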