How do I split between two hashes for desired output in Python

Time:12-03

To begin, I want a list of all unique IPs in this file.

Here is part of the file I'm reading in Python:

[node1] - 190.223.252.106 - User Successful Login

[node2] - 239.84.157.20 - User Successful Profile Picture Upload

[node2] - 87.130.185.37 - User Successful Login

[node6] - 210.155.211.219 - User Successful Payment

[node5] - 64.103.198.103 - User Successful Login

My Code:

def UniqueIP(fileparm):
    counter = 0
    input_file = open(fileparm, 'r')
    file_contents = input_file.read()
    input_file.close()
    ip_list = file_contents.split()
    unique_ip = set(ip_list)
    for ip in unique_ip:
        counter += 1
        print(str(counter) + ': ' + str(ip) + "\n")

I have a good start, but my output looks like the sample below. I'm mostly getting IPs, but pieces of the rest of each line also show up at times. I just want to split on the '-' and output only the IPs.

29: 191.219.189.162

30: [node3]

31: 21.147.6.59

32: 55.160.104.8

CodePudding user response:

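Calling split() with no arguments splits the whole file on whitespace, so every token ([node3], -, User, and so on) lands in the set along with the IPs. Instead, iterate over every line and split each one on the dash: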

unique_ips = set()
with open("path/to/file", "r", encoding="utf-8") as file:
  for line in file:
    line_parts = line.split("-", maxsplit=2)
    if len(line_parts) > 2:
      ip = line_parts[1].strip()  # drop the spaces left around the IP
      # Maybe you'd want to check if it's an IP here
      # if is_ip(ip):
      unique_ips.add(ip)

Then you can iterate over the set:

for index, ip in enumerate(unique_ips):
  print(f"{index + 1}: {ip}")

Before adding an IP to the set, I would also validate that it is in fact an IP, i.e. that it consists of exactly four numbers between 0 and 255 separated by dots:

def is_ip(some_str):
  try:
    bvalues = list(map(int, some_str.split(".")))
  except ValueError:
    # Some of the stuff couldn't be parsed into int
    return False
  return all(0<=val<=255 for val in bvalues) and len(bvalues) == 4

(Just make sure to define this function before the code that calls it.)
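As an alternative to the hand-written check, the standard library's ipaddress module can do the validation. This is only a minimal sketch: the name is_ipv4 is hypothetical (it is not part of the answer above), and it assumes IPv4 addresses in dotted form, as in the sample log.

```python
import ipaddress

def is_ipv4(some_str):
    """Return True if some_str parses as a dotted IPv4 address."""
    try:
        # strip() because splitting on "-" leaves spaces around the IP
        ipaddress.IPv4Address(some_str.strip())
    except ValueError:
        # Raised for anything that is not a valid IPv4 address
        return False
    return True
```

It could be called in place of is_ip in the loop above, e.g. `if is_ipv4(ip): unique_ips.add(ip)`.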

CodePudding user response:

If the lines always have the same layout, with a - before and after the IP address, you can split on that character, select the second element, and then strip the surrounding spaces:

x = "[node1] - 190.223.252.106 - User Successful Login"
x.split('-')[1].strip()
# 190.223.252.106

However, if there is more variation, you may be better off using a regular expression to match the IP address specifically.
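A rough sketch of the regular-expression approach, using the sample lines from the question. The names IPV4_PATTERN and extract_unique_ips are made up for illustration, and the pattern is deliberately loose: it matches four dot-separated groups of one to three digits and does not check that each number is in the 0-255 range.

```python
import re

# Loose pattern: four groups of 1-3 digits separated by dots.
# It does NOT verify that each group is between 0 and 255.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_unique_ips(lines):
    """Collect every IP-looking token found in an iterable of lines."""
    unique_ips = set()
    for line in lines:
        unique_ips.update(IPV4_PATTERN.findall(line))
    return unique_ips

sample = [
    "[node1] - 190.223.252.106 - User Successful Login",
    "[node2] - 87.130.185.37 - User Successful Login",
    "[node2] - 87.130.185.37 - User Successful Payment",
]

# sorted() only makes the printed order repeatable; sets are unordered
for index, ip in enumerate(sorted(extract_unique_ips(sample)), start=1):
    print(f"{index}: {ip}")
```

Because an open file is itself an iterable of lines, the same function should also accept a file object opened with `with open(...)`.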
