I am reading a file using:
def readFile():
    file = open('Rules.txt', 'r')
    lines = file.readlines()
    for line in lines:
        rulesList.append(line)
rulesList:
['\n', "Rule(F1, HTTPS TCP, ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443)\n", '\n', "Rule(F2, HTTPS TCP, ['ip', 'ip'], ['75.2.18.233'], 443)\n", '\n']
My file looks like:
Rule(F1, HTTPS TCP, ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443)
Rule(F2, HTTPS TCP, ['ip', 'ip'], ['ip'], 443)
I would like to feed the values to a class I created:
class Rule:
    def __init__(self, flowNumber, protocol, port, fromIP=[], toIP=[]):
        self.flowNumber = flowNumber
        self.protocol = protocol
        self.port = port
        self.fromIP = fromIP
        self.toIP = toIP

    def __repr__(self):
        return f'\nRule({self.flowNumber}, {self.protocol}, {self.fromIP}, {self.toIP}, {self.port})'
newRule = Rule(currentFlowNum, currentProtocol, currentPort, currentFromIP, currentToIP)
to get an output such as:
[F1, HTTPS TCP, ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443]
or be able to assign these values to a variable like:
currentFlowNum = F1, currentProtocol = 'HTTPS TCP' , currentPort = 443, currentFromIP = ['ip', 'ip'], currentToIP = ['www.google.ca', '8.8.8.8']
I tried:
for rule in rulesList:
    if rule != '\n':
        tmp = rule.split(',')
        print(tmp)
tmp:
['Rule(F1', ' HTTPS TCP', " ['ip'", " 'ip']", " ['www.google.ca'", " '8.8.8.8']", ' 443)\n']
['Rule(F2', ' HTTPS TCP', " ['ip'", " 'ip']", " ['ip']", ' 443)\n']
Is there a way to avoid splitting on the commas inside the square brackets? That is, I would like the output to look like:
['Rule(F1', ' HTTPS TCP', " ['ip','ip']", " ['www.google.ca', '8.8.8.8']", ' 443)\n']
['Rule(F2', ' HTTPS TCP', " ['ip','ip']", " ['ip']", ' 443)\n']
Answer:
If you have control over how the data in the file is stored and can replace the single quotes (') with double quotes (") to make the "list" structures valid JSON, you could use RegExp for this.
A word of caution: unless you are absolutely sure that the format you'll be reading will largely remain the same and is rather inflexible, you're better off storing this data in a more well-established format (as mentioned in the comments) like JSON, YAML, etc. There are so many edge cases that could happen here that rolling your own parser like this is objectively suboptimal.
import re
import json
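# Five named groups; the two list fields are captured as raw "[...]" text
# and decoded with json.loads, which is why they must use double quotes.
RULE_PATTERN = re.compile(
    r'Rule\((?P<flow_number>[^,]+),\s(?P<protocol>[^,]+),\s'
    r'(?P<from_ip>\[[^\]]+\]),\s(?P<to_ip>\[[^\]]+\]),\s(?P<port>[^,)]+)\)'
)

def readFile():
    myRules = []
    with open('Rules.txt', 'r') as file:
        for line in file:
            match = RULE_PATTERN.match(line)
            if match:  # blank lines and anything that is not a rule are skipped
                # Rule is the class defined in the question
                myRules.append(Rule(match.group('flow_number'), match.group('protocol'), match.group('port'), json.loads(match.group('from_ip')), json.loads(match.group('to_ip'))))
    return myRules

print(readFile())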
# Returns:
# [
# Rule(F1, HTTPS TCP, ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443),
# Rule(F2, HTTPS TCP, ['ip', 'ip'], ['ip'], 443)]
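If the file cannot be changed and keeps its single quotes, one possible variation is to split only on the commas outside the square brackets (which is what the question asks for) and decode the list fields with ast.literal_eval instead of json.loads. This is a minimal sketch and not part of the original answer: the helper name parse_rule_line and the pattern name TOP_LEVEL_COMMA are made up for illustration, and it assumes each line holds exactly five fields with no nested lists.

```python
import ast
import re

# Hypothetical helper pattern: matches a comma (plus trailing spaces) unless a
# closing "]" comes before any other bracket, i.e. unless the comma sits
# inside a flat [...] list. Nested lists are not handled.
TOP_LEVEL_COMMA = re.compile(r',\s*(?![^\[\]]*\])')

def parse_rule_line(line):
    """Return (flow, protocol, from_ip, to_ip, port), or None for non-rule lines."""
    line = line.strip()
    if not (line.startswith('Rule(') and line.endswith(')')):
        return None
    # Drop the leading "Rule(" and the trailing ")" before splitting.
    fields = TOP_LEVEL_COMMA.split(line[len('Rule('):-1])
    if len(fields) != 5:
        return None
    flow, protocol, from_ip, to_ip, port = (field.strip() for field in fields)
    # ast.literal_eval accepts Python literals, so single-quoted strings are fine.
    return flow, protocol, ast.literal_eval(from_ip), ast.literal_eval(to_ip), int(port)

print(parse_rule_line("Rule(F1, HTTPS TCP, ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443)\n"))
# ('F1', 'HTTPS TCP', ['ip', 'ip'], ['www.google.ca', '8.8.8.8'], 443)
```

The returned tuple follows the order of the fields in the file, so it would need to be reordered when passed to the Rule constructor from the question, which takes port as its third argument.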