I'm having trouble extending a script that cleans up the lines of a TXT file. Here is a sample of the file:
Fri Oct 14 22:27:49.100 EDT
Interface Status Protocol Description
--------------------------------------------------------------------------------
Lo0 up up Loopback0 interface configured by Netmiko
Lo55 up up
Lo100 up up ***MERGE LOOPBACK 100****
Lo111 up up Configured by NETCONF
Nu0 up up
Mg0/RP0/CPU0/0 up up DO NOT TOUCH THIS !
Gi0/0/0/0 admin-down admin-down ANSIBLE NXOS TEST
Gi0/0/0/1 admin-down admin-down test
Gi0/0/0/1.100 admin-down admin-down
Gi0/0/0/2 admin-down admin-down Link to P2 configured by Netmiko
Gi0/0/0/3 up up Configured by Ansible !!!!!!!!
Gi0/0/0/4 up up Updated by Ansible using Jinja Template
Gi0/0/0/5 up up Configured by Ansible !!!!!!
Gi0/0/0/6 admin-down admin-down Updated by Ansible using Jinja Template
Gi0/0/0/6.11 admin-down admin-down
Lo20 admin-down admin-down
Lo22 up up Loopback para pruebas
[K --More-- [KLo69 admin-down admin-down
Gi0/3/3/4 up up A SDH
Gi0/3/3/4.852 up up TMU a Red BIT
[K --More-- [KGi0/3/3/4.853 up up Configured by Ansible !!!!!!
Gi0/3/4/2.256 up up Frontera Cliente A
Gi0/3/4/2.257 up up Frontera Cliente B
[K --More-- [KGi0/3/4/2.261 up up Frontera Cliente C
Te0/7/0/3 admin-down admin-down
Mg0/RP0/CPU0/0 down down
Mg0/RP1/CPU0/0 admin-down admin-down
[KRP/0/RP0/CPU0:ROUTER1#
and the script is as follows:
import re

list_txt = [ruta/"prueba.txt"]
for txt in list_txt:
    with open(txt, "r") as f:
        lines = f.readlines()
    with open(txt, "w") as fw:
        for line in lines:
            # Skip hyphen rules, blank lines and the date line
            if not re.match(r"-{5}|\s+|([A-Za-z0-9]+( [A-Za-z0-9]+)+)", line):
                fw.write(line)
With this script I can delete the date line at the top, the blank lines, and the lines made up only of hyphens. The problem is that I am trying to add two things:
1 - Extend the regex so that any line containing the word "CPU" is deleted, for example:
Mg0/RP0/CPU0/0 down down
Mg0/RP1/CPU0/0 admin-down admin-down
[KRP/0/RP0/CPU0:ROUTER1#
2 - I also need to remove the strange prefix that appears on some lines, such as:
[K --More-- [KLo69 admin-down admin-down
and make it clean like this:
Lo69 admin-down admin-down
I tried to do the latter with txt.lstrip("[K"), but it had no effect, so I must be using it incorrectly. I also can't get the regex right to add the word CPU; I'm not clear on how to build the regex.
Ideally, everything would be added to the existing script to keep things simple. Could you give me a hand, please?
CodePudding user response:
I don't think regular expressions really make a difference here, except maybe to parse the headings so you can know at which index the data for a given column will start:
import re

spans = []
result = []
with open(txt, "r") as f:
    it = iter(f.readlines())
    # Skip lines until headings are found
    # (two or more consecutive spaces; the date line only has single spaces)
    for line in it:
        if "  " in line:
            break
    # Get start/end indices of each column
    spans = [
        m.span()
        for m in re.finditer(r"\S+\s*", line)
    ]
    # Last column runs to the end of the line (minus the newline)
    spans[-1] = (spans[-1][0], -1)
    next(it)  # Skip line with just hyphens
    # Iterate the real data part of the input
    for line in it:
        if line.startswith("[K --More--"):
            # Get content part of this line
            line = line[line.index("[K", 10) + 2:]
        elif line.startswith("[K") or "/CPU" in line:
            continue  # Not interested
        result.append(tuple(line[start:end].rstrip() for start, end in spans))

# Output what was collected:
for record in result:
    print(record)
The output for the given sample file is:
('Lo0', 'up', 'up', 'Loopback0 interface configured by Netmiko')
('Lo55', 'up', 'up', '')
('Lo100', 'up', 'up', '***MERGE LOOPBACK 100****')
('Lo111', 'up', 'up', 'Configured by NETCONF')
('Nu0', 'up', 'up', '')
('Gi0/0/0/0', 'admin-down', 'admin-down', 'ANSIBLE NXOS TEST')
('Gi0/0/0/1', 'admin-down', 'admin-down', 'test')
('Gi0/0/0/1.100', 'admin-down', 'admin-down', '')
('Gi0/0/0/2', 'admin-down', 'admin-down', 'Link to P2 configured by Netmiko')
('Gi0/0/0/3', 'up', 'up', 'Configured by Ansible !!!!!!!!')
('Gi0/0/0/4', 'up', 'up', 'Updated by Ansible using Jinja Template')
('Gi0/0/0/5', 'up', 'up', 'Configured by Ansible !!!!!!')
('Gi0/0/0/6', 'admin-down', 'admin-down', 'Updated by Ansible using Jinja Template')
('Gi0/0/0/6.11', 'admin-down', 'admin-down', '')
('Lo20', 'admin-down', 'admin-down', '')
('Lo22', 'up', 'up', 'Loopback para pruebas')
('Lo69', 'admin-down', 'admin-down', '')
('Gi0/3/3/4', 'up', 'up', 'A SDH')
('Gi0/3/3/4.852', 'up', 'up', 'TMU a Red BIT')
('Gi0/3/3/4.853', 'up', 'up', 'Configured by Ansible !!!!!!')
('Gi0/3/4/2.256', 'up', 'up', 'Frontera Cliente A')
('Gi0/3/4/2.257', 'up', 'up', 'Frontera Cliente B')
('Gi0/3/4/2.261', 'up', 'up', 'Frontera Cliente C')
('Te0/7/0/3', 'admin-down', 'admin-down', '')
I might have missed some of your requirements, but I think with this approach it is easy to adapt.
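To make the column-index idea easier to try in isolation, here is a minimal sketch that works on in-memory strings instead of a file. The helper names (make_line, column_spans, split_row) and the column widths 15/12/12 are assumptions made for illustration; the real file's widths will differ, which is exactly why the spans are read from the heading line rather than hard-coded.

```python
import re

def make_line(a, b, c, d=""):
    # Hypothetical helper: builds a padded, column-aligned line for the demo.
    return f"{a:<15}{b:<12}{c:<12}{d}".rstrip()

def column_spans(header):
    """Return (start, end) index pairs for each column of the heading line."""
    spans = [m.span() for m in re.finditer(r"\S+\s*", header)]
    # Let the last column run to the end of the line.
    spans[-1] = (spans[-1][0], None)
    return spans

def split_row(line, spans):
    """Slice one data line at the column boundaries and trim the padding."""
    return tuple(line[start:end].strip() for start, end in spans)

header = make_line("Interface", "Status", "Protocol", "Description")
spans = column_spans(header)

with_desc = split_row(
    make_line("Gi0/0/0/2", "admin-down", "admin-down",
              "Link to P2 configured by Netmiko"),
    spans,
)
without_desc = split_row(make_line("Lo55", "up", "up"), spans)
print(with_desc)
print(without_desc)
```

Unlike the snippet above, this sketch ends the last span with None and calls strip() on each field, so it does not depend on every line ending in a newline.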
CodePudding user response:
Here is the final solution I arrived at, kept as short as possible:
import re

for txt in list_txt:
    with open(txt, "r") as f:
        lines = f.readlines()
    with open(txt, "w") as fw:
        for line in lines:
            # Drop hyphen rules, blank lines, the date line and any line mentioning CPU
            if re.match(r"-{5}|\s+|([A-Za-z0-9]+( [A-Za-z0-9]+)+)", line) or re.search("CPU", line):
                pass
            elif "[K" in line:
                # Keep only what follows the last "[K" marker
                line2 = line.rindex("[K")
                line3 = line[line2 + 2:]
                fw.write(line3)
            else:
                fw.write(line)
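As a side note on why the lstrip attempt from the question could not work: str.lstrip treats its argument as a set of characters to remove from the start of the string, not as a prefix, and it returns a new string instead of modifying anything in place. The sketch below shows that behaviour next to a regex-based variant of the clean-up. The names PAGER_RESIDUE and clean_line and the interface name "Kr0" are made up for illustration, and the sample lines are a small assumed subset of the file.

```python
import re

# Matches everything up to and including the last "[K" on a line.
PAGER_RESIDUE = re.compile(r".*\[K")

def clean_line(line):
    """Return the cleaned line, or None if the line should be dropped."""
    line = PAGER_RESIDUE.sub("", line)
    if "CPU" in line or not line.strip() or line.startswith("-----"):
        return None
    return line

sample = [
    "--------------------\n",
    "[K --More-- [KLo69 admin-down admin-down\n",
    "Mg0/RP0/CPU0/0 down down\n",
    "[KRP/0/RP0/CPU0:ROUTER1#\n",
    "Kr0 up up\n",  # hypothetical interface name starting with "K"
]
cleaned = [c for c in map(clean_line, sample) if c is not None]
print(cleaned)

# lstrip removes any leading "[" or "K" characters, so it stops at the
# space before "--More--" and leaves the second "[K" in place ...
partly_stripped = "[K --More-- [KLo69 admin-down admin-down".lstrip("[K")
# ... and it can also eat the first letter of a name that starts with "K".
over_stripped = "[KKr0 up up".lstrip("[K")
```

The date-line filter from the script is left out here on purpose, since whether it is safe depends on how the columns are spaced in the real file.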