Problems modifying lines in a TXT file with regex

I'm having trouble extending a script that cleans the lines of a TXT file. Here is a sample of the file:

Fri Oct 14 22:27:49.100 EDT

Interface          Status      Protocol    Description
--------------------------------------------------------------------------------
Lo0                up          up          Loopback0 interface configured by Netmiko
Lo55               up          up          
Lo100              up          up          ***MERGE LOOPBACK 100****
Lo111              up          up          Configured by NETCONF
Nu0                up          up          
Mg0/RP0/CPU0/0     up          up          DO NOT TOUCH THIS !
Gi0/0/0/0          admin-down  admin-down  ANSIBLE NXOS TEST
Gi0/0/0/1          admin-down  admin-down  test
Gi0/0/0/1.100      admin-down  admin-down  
Gi0/0/0/2          admin-down  admin-down  Link to P2 configured by Netmiko
Gi0/0/0/3          up          up          Configured by Ansible !!!!!!!!
Gi0/0/0/4          up          up          Updated by Ansible using Jinja Template
Gi0/0/0/5          up          up          Configured by Ansible !!!!!!
Gi0/0/0/6          admin-down  admin-down  Updated by Ansible using Jinja Template
Gi0/0/0/6.11       admin-down  admin-down
Lo20               admin-down  admin-down  
Lo22               up          up          Loopback para pruebas
[K --More--           [KLo69               admin-down  admin-down  
Gi0/3/3/4          up          up          A SDH 
Gi0/3/3/4.852      up          up          TMU a Red BIT
[K --More--           [KGi0/3/3/4.853      up          up          Configured by Ansible !!!!!!
Gi0/3/4/2.256      up          up          Frontera Cliente A
Gi0/3/4/2.257      up          up          Frontera Cliente B
[K --More--           [KGi0/3/4/2.261      up          up          Frontera Cliente C
Te0/7/0/3          admin-down  admin-down  
Mg0/RP0/CPU0/0     down        down        
Mg0/RP1/CPU0/0     admin-down  admin-down  
[KRP/0/RP0/CPU0:ROUTER1#

and the script is as follows:

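import re

list_txt = [ruta/"prueba.txt"]  # ruta: base directory, defined elsewhere

for txt in list_txt:

  with open(txt, "r") as f:

    lines = f.readlines()

  with open(txt, "w") as fw:
    for line in lines:

      # Keep only lines that do not start with hyphens, whitespace or space-separated words
      if not re.match(r"-{5}|\s+|([A-Za-z0-9]+( [A-Za-z0-9]+)+)", line):
        fw.write(line)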

With this script I can delete the date line at the top, the blank lines, and the lines made up only of hyphens. The problem is that I am trying to add two things:

1 - Extend the regex so that any line containing the word "CPU" is deleted, that is, these lines:

Mg0/RP0/CPU0/0     down        down        
Mg0/RP1/CPU0/0     admin-down  admin-down  
[KRP/0/RP0/CPU0:ROUTER1#

2 - I also need to remove the strange prefix that shows up on some lines, such as:

[K --More--           [KLo69               admin-down  admin-down

and make it clean like this:

Lo69               admin-down  admin-down

I tried to do the latter with txt.lstrip("[K"), but it had no effect, so I must be using it incorrectly. I also can't work out how to add the word CPU to the regex; it's not clear to me how to build the pattern.

Ideally, I would like everything added to the existing script so as not to complicate things too much. Could you give me a hand, please?

CodePudding user response：

I don't think regular expressions really make a difference here, except maybe to parse the headings so you can know at which index the data for a given column will start:

import re

spans = []
result = []
with open(txt, "r") as f:  # txt: path of the input file, as in the question
    it = iter(f.readlines())
    # Skip lines until headings are found
    for line in it:
        if "  " in line:
            break
    # Get start/end indices of each column
    spans = [
         m.span()
         for m in re.finditer(r"\S+\s*", line)
    ]
    # Last column runs to the end of the line (-1 drops the newline)
    spans[-1] = (spans[-1][0], -1)
    next(it) # Skip line with just hyphens
    # Iterate the real data part of the input
    for line in it:
        if line.startswith("[K --More--"):
            # Get content part of this line (what follows the second "[K")
            line = line[line.index("[K", 10) + 2:]
        elif line.startswith("[K") or "/CPU" in line:
            continue  # Not interested
        result.append(tuple(line[start:end].rstrip() for start, end in spans))

# Output what was collected:
for record in result:
    print(record)

The output for the given sample file is:

('Lo0', 'up', 'up', 'Loopback0 interface configured by Netmiko')
('Lo55', 'up', 'up', '')
('Lo100', 'up', 'up', '***MERGE LOOPBACK 100****')
('Lo111', 'up', 'up', 'Configured by NETCONF')
('Nu0', 'up', 'up', '')
('Gi0/0/0/0', 'admin-down', 'admin-down', 'ANSIBLE NXOS TEST')
('Gi0/0/0/1', 'admin-down', 'admin-down', 'test')
('Gi0/0/0/1.100', 'admin-down', 'admin-down', '')
('Gi0/0/0/2', 'admin-down', 'admin-down', 'Link to P2 configured by Netmiko')
('Gi0/0/0/3', 'up', 'up', 'Configured by Ansible !!!!!!!!')
('Gi0/0/0/4', 'up', 'up', 'Updated by Ansible using Jinja Template')
('Gi0/0/0/5', 'up', 'up', 'Configured by Ansible !!!!!!')
('Gi0/0/0/6', 'admin-down', 'admin-down', 'Updated by Ansible using Jinja Template')
('Gi0/0/0/6.11', 'admin-down', 'admin-down', '')
('Lo20', 'admin-down', 'admin-down', '')
('Lo22', 'up', 'up', 'Loopback para pruebas')
('Lo69', 'admin-down', 'admin-down', '')
('Gi0/3/3/4', 'up', 'up', 'A SDH')
('Gi0/3/3/4.852', 'up', 'up', 'TMU a Red BIT')
('Gi0/3/3/4.853', 'up', 'up', 'Configured by Ansible !!!!!!')
('Gi0/3/4/2.256', 'up', 'up', 'Frontera Cliente A')
('Gi0/3/4/2.257', 'up', 'up', 'Frontera Cliente B')
('Gi0/3/4/2.261', 'up', 'up', 'Frontera Cliente C')
('Te0/7/0/3', 'admin-down', 'admin-down', '')

I might have missed some of your requirements, but I think with this approach it is easy to adapt.
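
To make the column-index idea easier to adapt, here is a small standalone sketch of just that step. The helper names column_spans and split_row are hypothetical, and the column widths (19, 12, 12) are only inferred from the sample file, so treat them as assumptions. Unlike the code above, it uses None as the end of the last column and relies on rstrip() to drop the newline.

```python
import re


def column_spans(header):
    """Return (start, end) slices for each column, taken from the heading line."""
    # Each match covers one heading plus the padding after it,
    # so its span is the slice occupied by that column.
    spans = [m.span() for m in re.finditer(r"\S+\s*", header)]
    # Let the last column run to the end of the line.
    spans[-1] = (spans[-1][0], None)
    return spans


def split_row(row, spans):
    """Cut a fixed-width row into fields using the heading spans."""
    return tuple(row[start:end].rstrip() for start, end in spans)


# Build sample lines with the widths assumed from the sample file.
header = "Interface".ljust(19) + "Status".ljust(12) + "Protocol".ljust(12) + "Description\n"
row = "Gi0/0/0/1".ljust(19) + "admin-down".ljust(12) + "admin-down".ljust(12) + "test\n"

spans = column_spans(header)
fields = split_row(row, spans)
print(spans)
print(fields)
```

Slicing by position rather than splitting on whitespace is what keeps descriptions that contain spaces, and empty descriptions, in the right column.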

CodePudding user response：

Here is the final solution I arrived at, kept as short as I could:

for txt in list_txt:

  with open(txt, "r") as f:

    lines = f.readlines()

  with open(txt, "w") as fw:
    for line in lines:

      # Drop hyphen rules, blank lines, the date line and anything mentioning CPU
      if re.match(r"-{5}|\s+|([A-Za-z0-9]+( [A-Za-z0-9]+)+)", line) or re.search("CPU", line):
        pass
      elif "[K" in line:
        # Keep only what follows the last "[K" marker
        line2 = line.rindex("[K")
        line3 = line[line2 + 2:]
        fw.write(line3)
      else:
        fw.write(line)
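
As a possible variation, the same filtering can be written as a function that takes and returns a list of lines, which makes it easy to check without touching the file. This is only a sketch: clean_lines, PAGER_RESIDUE and DROP are hypothetical names, and the residue pattern assumes the markers look exactly as in the sample (a literal "[K", optionally followed by "--More--" and padding). It also shows why lstrip("[K") is not the right tool: str.lstrip returns a new string instead of changing the original, and it strips any leading run of the characters "[" and "K" rather than the prefix "[K", so it would stop at the space before "--More--".

```python
import re

# Assumed shape of the pager residue at the start of a line: one or more
# "[K" markers, each optionally followed by "--More--" and padding.
PAGER_RESIDUE = re.compile(r"^(?:\[K\s*(?:--More--)?\s*)+")
# Lines to drop, as in the script above: hyphen rules, lines starting
# with whitespace, and lines starting with space-separated words (the date).
DROP = re.compile(r"-{5}|\s+|[A-Za-z0-9]+( [A-Za-z0-9]+)+")


def clean_lines(lines):
    """Return the lines to keep, with leading pager residue removed."""
    kept = []
    for line in lines:
        line = PAGER_RESIDUE.sub("", line)
        # Skip empty leftovers, unwanted line types and anything mentioning CPU.
        if not line.strip() or DROP.match(line) or "CPU" in line:
            continue
        kept.append(line)
    return kept


# lstrip strips a set of characters and stops at the first one not in the set.
print(repr("[K --More--   [KLo69".lstrip("[K")))
```

To apply it to the file, one option is to read the lines as before, call clean_lines(lines), and write the result back with fw.writelines(...).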