I'm pulling data from a network switch and it comes out as a string like this.
Gi1/0/1 COMPUTER1 Full 1000 Auto Down off A (1),5-7777
Gi1/0/2 COMPUTER2 Full 1000 Auto Down On T (1),5-7777
Gi1/0/3 COMPUTER3 Full 1000 Auto Up Off A (1),5-7777
Gi1/0/4 COMPUTER4 Full 1000 Auto Down Off A (1),5-7777
Gi1/0/5 COMPUTER5 Full 1000 Auto Down Off A 1
Gi1/0/6 COMPUTER6 Full 1000 Auto Up On T (1),5-7777
Gi1/0/7 COMPUTER7 N/A Unknown Auto Down Off A 1
Gi1/0/8 COMPUTER8 Full 1000 Auto Up Off A 1
Gi1/0/9 COMPUTER9 Full 1000 Auto Up On T (1),5-7777
Gi1/0/10 COMPUTER10 Full 1000 Auto Up On T (1),5-7777
Gi1/0/11 COMPUTER11 Full 1000 Auto Up On T (1),5-7777
Gi1/0/12 COMPUTER12 Full 1000 Auto Up On T (1),5-7777
Gi1/0/13 COMPUTER13 Full 1000 Auto Up On T (1),5-7777
Gi1/0/14 COMPUTER14 Full 1000 Auto Up On T (1),5-7777
Gi1/0/15 Server1 N/A Unknown Auto Down Off A 55
Gi1/0/16 Server2 N/A Unknown Auto Down Off
Gi1/0/17 Server3 N/A Unknown Auto Down Off
Gi1/0/18 Server4 N/A Unknown Auto Down Off
Gi1/0/19 Server5 Full 1000 Auto Up On T (1),5-7777
Gi1/0/20 Server6 Full 1000 Auto Up On T (1),5-7777
Gi1/0/21 Server7 Full 1000 Auto Up On T (1),5-7777
Gi1/0/22 Server8 Full 1000 Auto Up On A 3311
Gi1/0/23 COMPUTER15 Full 1000 Auto Up Off A 25
Gi1/0/24 COMPUTER16 Full 1000 Auto Up On A 99
Gi1/0/25 COMPUTER17 Full 1000 Auto Up On A 99
Gi1/0/26 Server9 Full 10 Auto Up On A 99
Gi1/0/27 COMPUTER18 Full 10 Auto Up On A 99
Gi1/0/28 N/A Unknown Auto Down Off A 1
Gi1/0/29 N/A Unknown Auto Down Off A 1
Gi1/0/30 N/A Unknown Auto Down Off A 1
Gi1/0/31 N/A Unknown Auto Down Off A 1
Gi1/0/32 N/A Unknown Auto Down Off A 1
Gi1/0/33 N/A Unknown Auto Down Off A 1
Gi1/0/34 N/A Unknown Auto Down Off A 1
Gi1/0/35 N/A Unknown Auto Down Off A 1
Gi1/0/36 N/A Unknown Auto Down Off A 1
Gi1/0/37 N/A Unknown Auto Down Off A 1
Gi1/0/38 N/A Unknown Auto Down Off A 1
Gi1/0/39 N/A Unknown Auto Down Off A 1
Gi1/0/40 N/A Unknown Auto Down Off A 1
Gi1/0/41 N/A Unknown Auto Down Off A 1
Gi1/0/42 N/A Unknown Auto Down Off A 1
Gi1/0/43 N/A Unknown Auto Down Off A 1
Gi1/0/44 N/A Unknown Auto Down Off A 1
Gi1/0/45 N/A Unknown Auto Down Off A 1
Gi1/0/46 N/A Unknown Auto Down Off A 1
Gi1/0/47 N/A Unknown Auto D-Down Off A 1
Gi1/0/48 N/A Unknown Auto Down Off A 1
I'm using TextFSM with Netmiko but I want to know how to format the data without using TextFSM.
I want to convert the data to where I can parse through it like this:
print(port[14]['Description'])
And I would get COMPUTER14
I'm thinking the structure needs to be like this:
{port: {14: {
'Interface': 'Gi1/0/14',
'Description': 'COMPUTER14',
'Duplex': 'Full',
'Speed': '1000',
'Neg': 'Auto',
'Linkstate': 'Up',
'Flowctrl': 'On',
'M': 'T'
'VLAN': ['(1)', '5-7777']
},
15: {
'Interface': 'Gi1/0/15',
'Description': 'SERVER1',
'Duplex': 'N/A',
'Speed': 'Unknown',
'Neg': 'Auto',
'Linkstate': 'Down',
'Flowctrl': 'off',
'M': 'A',
'VLAN': [55]
}
}
}
# VLAN would be a list and anything that doesn't have data would return 'None'
But not sure how to go about this using Python. The most I was able to do was convert to a list using splitlines()
.
Edit: Before I tried doing:
data_list = output.splitlines()
for data in data_list:
print(data.split(' '))
But the list from there came out like this:
['Gi1/0/1', '', '', 'COMPUTER1', 'Full', '', '', '1000', '', '', '', 'Auto', 'Down', '', '', '', '', 'off', '', '', '', 'A', '', '(1),5-7777']
From here I see I need to turn the list into a dictionary, but I don't know how to account for the spaces and also the spaces where there are no data, I would still want to show that there is none.
That data comes out like this:
['Gi1/0/28', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'N/A', '', '', '', 'Unknown', 'Auto', 'Down', '', '', 'Off', '', '', 'A', '', '1']
Going forward I tired:
keys = ["Interface", "Description", "Duplex", "Speed", "Neg", "Linkstate", "Flowctrl", "M", "VLAN"]
for data in data_list:
z = zip(keys, data.split(' '))
dictionary = dict(z)
print(dictionary)
Although this makes a dictionary, the keys don't match the values correctly because of the spaces.
{'Interface': 'Gi1/0/1', 'Description': '', 'Duplex': '', 'Speed': 'COMPUTER1', 'Neg': 'Full', 'Linkstate': '', 'Flowctrl': '', 'M': '1000', 'VLAN': ''}
{'Interface': 'Gi1/0/28', 'Description': '', 'Duplex': '', 'Speed': '', 'Neg': '', 'Linkstate': '', 'Flowctrl': '', 'M': '', 'VLAN': ''}
How do I account for the spaces, or am I going the wrong direction here?
CodePudding user response:
Going by the conversation on the chat, text parsing seems to be the only way forward. I copied the entire text and saved it on to a file because I assume you have the output of the command stored in a file. For the sake of length of answer though, I tried with running only a few ports instead of all 48. Also Note that this works only if every column has atleast one row with data. This fails if there is a column for which no port has any data
Instead of using readline, I used read()
so that I could split it at \n
. This essentially removed the \n
at the end of each entry in lines when using readlines()
with open('port_data.txt', 'r') as file:
contents = file.read()
lines = contents.split('\n')
noOfLine = len(lines)
We also maintain a list of column descriptions which I believe will not change in the future for the command in question.
columns = ['Interfaces', 'Description', 'Duplex', 'Speed', 'Neg', 'LinkState', 'Flow Control', 'M', 'VLAN']
colSize = len(columns)
What I do is keep a list for each column content so that it is easy to iterate over to get all the data.
# This is a list of lists.
# finalList[0] is a list of all ports
# finalList[1] is a list of all descriptions, etc.
# finalList will have 9 lists because we have 9 columns here.
finalList = []
# This is the output dictionary
output = {}
To construct the dictionary, we split each line in the lines[]
at a ' '
. This would ensure that the first index of the list generated by the split()
has the data you need. Once we obtain that data, we need to find out where the next column starts.
We know that the next column starts at some spaces after the longest entry in that column. For example, if the name in row1 is apple and name in row2 is photosynthesis, then we know that there is more space after the first word in the row1 to accommodate for the long word in the same column in row2. To find this size, we track the maximum length for the column in a variable. And we do this for every column.
for i in range(len(columns)): # Perform the loop for every column
resultList = [] # This is used to keep the list of every entry in the current column
maxSize = 0 # This is the max length of an entry in the current column. We compute this at run time.
for line in lines:
line = line.rstrip() # Remove any trailing spaces in the right end.
curString = line.split(' ')[0] # Split and get the first word
maxSize = max(maxSize, len(curString)) # Update the maxSize
resultList.append(curString) # Add this word to the current column list.
finalList.append(resultList) # Add the current column to the list of columns we already created.
Now that we have the length of the biggest word in the current column, we can be sure that for every row, the next column appears at some point after maxSize
to accommodate the spacing. For a line with a non-empty next column, the next word could start with in a few spaces. However, for a line with an empty entry for the next column, there would be more number of spaces.
To accommodate for this, we find the row with the least number of spaces after current maxSize and trim all the strings in the lines list to start from that location. Inside the same main for loop
newLineList = [] # This is going to be the list of lines after we have removed the first entry
minLength = 10000 # Arbitrarily large number
if i != colSize-1: # We don't have to do this for the last column because we won't be processing it anymore.
for line in lines: # Each line
minLength = min(minLength, len(line[maxSize:])-len(line[maxSize:].lstrip())) # len(' Apple') - len('Apple') = 4 meaning there are 4 spaces from the current position of line[maxSize]
# Now that we have the minimum length, we can say for sure that the next column starts at maxSize minLength for every row.
for line in lines:
newLineList.append(line[maxSize minLength:]) # temporary holder for recomputed lines.
lines = [] # Set it to empty
for line in newLineList:
lines.append(line) # Append all the new lines to the original lines set for next iteration.
Now we should have a list like this:
>>> for x in finalList:
>>> ... print(x)
['Gi1/0/1', 'Gi1/0/2', 'Gi1/0/3', 'Gi1/0/4', 'Gi1/0/47', 'Gi1/0/48']
['COMPUTER1', 'COMPUTER2', 'COMPUTER3', 'COMPUTER4', '', '']
['Full', 'Full', 'Full', 'Full', 'N/A', 'N/A']
['1000', '1000', '1000', '1000', 'Unknown', 'Unknown']
['Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto']
['Down', 'Down', 'Up', 'Down', 'D-Down', 'Down']
['off', 'On', 'Off', 'Off', 'Off', 'Off']
['A', 'T', 'A', 'A', 'A', 'A']
['(1),5-7777', '(1),5-7777', '(1),5-7777', '(1),5-7777', '1', '1']
Now that we have each column in a list by itself, constructing the dictionary isn't that difficult.
for i in range(noOfLine): # For each row
result = {} # This is a new entry in the final JSON of ports.
for j in range(colSize): # For each column
if(columns[j] == 'VLAN'): # For VLAN, we need to perform a split to get list output
result[columns[j]] = finalList[j][i].split(',') # If the VLAN is a single entry, we would just get that entry in a list.
else:
result[columns[j]] = finalList[j][i].strip() # Removes any excessive spaces.
name = finalList[0][i] # Get the port name like Gi1/0/1 or Gi1/0/2
output[name.split('/')[-1]] = result # Compute the actual port number. Split at '/' gives ['Gi1','0','1'] from which we take a last entry as the port number.
Full code put together:
import pprint
with open('Data_to_Csv.txt', 'r') as file:
contents = file.read()
lines = contents.split('\n')
noOfLine = len(lines)
columns = ['Interfaces', 'Description', 'Duplex', 'Speed', 'Neg', 'LinkState', 'Flow Control', 'M', 'VLAN']
colSize = len(columns)
finalList = []
output = {}
for i in range(len(columns)):
resultList = []
newLineList = []
maxSize = 0
for line in lines:
line = line.rstrip()
curString = line.split(' ')[0]
maxSize = max(maxSize, len(curString))
resultList.append(curString)
finalList.append(resultList)
minLength = 10000
if i != colSize-1:
for line in lines:
minLength = min(minLength, len(line[maxSize:])-len(line[maxSize:].lstrip()))
for line in lines:
newLineList.append(line[maxSize minLength:])
lines = []
for line in newLineList:
lines.append(line)
for x in finalList:
print(x)
for i in range(noOfLine):
result = {}
for j in range(colSize):
# print("J = {}".format(j))
if(columns[j] == 'VLAN'):
result[columns[j]] = finalList[j][i].split(',')
else:
result[columns[j]] = finalList[j][i].strip()
# print("Columns = {}".format(columns[j]))
# print("result = {}".format(result[columns[j]]))
name = finalList[0][i]
output[name.split('/')[-1]] = result
pprint.pprint(output)
Output of the above code:
{'1': {'Description': 'COMPUTER1',
'Duplex': 'Full',
'Flow Control': 'off',
'Interfaces': 'Gi1/0/1',
'LinkState': 'Down',
'M': 'A',
'Neg': 'Auto',
'Speed': '1000',
'VLAN': ['(1)', '5-7777']},
'2': {'Description': 'COMPUTER2',
'Duplex': 'Full',
'Flow Control': 'On',
'Interfaces': 'Gi1/0/2',
'LinkState': 'Down',
'M': 'T',
'Neg': 'Auto',
'Speed': '1000',
'VLAN': ['(1)', '5-7777']},
'3': {'Description': 'COMPUTER3',
'Duplex': 'Full',
'Flow Control': 'Off',
'Interfaces': 'Gi1/0/3',
'LinkState': 'Up',
'M': 'A',
'Neg': 'Auto',
'Speed': '1000',
'VLAN': ['(1)', '5-7777']},
'4': {'Description': 'COMPUTER4',
'Duplex': 'Full',
'Flow Control': 'Off',
'Interfaces': 'Gi1/0/4',
'LinkState': 'Down',
'M': 'A',
'Neg': 'Auto',
'Speed': '1000',
'VLAN': ['(1)', '5-7777']},
'47': {'Description': '',
'Duplex': 'N/A',
'Flow Control': 'Off',
'Interfaces': 'Gi1/0/47',
'LinkState': 'D-Down',
'M': 'A',
'Neg': 'Auto',
'Speed': 'Unknown',
'VLAN': ['1']},
'48': {'Description': '',
'Duplex': 'N/A',
'Flow Control': 'Off',
'Interfaces': 'Gi1/0/48',
'LinkState': 'Down',
'M': 'A',
'Neg': 'Auto',
'Speed': 'Unknown',
'VLAN': ['1']}}