Basically, my sample dataset looks like the following:
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
I am trying to group every 3 lines together, because each group of 3 lines describes one device (i.e. the PC's IP is 10.0.0.1 and it is in Denver, the next device is a thin client in Seattle, and so on).
My goal is to have Python put each of these 3 values into their own variables, so that later on in the script (omitted from my code below for brevity), I can pass them to a CSV.
My problem is that one of the variables is losing its value in the `if i == 3:` block.
Take a look at the following code to see an example of the issue:
output = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""
i = 0
output = output.split("\n")
for line in output:
    if "PC" in line:
        machine = line[-3:-1]
    if "Thin Client" in line:
        machine = line[-11:-1]
    if "IP Address" in line:
        address = line[-8:-1]
    if "Location" in line:
        location = line[-12:-1]
        print(location)
    i = i + 1
    if i == 3:
        print(machine, address, location)
Output:
Traceback (most recent call last):
  File "pytest.py", line 37, in <module>
    print(machine, address, location)
NameError: name 'location' is not defined
Python is telling me that the `location` variable isn't defined, but I don't see how that's true, because it was defined on the third iteration of the for loop before it was passed to the print statement.
I can actually prove it's being defined with the following code:
output = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""
output = output.split("\n")
for line in output:
    if "PC" in line:
        machine = line[-3:-1]
    if "Thin Client" in line:
        machine = line[-11:-1]
    if "IP Address" in line:
        address = line[-8:-1]
    if "Location" in line:
        location = line[-12:-1]
        print(location)
Output:
Denver,US
Seattle, US
I am really lost here. Does anyone have any ideas as to what might be causing this? Or have any suggestions on how to do this in a better way?
CodePudding user response:
Counting lines is the wrong answer. If you want an output after "Location", then do the output when you get "Location". Also, remember that when you say
output = """
xxx
"""
That string has a newline at the beginning and an extra newline at the end. To eliminate those, put a backslash right after the opening quotes and put the closing quotes directly after the last line:
output = """\
xxx"""
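As a quick check of the point above, here is a minimal sketch (the names `padded` and `trimmed` are made up for illustration) showing what the two literals split into. In the question's loop, the empty first element uses up one count, so `i` reaches 3 on the "IP Address" line, before `location` has been assigned.

```python
# A triple-quoted literal that opens with a line break starts with "\n".
padded = """
xxx
"""
print(padded.split("\n"))    # ['', 'xxx', '']
print(padded.splitlines())   # ['', 'xxx']

# A backslash right after the opening quotes suppresses the first newline,
# and closing the quotes on the last line avoids the trailing one.
trimmed = """\
xxx"""
print(trimmed.split("\n"))   # ['xxx']
```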
It's better to write the loop so that it can ignore blank lines, as the following does:
output = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""
for line in output.splitlines():
    parts = line.split(': ')
    if parts[0] == 'Machine Type':
        machine = parts[1]
    elif parts[0] == "IP Address":
        address = parts[1]
    elif parts[0] == "Location":
        location = parts[1]
        print(machine, address, location)
Output:
PC, 10.0.0.1, Denver,USA
Thin Client, 10.0.0.2, Seattle, USA
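Since the question mentions passing the values on to a CSV later, the same "act when you see Location" idea could be wrapped in a generator that hands back one record at a time. This is only a sketch: `parse_records` and `sample` are hypothetical names, and it assumes every record ends with a "Location" line and that trailing commas should be dropped.

```python
def parse_records(text):
    """Yield (machine, address, location) each time a Location line is seen."""
    machine = address = None
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # blank line or a line without "key: value"
        key = key.strip()
        value = value.strip().rstrip(",")  # drop the trailing comma, if any
        if key == "Machine Type":
            machine = value
        elif key == "IP Address":
            address = value
        elif key == "Location":
            yield machine, address, value


sample = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""

records = list(parse_records(sample))
print(records)
# [('PC', '10.0.0.1', 'Denver,USA'), ('Thin Client', '10.0.0.2', 'Seattle, USA')]
```

One caveat with this design: if a record is missing its "Machine Type" or "IP Address" line, the value from the previous record is reused, so real input may need an explicit reset or validation step.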
CodePudding user response:
import re
output = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""
ids = ['Machine Type', 'IP Address', 'Location']
values = []
for id in ids:
    values.append([x.split(': ')[1] for x in re.findall(f"({id}:.*)\n", output)])
for value in zip(*values):
    print(*value)
Output:
PC, 10.0.0.1, Denver,USA
Thin Client, 10.0.0.2, Seattle, USA
CodePudding user response:
The code you've written is very vulnerable to breakage if the formatting isn't exactly as expected: a blank line, stray whitespace, etc. can easily throw it off. Since you're using the line number (`i`) to decide whether or not `location` has been defined, a blank line will mess you up, as will reordering the data, adding another line, etc.
Here's an approach that aims to just turn your format (as far as I can infer it) into a generic list of dictionaries rather than assigning individual pieces of data to named variables:
output = """
Machine Type: PC,
IP Address: 10.0.0.1,
Location: Denver,USA
Machine Type: Thin Client,
IP Address: 10.0.0.2,
Location: Seattle, USA
"""
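data = [{}]
for line in output.split("\n"):
    if ":" not in line:
        continue  # skip blank lines and anything that isn't "key: value"
    line = line.strip().rstrip(",")
    k, v = line.split(":")
    # use the longest word of the label as the key, e.g. "Machine Type" -> "machine"
    k = max(k.lower().strip().split(), key=len)
    if k in data[-1]:
        # a repeated key means a new record has started
        data.append({})
    data[-1][k] = v.strip()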
This code isn't aware of any of the particular variables, and doesn't know (or care) what order they're in, but it produces your original three variables as entries in a dictionary that you can access any way you like:
print(data)
# [{'machine': 'PC', 'address': '10.0.0.1', 'location': 'Denver,USA'}, {'machine': 'Thin Client', 'address': '10.0.0.2', 'location': 'Seattle, USA'}]
for d in data:
    print(list(d.values()))
# ['PC', '10.0.0.1', 'Denver,USA']
# ['Thin Client', '10.0.0.2', 'Seattle, USA']
print([d["machine"] for d in data])
# ['PC', 'Thin Client']
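A list of dictionaries like this also maps fairly directly onto the standard library's csv.DictWriter, which is one way to reach the CSV goal from the question. The sketch below repeats the `data` list so that it runs on its own, and writes to an in-memory buffer; the file name `machines.csv` in the comment is only a placeholder, not something from the question.

```python
import csv
import io

# Same shape as the `data` list built above, repeated so this runs standalone.
data = [
    {"machine": "PC", "address": "10.0.0.1", "location": "Denver,USA"},
    {"machine": "Thin Client", "address": "10.0.0.2", "location": "Seattle, USA"},
]

# Stand-in for: open("machines.csv", "w", newline="")  (hypothetical file name)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["machine", "address", "location"])
writer.writeheader()
writer.writerows(data)

csv_text = buffer.getvalue()
print(csv_text)
```

Because the location values contain commas, the writer quotes those fields, so "Denver,USA" stays in a single column when the file is read back.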