I have created a CSV file with 5 columns:
Machines | VM | Status | Node | Resolve
I want to take all values under Node and Resolve, find the unique values, and then remove certain entries (there are some "none" and "record" values that I don't need).
What is the best way to do this?
I tried taking one column at a time and putting the values into sets, which worked, but is there a quicker way? I then tried to remove the values I didn't need from the set, but realised that some values end with \n.
Usually I use Pandas, which I love to use, but I am unable to use it on the machine I am working on at the moment.
unique3 = []
with open("machines.csv", "r") as file:
    mach = file.readlines()
    for c in mach:
        # fourth column (Node)
        split_lines = c.split(",")[3]
        unique3.append(split_lines)
unique4 = []
with open("machines.csv", "r") as file2:
    mach2 = file2.readlines()
    for c in mach2:
        # fifth column (Resolve); as the last field it keeps the trailing "\n"
        split_lines2 = c.split(",")[4]
        unique4.append(split_lines2)
uniqueunique = set(unique4 + unique3)
Any help is greatly appreciated. I know this is probably straightforward, but I struggle with lists and strings.
CodePudding user response:
Something like this:
import csv
with open("machines.csv", "r", newline="") as f:
    rdr = csv.reader(f)
    next(rdr)  # skip header if any, otherwise - remove this line
    # transpose rows into columns and keep only the last two
    *_, node, resolve = zip(*rdr)
unique = set(node).union(set(resolve))
print(unique)
Then you can remove the unwanted values from the set.
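For the removal step, here is a minimal sketch using only the standard library. The function name `unique_values`, the `UNWANTED` set and the case-insensitive comparison are my own assumptions, not something from the question, so adjust them to your data. It also strips surrounding whitespace from each value, in case any stray spaces or newlines survive parsing.

```python
import csv

# Assumption: these are the exact entries to drop (compared case-insensitively).
UNWANTED = {"none", "record", ""}


def unique_values(path, columns=(3, 4), unwanted=UNWANTED):
    """Return the unique, stripped values found in the given column indexes."""
    found = set()
    with open(path, "r", newline="") as f:
        rdr = csv.reader(f)
        next(rdr, None)  # skip the header row; remove this line if there is none
        for row in rdr:
            for i in columns:
                if i < len(row):  # tolerate short or blank rows
                    found.add(row[i].strip())
    # keep only the values that are not in the unwanted set
    return {v for v in found if v.lower() not in unwanted}
```

Calling `unique_values("machines.csv")` reads the file once and returns a set covering both the Node and Resolve columns.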