Python: Add unique values from a CSV column to list

Time:01-02

I'm going through a CSV that lists cargo movements between various ports, and I'd like to collect all the unique port values into a new list.

Currently I have the code below, which adds every value under the 'Origin Ports' column. How can I make sure it adds only the unique values from that column? Thank you.

import csv

CSV_FILE = "Bitumen2021Exports.csv"

ports = []
  
with open(CSV_FILE, encoding="utf-8-sig") as bitumen_csv:
    bitumen_reader = csv.DictReader(bitumen_csv)
    for port in bitumen_reader:
        ports.append(port['ORIGIN PORTS'])

print(ports)

The data in the CSV looks like this: [image]

CodePudding user response:

One way based on your code:

import csv

CSV_FILE = "Bitumen2021Exports.csv"

ports = []
  
with open(CSV_FILE, encoding="utf-8-sig") as bitumen_csv:
    bitumen_reader = csv.DictReader(bitumen_csv)
    for port in bitumen_reader:
        if port['ORIGIN PORTS'] not in ports:
            ports.append(port['ORIGIN PORTS'])

print(ports)

Another way is to read the CSV into a pandas DataFrame and call .unique() on the column.
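A minimal sketch of the pandas approach, assuming pandas is installed. The sample data, the 'DESTINATION PORTS' column, and the helper name unique_ports are made up for illustration; with the real file you would pass the path "Bitumen2021Exports.csv" (and likely encoding="utf-8-sig") to read_csv instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample standing in for Bitumen2021Exports.csv.
SAMPLE_CSV = """ORIGIN PORTS,DESTINATION PORTS
Rotterdam,Lagos
Antwerp,Durban
Rotterdam,Mombasa
Houston,Lagos
Antwerp,Lagos
"""


def unique_ports(csv_source, column="ORIGIN PORTS"):
    """Return the distinct values of `column` as a plain list."""
    # usecols limits parsing to the one column we care about.
    df = pd.read_csv(csv_source, usecols=[column])
    # Series.unique() returns values in order of first appearance.
    return df[column].unique().tolist()


ports = unique_ports(io.StringIO(SAMPLE_CSV))
print(ports)  # ['Rotterdam', 'Antwerp', 'Houston']
```

This is probably only worth it if pandas is already a dependency or you plan to do more analysis on the data; for a single column, the csv module is enough.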

CodePudding user response:

You can also skip handling the "uniqueness logic" and use Python's set, which only allows unique elements:

import csv

CSV_FILE = "Bitumen2021Exports.csv"

ports = set()
  
with open(CSV_FILE, encoding="utf-8-sig") as bitumen_csv:
    bitumen_reader = csv.DictReader(bitumen_csv)
    for port in bitumen_reader:
        ports.add(port['ORIGIN PORTS'])

print(ports)

ports is a set, which is iterable just like a list; if you need an actual list, convert it with list(ports).
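One thing to keep in mind is that a set does not keep the order in which the ports appeared in the file. If that order matters, a possible variation is dict.fromkeys, since dict keys are unique and keep insertion order (Python 3.7+); if you just want a stable order, sorted() works on the set. The sketch below uses made-up sample rows and a hypothetical helper name, unique_in_order, rather than the real file.

```python
import csv
import io

# Hypothetical sample standing in for Bitumen2021Exports.csv.
SAMPLE_CSV = """ORIGIN PORTS,DESTINATION PORTS
Rotterdam,Lagos
Antwerp,Durban
Rotterdam,Mombasa
Houston,Lagos
Antwerp,Lagos
"""


def unique_in_order(rows, column="ORIGIN PORTS"):
    """Return distinct values of `column`, in order of first appearance."""
    # dict.fromkeys drops duplicate keys but keeps the first-seen order.
    return list(dict.fromkeys(row[column] for row in rows))


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
ports_in_order = unique_in_order(reader)

# Alternative: alphabetical order from a set.
ports_sorted = sorted(set(ports_in_order))

print(ports_in_order)  # ['Rotterdam', 'Antwerp', 'Houston']
print(ports_sorted)    # ['Antwerp', 'Houston', 'Rotterdam']
```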
