I have a CSV file with many columns and rows, and I want to build a dictionary that counts how often each value occurs in one specific column (index 12, as shown in the code), which contains duplicates. I have managed to do this, but I cannot figure out how to sort the result. I tried sorting both before and after creating the dictionary. I use Python through VS Code.
This is my code:
import csv
with open("FILENAME", newline="", encoding="iso-8859-1") as csvfile:
    reader = csv.reader(csvfile, delimiter=";")
    next(reader)  # skip the first row
    species_count = {}
    for row in reader:
        species = row[12]
        species_count[species] = species_count.get(species, 0) + 1
    for num in species_count:
        print(f"{num}: {species_count[num]}")
Example of current result:
A: 5922
C: 5837
D: 6136
B: 12
E: 1
etc.
Is there an easy way of doing this? Any help is appreciated! (I am a beginner)
Edit: I want to sort it alphabetically, so:
A: 5922
B: 12
C: 5837
etc
CodePudding user response:
You can build a new dictionary from the sorted items:
sortedDict = dict(sorted(species_count.items(), key = lambda x: x[0]))
print(sortedDict)
Result:
{'A': 5922, 'B': 12, 'C': 5837, 'D': 6136, 'E': 1}
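This works because a dict keeps its keys in insertion order (guaranteed by the language since Python 3.7), so a new dict built from the sorted pairs iterates alphabetically. A minimal sketch, using the sample counts from the question as stand-in data:

```python
# Stand-in data: the sample counts from the question, in their unsorted order.
species_count = {"A": 5922, "C": 5837, "D": 6136, "B": 12, "E": 1}

# sorted() returns a list of (key, value) pairs ordered by key;
# dict() then inserts them in that order.
sorted_dict = dict(sorted(species_count.items()))

print(list(species_count))  # original insertion order
print(list(sorted_dict))    # alphabetical order
```

The two dictionaries still compare equal, since dict equality ignores order; only the iteration order differs.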
CodePudding user response:
Sorting the elements in the dict
Your code is pretty good; you just need to replace the final display loop, for num in species_count:, with a loop over a sorted version of the contents of species_count.
Conveniently, the built-in function sorted returns a sorted list of the elements of any collection. You can sort the keys of the dictionary, or you can sort the (key, value) pairs directly.
# version 1
for num in sorted(species_count):
    print(f"{num}: {species_count[num]}")
# version 2
for s, c in sorted(species_count.items()):
    print(f'{s}: {c}')
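I prefer version 2, although it's a matter of taste. They're almost (but not quite) equivalent: version 2 sorts (key, value) tuples rather than bare keys, but because dictionary keys are unique the values are never needed to break a tie, so both versions print the same lines.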
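To check that both versions give the same output, here is a small sketch that collects the lines instead of printing them directly. The data is the sample from the question, and the names lines_v1 and lines_v2 are made up for this illustration:

```python
# Stand-in data: the sample counts from the question.
species_count = {"A": 5922, "C": 5837, "D": 6136, "B": 12, "E": 1}

# Version 1: sort the keys, then look up each count.
lines_v1 = [f"{name}: {species_count[name]}" for name in sorted(species_count)]

# Version 2: sort the (key, value) pairs directly.
lines_v2 = [f"{name}: {count}" for name, count in sorted(species_count.items())]

print(lines_v1 == lines_v2)
print("\n".join(lines_v2))
```

Note that sorted compares strings by code point, so uppercase names sort before lowercase ones. If the real column mixes cases, passing key=str.casefold to sorted in version 1 is one way to get a case-insensitive order.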
Additional comments on your code
Using collections.Counter
Your use of d[k] = d.get(k, 0) + 1 in a loop to build a dictionary of counts works very well. However, since this is a classic and widely useful pattern, Python provides a subclass of dict that handles all that logic for you. The subclass is called Counter and is found in the collections module. Using that class, the dictionary of counts can be built in one simple line of code:
from collections import Counter
species_count = Counter(row[12] for row in reader)
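As a quick sanity check, the sketch below builds the counts both ways from a short made-up list of values (standing in for column 12 of the real file) and confirms that they agree:

```python
from collections import Counter

# Made-up values standing in for the species column of the CSV file.
species_column = ["A", "C", "A", "B", "C", "A"]

# Manual counting, as in the question.
manual = {}
for species in species_column:
    manual[species] = manual.get(species, 0) + 1

# The same counts with Counter.
counted = Counter(species_column)

print(dict(counted) == manual)
print(counted["Z"])  # a missing key gives 0 instead of raising KeyError
```

Because Counter is a dict subclass, everything that works on the original dictionary, including sorted(species_count.items()), works on it unchanged.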
Opening and closing files
The great advantage of using a with block like you did is that it takes care of closing the file for you. The file is closed when the with block ends. In my opinion, you should end the with block as soon as you have finished reading from the file. So, the final display loop, for num in sorted(species_count):, should be outside the with block.
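The sketch below illustrates this on a tiny temporary file, created only so that the example runs on its own. The file contents and the column index 1 are assumptions for the demonstration, not the layout of the real file:

```python
import csv
import os
import tempfile
from collections import Counter

# Write a small made-up semicolon-delimited file to read back.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="iso-8859-1"
) as tmp:
    tmp.write("id;species\n1;C\n2;A\n3;C\n")
    path = tmp.name

with open(path, newline="", encoding="iso-8859-1") as csvfile:
    reader = csv.reader(csvfile, delimiter=";")
    next(reader)  # skip the header row of the made-up file
    # Counter consumes the reader right away, while the file is still open.
    species_count = Counter(row[1] for row in reader)

# The with block has ended, so the file is already closed here.
print(csvfile.closed)
for species, count in sorted(species_count.items()):
    print(f"{species}: {count}")

os.remove(path)  # clean up the temporary file
```

The counting has to happen inside the with block because the reader pulls rows from the open file; the display loop only needs the finished counts, so it can safely run afterwards.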
Final code
import csv
from collections import Counter
with open("FILENAME", newline="", encoding="iso-8859-1") as csvfile:
    reader = csv.reader(csvfile, delimiter=";")
    next(reader)  # skip the first row
    species_count = Counter(row[12] for row in reader)
for s, c in sorted(species_count.items()):
    print(f'{s}: {c}')