I have a CSV file with the columns name, city, height. Now I need to return all the heights for each city. So I have created a variable holding all unique cities:
uniqueCity = []
for i in city:
    if i not in uniqueCity:
        uniqueCity.append(i)
I am able to print all heights corresponding to each city, but I can't seem to sort them by height within each city:
def printCity(city):
    for i in uniqueCity:
        print(i)
        for j in range(len(city)):
            if i == city[j]:
                print(name[j], height[j])
What am I missing?
I am not allowed to use any third party libraries.
Full code:
import csv

with open('heightData.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    next(csvreader)
    name = []
    city = []
    height = []
    for row in csvreader:
        name.append(row[0])
        city.append(row[1])
        height.append(int(row[2]))

city.sort()

uniqueCity = []
for i in city:
    if i not in uniqueCity:
        uniqueCity.append(i)

def printCity(city):
    for i in uniqueCity:
        print(i)
        for j in range(len(city)):
            if i == city[j]:
                print(name[j], height[j])

printCity(city)
Sample data:
name,city,height
Mariam Cox,St_Paul,67
Daniel Ashley,St_Paul,65
Oliver Clay,Minneapolis,75
Rae Finley,Minneapolis,81
Brady Joyce,Virginia,68
Harding Jones,Virginia,80
Expected output:
Minneapolis:
Oliver Clay 75
Rae Finley 81
St_Paul:
Daniel Ashley 65
Mariam Cox 67
Virginia:
Brady Joyce 68
Harding Jones 80
CodePudding user response:
The problem is that once you separate the data into one list per column, nothing connects the values that came from the same row. When you then call city.sort(), the other columns are not reordered along with it, so the city column ends up out of order with respect to the others.
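To make the misalignment concrete, here is a minimal sketch using three rows from the question's sample data typed in by hand (the variable names `before` and `after` are only for illustration):

```python
# Parallel lists, as in the question: index j is meant to describe one person.
name = ["Mariam Cox", "Oliver Clay", "Brady Joyce"]
city = ["St_Paul", "Minneapolis", "Virginia"]

before = list(zip(name, city))  # pairs while the lists are still aligned
city.sort()                     # reorders only the city list
after = list(zip(name, city))   # the same indexes now pair the wrong values

print(before)
print(after)
```

After the sort, Mariam Cox is paired with Minneapolis even though her row in the file says St_Paul.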
Instead, you could put each row into a tuple and add all the tuples to a list. Then sort() that list, using the key argument to select any column (in this case, the [2] item of each row, to sort by height):
import csv

with open('heightData.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    next(csvreader)  # skip the header row
    csvdata = []
    for row in csvreader:
        row[2] = int(row[2])  # convert height so it sorts numerically
        csvdata.append(tuple(row))

csvdata.sort(key=lambda row: row[2])  # sort all rows by height
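Which gives, for the sample data in the question:
csvdata = [('Daniel Ashley', 'St_Paul', 65),
           ('Mariam Cox', 'St_Paul', 67),
           ('Brady Joyce', 'Virginia', 68),
           ('Oliver Clay', 'Minneapolis', 75),
           ('Harding Jones', 'Virginia', 80),
           ('Rae Finley', 'Minneapolis', 81)]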
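As a side note, the key function can also return a tuple, which sorts by several columns at once. The following is a small sketch, not part of the original answer, that orders the rows by city and then by height in a single pass; the rows are copied by hand from the question's sample data and `ordered` is an illustrative name:

```python
# Rows copied from the question's sample data as (name, city, height) tuples.
csvdata = [
    ("Mariam Cox", "St_Paul", 67),
    ("Daniel Ashley", "St_Paul", 65),
    ("Oliver Clay", "Minneapolis", 75),
    ("Rae Finley", "Minneapolis", 81),
    ("Brady Joyce", "Virginia", 68),
    ("Harding Jones", "Virginia", 80),
]

# A tuple key compares the city first and uses the height to break ties.
ordered = sorted(csvdata, key=lambda row: (row[1], row[2]))

for row in ordered:
    print(row)
```

This leaves the rows of each city next to each other and already in height order, which may be enough if you only need to print them.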
From your edit, I see that you want to first group your data by city and then print the names of people, sorted by their heights. You have two options to group your data:
- Sort by city and then use itertools.groupby() from Python's standard library:
import itertools
csvdata.sort(key=lambda row: row[1]) # Sort by city
grouped_rows = {k: list(v) for k, v in itertools.groupby(csvdata, key=lambda row: row[1])} # Group by city
- Create a dictionary where the keys are cities and the values are lists of rows belonging to that city.
import collections
grouped_rows = collections.defaultdict(list)
for row in csvdata:
    city = row[1]
    grouped_rows[city].append(row)
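On the first option, the sort matters because itertools.groupby() only merges rows that are adjacent. A minimal sketch of what happens without it, using three hand-typed rows (`by_city`, `unsorted_groups` and `sorted_groups` are illustrative names):

```python
from itertools import groupby

rows = [
    ("Brady Joyce", "Virginia", 68),
    ("Oliver Clay", "Minneapolis", 75),
    ("Harding Jones", "Virginia", 80),
]

def by_city(row):
    return row[1]

# Without sorting, "Virginia" appears as two separate groups, and the
# second one overwrites the first when collected into a dict.
unsorted_groups = {k: list(v) for k, v in groupby(rows, key=by_city)}

# Sorting by the same key first makes equal cities adjacent.
sorted_groups = {k: list(v) for k, v in groupby(sorted(rows, key=by_city), key=by_city)}

print(len(unsorted_groups["Virginia"]))  # only the last Virginia row survives
print(len(sorted_groups["Virginia"]))    # both Virginia rows are kept
```

The defaultdict option does not have this requirement, since it appends to the right list regardless of row order.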
Then, you can iterate over either of these grouped_rows objects, sort the lists within on the [2] item, and print them:
for city in sorted(grouped_rows.keys()):
    city_rows = sorted(grouped_rows[city], key=lambda row: row[2])
    print(city)
    for row in city_rows:
        print("\t", row[0], row[2])
For the sample data in the question, this prints:
Minneapolis
	 Oliver Clay 75
	 Rae Finley 81
St_Paul
	 Daniel Ashley 65
	 Mariam Cox 67
Virginia
	 Brady Joyce 68
	 Harding Jones 80
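Putting the pieces together, here is one possible end-to-end sketch that uses only the standard library. It is not the only way to structure this: the function names `heights_by_city` and `format_report` are made up for illustration, and the inline `SAMPLE` string stands in for heightData.csv so the example runs without the file.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the question's sample data, standing in for heightData.csv.
SAMPLE = """name,city,height
Mariam Cox,St_Paul,67
Daniel Ashley,St_Paul,65
Oliver Clay,Minneapolis,75
Rae Finley,Minneapolis,81
Brady Joyce,Virginia,68
Harding Jones,Virginia,80
"""

def heights_by_city(csvfile):
    """Return {city: [(name, height), ...]}, each list sorted by height."""
    reader = csv.reader(csvfile)
    next(reader)  # skip the header row
    grouped = defaultdict(list)
    for name, city, height in reader:
        grouped[city].append((name, int(height)))
    for people in grouped.values():
        people.sort(key=lambda person: person[1])
    return dict(grouped)

def format_report(grouped):
    """Build the text layout shown in the question's expected output."""
    lines = []
    for city in sorted(grouped):
        lines.append(city + ":")
        for name, height in grouped[city]:
            lines.append(f"{name} {height}")
    return "\n".join(lines)

report = format_report(heights_by_city(io.StringIO(SAMPLE)))
print(report)
```

To read the real file instead, pass the object returned by open('heightData.csv', newline='') in place of the io.StringIO wrapper.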
CodePudding user response:
For the assignment, it had to be a function, and this seems to work for me:
import collections

# Build a dict mapping each city to a sorted list of its heights.
# Relies on the module-level height list still being aligned with city.
def heightTuple(city):
    cityHeight = collections.defaultdict(list)
    for i in range(len(city)):
        cityHeight[city[i]].append(height[i])
    for i in cityHeight:
        cityHeight[i].sort()
    print(cityHeight)
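The function above keeps only the heights, so the names from the expected output are lost. A possible variant, sketched here under the assumption that the three lists are still in file order (that is, city.sort() has not been called on its own), zips the lists and stores (height, name) pairs; `people_by_city` is a hypothetical name:

```python
from collections import defaultdict

def people_by_city(name, city, height):
    """Return {city: [(height, name), ...]}, each list sorted by height."""
    grouped = defaultdict(list)
    # zip walks the three lists in step, so each triple is one CSV row.
    for n, c, h in zip(name, city, height):
        grouped[c].append((h, n))
    for c in grouped:
        grouped[c].sort()  # tuples sort by height first, then by name
    return dict(grouped)

# Sample lists in file order, copied from the question's data.
name = ["Mariam Cox", "Daniel Ashley", "Oliver Clay",
        "Rae Finley", "Brady Joyce", "Harding Jones"]
city = ["St_Paul", "St_Paul", "Minneapolis",
        "Minneapolis", "Virginia", "Virginia"]
height = [67, 65, 75, 81, 68, 80]

result = people_by_city(name, city, height)
for c in sorted(result):
    print(c + ":")
    for h, n in result[c]:
        print(n, h)
```

Passing the lists as arguments instead of reading globals also makes the function easier to test.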