I am trying to sort a nested list. I have an input file like this:
mandana,5,7,3,15
ali,19,10,19,6,8,14,3
hamid,3,9,4,20,9,1,8,16,0,5,2,4,7,2,1
sohrab,19,10,19,6,8,14,3
sara,0,5,20,14
soheila,13,2,5,1,3,10,12,4,13,17,7,7
nahid,13,2,5,1,3,10,12,4,13,17,7,7
ali,1,9
sarvin,0,16,16,13,19,2,17,8
sheyda,0,16,16,13,19,2,17,8
When I sort with this function:
import csv
from statistics import mean  # assumed: mean comes from the statistics module

def calculate_sorted_averages(input_file_name, output_file_name):
    # Read each row as: name, grade, grade, ...
    with open(input_file_name) as f:
        reader = csv.reader(f)
        list1 = list()
        for row in reader:
            name = row[0]
            these_grade = list()
            for grade in row[1:]:
                these_grade.append(float(grade))
            avg1 = mean(these_grade)
            list1.append([name, avg1])
    print(list1)
    # the sort I am asking about
    list1.sort(key=lambda x: (int(x[1]), x[0]))
    print(list1)
    # Write one "name,average" row per student; the with block closes the file
    with open(output_file_name, 'w', newline='') as outp:
        writer = csv.writer(outp)
        for item in list1:
            writer.writerow(item)
The output file is:
ali,5.0
hamid,6.066666666666666
mandana,7.5
nahid,7.833333333333333
soheila,7.833333333333333
sara,9.75
ali,11.285714285714286
sarvin,11.375
sheyda,11.375
sohrab,11.285714285714286
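sohrab is in the wrong position: his average (11.2857...) is lower than sarvin's and sheyda's (11.375), yet he is listed after them. But when I change sohrab's name to, for example, Nima, the order is correct. How can I fix this problem?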
CodePudding user response:
Your sort key is a tuple:
lambda x: (int(x[1]), x[0])
By converting to an int, you're making several values have the same primary key, which means they get sorted by their secondary key, their name. That is, you generate four tuples:
(11, "ali")
(11, "sarvin")
(11, "sheyda")
(11, "sohrab")
that all have the same primary value (11), so they get sorted alphabetically by the secondary key, the name.
Removing the int() and sorting by (x[1], x[0]) should give you what you want.
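To see the effect in isolation, here is a minimal sketch using the four averages from the question that truncate to 11. The names rows, truncated, and exact are only for illustration and are not part of the original code.

```python
# Averages taken from the question's output; all four truncate to 11.
rows = [
    ["ali", 11.285714285714286],
    ["sohrab", 11.285714285714286],
    ["sarvin", 11.375],
    ["sheyda", 11.375],
]

# int() drops the fractional part, so every primary key is 11
# and the names alone decide the order.
truncated = sorted(rows, key=lambda x: (int(x[1]), x[0]))
print([name for name, _ in truncated])
# ['ali', 'sarvin', 'sheyda', 'sohrab']

# Using the float itself keeps the averages distinct;
# the name only breaks exact ties.
exact = sorted(rows, key=lambda x: (x[1], x[0]))
print([name for name, _ in exact])
# ['ali', 'sohrab', 'sarvin', 'sheyda']
```

This also explains why renaming sohrab to Nima appeared to fix things: "nima" happens to sort before "sarvin" alphabetically, so the name-only ordering coincides with the correct one.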
CodePudding user response:
The code below seems to work. Split the data on newlines, then split each line into the name and the numbers. Calculate the mean for each person and collect the (name, mean) pairs in a list. The last step is to sort the pairs by the mean value.
import statistics
data = '''mandana,5,7,3,15
ali,19,10,19,6,8,14,3
hamid,3,9,4,20,9,1,8,16,0,5,2,4,7,2,1
sohrab,19,10,19,6,8,14,3
sara,0,5,20,14
soheila,13,2,5,1,3,10,12,4,13,17,7,7
nahid,13,2,5,1,3,10,12,4,13,17,7,7
ali,1,9
sarvin,0,16,16,13,19,2,17,8
sheyda,0,16,16,13,19,2,17,8'''
lines = data.split('\n')
new_lines = []
for line in lines:
    fields = line.split(',')
    new_lines.append((fields[0], statistics.mean(int(x) for x in fields[1:])))
new_lines = sorted(new_lines, key=lambda x: x[1])
for line in new_lines:
    print(line)
Output:
('ali', 5)
('hamid', 6.066666666666666)
('mandana', 7.5)
('soheila', 7.833333333333333)
('nahid', 7.833333333333333)
('sara', 9.75)
('ali', 11.285714285714286)
('sohrab', 11.285714285714286)
('sarvin', 11.375)
('sheyda', 11.375)
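Note that this version sorts by the mean alone. Python's sort is stable, so entries with identical means keep the order they had in the input, which is why soheila is printed before nahid above. If ties should instead be ordered by name, as the key in the question intends, a tuple key is one option. A small sketch, with pairs, by_mean, and by_mean_then_name as hypothetical names:

```python
# Three (name, mean) pairs from the output above; two of them tie.
pairs = [
    ("soheila", 7.833333333333333),
    ("nahid", 7.833333333333333),
    ("ali", 5),
]

# Key on the mean only: the stable sort leaves tied entries in input order.
by_mean = sorted(pairs, key=lambda x: x[1])
print(by_mean)
# [('ali', 5), ('soheila', 7.833333333333333), ('nahid', 7.833333333333333)]

# Key on (mean, name): ties on the mean are broken alphabetically.
by_mean_then_name = sorted(pairs, key=lambda x: (x[1], x[0]))
print(by_mean_then_name)
# [('ali', 5), ('nahid', 7.833333333333333), ('soheila', 7.833333333333333)]
```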