I have a csv file with people's names and averages as below:
mandana,7.5
hamid,6.066666666666666
sina,11.285714285714286
sara,9.75
soheila,7.833333333333333
ali,5.0
sarvin,11.375
I want to sort it by the averages and write it into another file. I've tried lambda and itemgetter but I didn't get the proper result. Here is my code:
def calculate_sorted_averages(file1, file2):
    with open(r'C:\Users\sony\Desktop\Python with Jadi\file1.csv', 'r') as f1:
        reader = csv.reader(f1)
        d = {}
        for row in reader:
            name = row[0]
            average = row[1]
            d[name] = average
    sorted_dict = OrderedDict(sorted(d.items(), key=operator.itemgetter(1), reverse=True))
    with open(r'C:\Users\sony\Desktop\Python with Jadi\file2.csv', 'w', newline='') as f2:
        for key in sorted_dict.keys():
            writer = csv.writer(f2)
            writer.writerow([key, sorted_dict[key]])
And here is my output:
sara,9.75
soheila,7.833333333333333
mandana,7.5
hamid,6.066666666666666
ali,5.0
sarvin,11.375
sina,11.285714285714286
As you can see, it is not sorted. I've also tried lambda and it didn't work. I'm now frustrated and don't know what to do. Can anyone help me? Thanks.
CodePudding user response:
You got that result because you're sorting lexicographically (comparing the averages as strings) instead of by their numeric value.
All you're missing is converting each average to float. After that, sort as usual with key=operator.itemgetter(1):
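import csv
import operator
from collections import OrderedDict

def calculate_sorted_averages(file1, file2):
    d = {}
    # file1 is the path to the unsorted CSV, file2 the path for the sorted output
    with open(file1, 'r', newline='') as f1:
        reader = csv.reader(f1)
        for row in reader:
            name = row[0]
            average = row[1]
            # Convert to float so the sort compares numbers, not strings
            d[name] = float(average)
    sorted_dict = OrderedDict(sorted(d.items(), key=operator.itemgetter(1), reverse=True))
    with open(file2, 'w', newline='') as f2:
        writer = csv.writer(f2)  # create the writer once, outside the loop
        for key in sorted_dict.keys():
            writer.writerow([key, sorted_dict[key]])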
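To see why the original output looked unsorted, here is a minimal sketch that sorts a few of the same values first as strings and then as floats. The list name averages is only for illustration.

```python
# Values as they come out of csv.reader: every field is a string
averages = ['7.5', '6.066666666666666', '11.285714285714286', '9.75']

# String comparison goes character by character, so '11...' sorts below '6...'
as_strings = sorted(averages, reverse=True)

# key=float compares the numeric values instead
as_numbers = sorted(averages, key=float, reverse=True)

print(as_strings)
print(as_numbers)
```

The first list puts 11.285714285714286 last, which matches the ordering seen in the question.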
CodePudding user response:
aaa = {'0': ['mandana', 7.5], '1': ['hamid', 6.066666666666666], '2': ['sina', 11.285714285714286], '3': ['sara', 9.75],
       '4': ['soheila', 7.833333333333333], '5': ['ali', 5.0], '6': ['sarvin', 11.375]}
sorted_ = sorted(aaa.items(), key=lambda x: x[1][1])
sorted_ = dict(sorted_)
Output
{'5': ['ali', 5.0], '1': ['hamid', 6.066666666666666], '0': ['mandana', 7.5], '4': ['soheila', 7.833333333333333], '3': ['sara', 9.75], '2': ['sina', 11.285714285714286], '6': ['sarvin', 11.375]}
You didn't show the entire dictionary with its keys, so I created my own, 'aaa'. The sort key x[1][1] is the second element of each value, i.e. the average.
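If the data is kept as a plain list of [name, average] rows, the same idea seems to work without made-up dictionary keys. A minimal sketch, where rows is a hypothetical list standing in for the parsed file:

```python
# Hypothetical rows, already converted so the average is a float
rows = [['mandana', 7.5], ['hamid', 6.066666666666666],
        ['sina', 11.285714285714286], ['ali', 5.0]]

# Sort by the second element of each row (the average), lowest first
sorted_rows = sorted(rows, key=lambda row: row[1])

print(sorted_rows)
```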
CodePudding user response:
By default, text read from a file, with or without csv.reader, is stored as strings. You need to call float on the second element of each row to interpret it as a floating-point number.
I think using an OrderedDict is a bit overkill here. One call to sorted is enough.
import csv

def calculate_sorted_averages(filename_input, filename_output):
    with open(filename_input, 'r', newline='') as f1:
        reader = csv.reader(f1)
        # Convert the second column to float so rows sort numerically
        sorted_rows = sorted(reader, key=lambda x: float(x[1]))
    # newline='' lets the csv module control line endings itself
    with open(filename_output, 'w', newline='') as f2:
        writer = csv.writer(f2)
        writer.writerows(sorted_rows)

calculate_sorted_averages('file1.csv', 'file2.csv')
Results:
$ cat file1.csv
mandana,7.5
hamid,6.066666666666666
sina,11.285714285714286
sara,9.75
soheila,7.833333333333333
ali,5.0
sarvin,11.375
$ cat file2.csv
ali,5.0
hamid,6.066666666666666
mandana,7.5
soheila,7.833333333333333
sara,9.75
sina,11.285714285714286
sarvin,11.375
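The question's code used reverse=True, so highest-first may be what is actually wanted. Here is a sketch of that variant. It uses an in-memory io.StringIO in place of the real files so it runs on its own, and the function name sort_rows_descending is made up for this example.

```python
import csv
import io

def sort_rows_descending(text):
    """Return CSV text with rows ordered by the second column, highest first."""
    rows = csv.reader(io.StringIO(text))
    # float() makes the comparison numeric; reverse=True puts the largest first
    sorted_rows = sorted(rows, key=lambda row: float(row[1]), reverse=True)
    out = io.StringIO()
    csv.writer(out, lineterminator='\n').writerows(sorted_rows)
    return out.getvalue()

sample = "mandana,7.5\nali,5.0\nsarvin,11.375\nsara,9.75\n"
result = sort_rows_descending(sample)
print(result)
```

With real files, the same reverse=True argument can be passed to the sorted call in the function above.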
CodePudding user response:
You can try the pandas module for this.
The pandas.read_csv() function reads the CSV file whose path you pass in and converts it into a pandas DataFrame, which in simpler terms is a table inside Python.
import pandas as pd
# Raw string so the backslashes in the Windows path are not treated as escape sequences.
# header=None because the file has no header row.
df = pd.read_csv(r"C:\Users\sony\Desktop\Python with Jadi\file1.csv", header=None)
df.columns = ["Name", "Value"]  # Set the column names, since the file does not provide any
sorted_df = df.sort_values(by="Value")  # Sort the dataframe by the values in the "Value" column
Output -
|   | Name | Value |
|---|---|---|
| 5 | ali | 5.0 |
| 1 | hamid | 6.066666666666666 |
| 0 | mandana | 7.5 |
| 4 | soheila | 7.833333333333333 |
| 3 | sara | 9.75 |
| 2 | sina | 11.285714285714286 |
| 6 | sarvin | 11.375 |
You can convert this dataframe back to a CSV file using to_csv(). Pass in the file path as the parameter and set index = False if you don't want the index to be added as a column.
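A minimal sketch of the full round trip, assuming the input has no header row as in the question. It reads from and writes to in-memory io.StringIO buffers instead of real paths so it can be run as-is; the variable names source and output are only for illustration.

```python
import io
import pandas as pd

# In-memory stand-in for file1.csv (no header row, as in the question)
source = io.StringIO("mandana,7.5\nali,5.0\nsarvin,11.375\n")

# header=None stops the first row from being used as column names
df = pd.read_csv(source, header=None, names=["Name", "Value"])
sorted_df = df.sort_values(by="Value")

# index=False drops the row index; header=False keeps the output headerless like the input
output = io.StringIO()
sorted_df.to_csv(output, index=False, header=False)
print(output.getvalue())
```

To write to disk instead, a file path can be passed to to_csv() in place of the buffer.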
CodePudding user response:
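Pandas can be used for this. You can install it with pip install pandas.
import pandas as pd
# header=None because the file has no header row; without it the first row would be read as column names
df = pd.read_csv('filename.csv', header=None)
df.columns = ['name', 'value']
df.sort_values('value', inplace=True, ascending=True)
print(df)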