My task is pretty simple. We have a .csv file with the results of a decathlon competition. The results need to be converted into points, ranked, and assigned places. Everything works fine apart from one line:
modified_data.sort_values(by=["Total points"])
Why doesn't it sort the result for me?
My code is below:
import pandas as pd
import numpy as np
# Modification of CSV file data by adding header names and splitting data
data = pd.read_csv("static/data/Decathlon.csv", delimiter=';', header=None)
data = data.assign(Total_points=0)
data = data.assign(Ranking=0)
header_list = ['Player', '100 metres', 'Long jump', 'Short put', 'High jump', '400 metres', '110 metres hurdles',
'Discus throw', 'Pole vault', 'Javelin throw', '1500 metres', 'Total points', 'Ranking']
data.to_csv("static/data/Decathlon_modified.csv", header=header_list, index=False)
modified_data = pd.read_csv("static/data/Decathlon_modified.csv", delimiter=',')
print(modified_data)
# Conversion of CSV data into the necessary units of measurement,
# so that it can be applied to the calculation of the resulting formulas:
temporary_list = []
changed_list = []
for time in modified_data["1500 metres"]:
    temporary_list.append(time.split('.'))
for new_value in temporary_list:
    value = (int(new_value[0]) * 60) + int(new_value[1]) + int(new_value[2]) * 0.01
    changed_list.append(value)
for index, new_value in enumerate(changed_list):
    modified_data.loc[index, "1500 metres"] = new_value
# Results are calculated according to formulas:
# Points = INT(A * (B - P) ** C) for track events (a faster time produces a higher score)
modified_data["100 metres"] = round((25.4347 * (18 - modified_data["100 metres"]) ** 1.81))
modified_data["400 metres"] = round(1.53775 * (82 - modified_data["400 metres"]) ** 1.81)
modified_data["110 metres hurdles"] = round(5.74352 * (28.5 - modified_data["110 metres hurdles"]) ** 1.92)
modified_data["1500 metres"] = round(0.03768 * (480 - modified_data["1500 metres"].astype(float)) ** 1.85)
# Points = INT(A * (P - B) ** C) for field events (a greater distance or height produces a higher score)
modified_data["Long jump"] = round(0.14354 * ((modified_data["Long jump"] * 100) - 220) ** 1.4)
modified_data["Short put"] = round(51.39 * (modified_data["Short put"] - 1.5) ** 1.05)
modified_data["High jump"] = round(0.8465 * ((modified_data["High jump"] * 100) - 75) ** 1.42)
modified_data["Discus throw"] = round(12.91 * (modified_data["Discus throw"] - 4) ** 1.1)
modified_data["Pole vault"] = round(0.2797 * (modified_data["Pole vault"] * 100 - 100) ** 1.35)
modified_data["Javelin throw"] = round(10.14 * (modified_data["Javelin throw"] - 7) ** 1.08)
# Total calculation and rewriting of each player's result in a common table
total_points = modified_data["100 metres"] + modified_data["Long jump"] + modified_data["Short put"] + \
               modified_data["High jump"] + modified_data["400 metres"] + modified_data["110 metres hurdles"] + \
               modified_data["Discus throw"] + modified_data["Pole vault"] + modified_data["Javelin throw"] + \
               modified_data["1500 metres"]
for index, new_value in enumerate(total_points):
    modified_data.loc[index, "Total points"] = new_value
# Ranking according to collected points
modified_data.reset_index(drop=False)
modified_data.index = np.arange(1, len(modified_data) + 1)
# TODO
modified_data.sort_values(by=["Total points"])
print(modified_data)
modified_data["Ranking"] = modified_data["Total points"]. \
    apply(lambda score:
          modified_data.index[modified_data["Total points"] == score].astype(str)).str.join("-")
print(modified_data)
modified_data.to_json(r'static/data/Decathlon.json')
I tried:
modified_data["Total points"] = modified_data["Total points"].astype(int)
modified_data.sort_values(by=["Total points"])
and
modified_data["Total points"] = modified_data["Total points"].astype(int)
modified_data.sort_values('Total points')
I also checked the documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html
CodePudding user response:
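sort_values returns a new, sorted DataFrame and leaves the original unchanged, so the result of your call is simply discarded. Either pass inplace=True or assign the returned DataFrame back to the same variable: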
modified_data.sort_values(by=["Total points"], inplace=True)
# Or alternatively
modified_data = modified_data.sort_values(by=["Total points"])
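To see the difference in isolation, here is a minimal sketch. The tiny table and its player names and scores are made up for illustration and are not taken from your file. It also passes ascending=False, on the assumption that the highest total should be ranked first.

```python
import pandas as pd

# Hypothetical miniature of the results table (names and scores are made up).
scores = pd.DataFrame({
    "Player": ["A", "B", "C"],
    "Total points": [7200, 8100, 7650],
})

# sort_values returns a new DataFrame; `scores` keeps its original row order
# because the returned value is thrown away here.
scores.sort_values(by="Total points")
unchanged_order = scores["Player"].tolist()

# Keeping the returned DataFrame preserves the sorted order. For a ranking,
# the highest score should come first, hence ascending=False (an assumption
# about the desired order).
ranked = scores.sort_values(by="Total points", ascending=False)
ranked_order = ranked["Player"].tolist()

print(unchanged_order)
print(ranked_order)
```

One more thing to watch for in your script: the 1..n index is assigned before the sort, and index labels travel with their rows when a DataFrame is sorted. If the index is meant to reflect the place, it would probably need to be reassigned after sorting rather than before.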