I am trying to sort a pandas DataFrame. Each cell is a dict of two values. Example for one column:
0 {'lat': 50.7392927, 'lon': 7.0950485}
1 {'lat': 51.423369, 'lon': 7.1495216}
2 {'lat': 50.7385629, 'lon': 7.0938597}
3 {'lat': 50.7394781, 'lon': 7.1001448}
4 {'lat': 52.2092612, 'lon': 8.7446132}
I am trying to sort each column by iterating through the frame. However, even when I try to sort just a single column with this code:
sorted(df["name"], key=lambda d:d["lat"])
I get the error: 'float' object is not subscriptable.
EDIT: The output I want to achieve is a column which is sorted by the values of the key "lat":
0 {'lat': 50.7385629, 'lon': 7.0938597}
1 {'lat': 50.7392927, 'lon': 7.0950485}
2 {'lat': 50.7394781, 'lon': 7.1001448}
3 {'lat': 51.423369, 'lon': 7.1495216}
4 {'lat': 52.2092612, 'lon': 8.7446132}
My guess is that this only returns a single float instead of a list of floats, which could be sorted. I could of course just iterate through the whole DataFrame and construct a list from each column to sort it, but I thought there might be a better and faster solution to this.
Best regards
CodePudding user response:
If you have this DataFrame:
name position
0 john {'lat': 50.7392927, 'lon': 7.0950485}
1 rick {'lat': 51.423369, 'lon': 7.1495216}
2 jenny {'lat': 50.7385629, 'lon': 7.0938597}
3 mick {'lat': 50.7394781, 'lon': 7.1001448}
4 peter {'lat': 52.2092612, 'lon': 8.7446132}
Then you can sort by the "lat" key of the "position" column by executing (assuming you have Python dictionaries in the column, not strings):
df = df.sort_values(by="position", key=lambda k: k.str["lat"])
print(df)
Prints:
name position
2 jenny {'lat': 50.7385629, 'lon': 7.0938597}
0 john {'lat': 50.7392927, 'lon': 7.0950485}
3 mick {'lat': 50.7394781, 'lon': 7.1001448}
1 rick {'lat': 51.423369, 'lon': 7.1495216}
4 peter {'lat': 52.2092612, 'lon': 8.7446132}
Another method:
print(df.iloc[sorted(df.index, key=lambda k: df.loc[k, "position"]["lat"])])
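As for the 'float' object is not subscriptable error in the question: it is raised when the key function receives a float instead of a dict. One possible cause (an assumption, since the full frame is not shown) is that some cells are missing, because pandas stores a missing value as NaN, which is a float. Below is a minimal sketch that makes the handling of such cells explicit. The helper name lat_or_nan and the sample frame with one missing cell are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample frame; rick's position is missing (NaN is a float, not a dict).
df = pd.DataFrame({
    "name": ["john", "rick", "jenny", "mick"],
    "position": [
        {"lat": 50.7392927, "lon": 7.0950485},
        np.nan,
        {"lat": 50.7385629, "lon": 7.0938597},
        {"lat": 50.7394781, "lon": 7.1001448},
    ],
})


def lat_or_nan(cell):
    """Hypothetical helper: return the 'lat' value, or NaN for non-dict cells."""
    return cell["lat"] if isinstance(cell, dict) else np.nan


# The key function receives the whole column and must return a Series of
# the same length; rows whose key is NaN are placed last.
result = df.sort_values(
    by="position",
    key=lambda col: col.map(lat_or_nan),
    na_position="last",
)
print(result)
```

Running df["position"].map(type).value_counts() on the real data is a quick way to check whether every cell actually holds a dict.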
CodePudding user response:
Maybe you can also transpose the data frame and sort by the 'lat' column using sort_values (optionally, apply transpose again to return to the original layout).
Code:
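import pandas as pd

# given dictionaries
a = {'lat': 50.7392927, 'lon': 7.0950485}
b = {'lat': 51.423369, 'lon': 7.1495216}
c = {'lat': 50.7385629, 'lon': 7.0938597}
d = {'lat': 50.7394781, 'lon': 7.1001448}
e = {'lat': 52.2092612, 'lon': 8.7446132}
# each dictionary becomes one column, with 'lat' and 'lon' as row labels
df = pd.DataFrame({'1': pd.Series(a),
                   '2': pd.Series(b),
                   '3': pd.Series(c),
                   '4': pd.Series(d),
                   '5': pd.Series(e)})
# transpose rows and columns so that 'lat' and 'lon' become columns
df = df.transpose()
# sort by the 'lat' column; sort_values returns a new frame, so assign it
# (ascending=False sorts from largest to smallest; omit it for ascending order)
df = df.sort_values('lat', ascending=False)
print(df)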
Result: