Sort a pandas dataframe with dicts by values

Time:04-16

I am trying to sort a pandas DataFrame. Each cell is a dict with two values. Example for one column:

0        {'lat': 50.7392927, 'lon': 7.0950485}
1        {'lat': 51.423369, 'lon': 7.1495216}
2        {'lat': 50.7385629, 'lon': 7.0938597}
3        {'lat': 50.7394781, 'lon': 7.1001448}
4        {'lat': 52.2092612, 'lon': 8.7446132}

I am trying to sort each column by iterating through the frame. However, even when I try to sort a single column with this code:

sorted(df["name"], key=lambda d:d["lat"])

I get the error: 'float' object is not subscriptable.

EDIT: The output I want to achieve is a column which is sorted by the values of the key "lat":

0        {'lat': 50.7385629, 'lon': 7.0938597}
1        {'lat': 50.7392927, 'lon': 7.0950485}
2        {'lat': 50.7394781, 'lon': 7.1001448}
3        {'lat': 51.423369, 'lon': 7.1495216}
4        {'lat': 52.2092612, 'lon': 8.7446132}

My guess is that this only returns a single float instead of a list of floats, which could be sorted. I could of course just iterate through the whole DataFrame and construct a list from each column to sort it, but I thought there might be a better and faster solution.

Best regards

CodePudding user response:

If you have this DataFrame:

    name                               position
0   john  {'lat': 50.7392927, 'lon': 7.0950485}
1   rick   {'lat': 51.423369, 'lon': 7.1495216}
2  jenny  {'lat': 50.7385629, 'lon': 7.0938597}
3   mick  {'lat': 50.7394781, 'lon': 7.1001448}
4  peter  {'lat': 52.2092612, 'lon': 8.7446132}

Then you can sort by the "lat" key of the "position" column as follows (assuming you have Python dictionaries in the column, not strings):

df = df.sort_values(by="position", key=lambda k: k.str["lat"])
print(df)

Prints:

    name                               position
2  jenny  {'lat': 50.7385629, 'lon': 7.0938597}
0   john  {'lat': 50.7392927, 'lon': 7.0950485}
3   mick  {'lat': 50.7394781, 'lon': 7.1001448}
1   rick   {'lat': 51.423369, 'lon': 7.1495216}
4  peter  {'lat': 52.2092612, 'lon': 8.7446132}

Another method:

print(df.iloc[sorted(df.index, key=lambda k: df.loc[k, "position"]["lat"])])
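A possible explanation for the error in the question, offered as an assumption since the original data is not shown: if some cells are missing (NaN), they are plain floats, and indexing them with d["lat"] raises "'float' object is not subscriptable". The sketch below uses made-up sample data with one missing cell to reproduce that error and to show that sort_values with a key (available since pandas 1.1) places the missing row last by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the last row has no position (NaN is a float).
df = pd.DataFrame({
    "name": ["john", "rick", "jenny", "nobody"],
    "position": [
        {"lat": 50.7392927, "lon": 7.0950485},
        {"lat": 51.423369, "lon": 7.1495216},
        {"lat": 50.7385629, "lon": 7.0938597},
        np.nan,
    ],
})

# Plain sorted() fails as soon as the key function reaches the NaN cell.
error_message = None
try:
    sorted(df["position"], key=lambda d: d["lat"])
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# .str["lat"] looks up the key in each dict and passes NaN through,
# so the row without a position is sorted to the end (na_position="last").
sorted_df = df.sort_values(by="position", key=lambda col: col.str["lat"])
print(sorted_df)
```

If the missing rows should be dropped instead, calling df.dropna(subset=["position"]) before sorting is one option.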

CodePudding user response:

You can also build the DataFrame with one column per dictionary, transpose it so that 'lat' and 'lon' become columns, and sort by the 'lat' column with sort_values (optionally transposing again to return to the original layout).

Code:

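import pandas as pd

# given dictionaries
a = {'lat': 50.7392927, 'lon': 7.0950485}
b = {'lat': 51.423369, 'lon': 7.1495216}
c = {'lat': 50.7385629, 'lon': 7.0938597}
d = {'lat': 50.7394781, 'lon': 7.1001448}
e = {'lat': 52.2092612, 'lon': 8.7446132}

# one column per dictionary; 'lat' and 'lon' are the row labels
df = pd.DataFrame({'1': pd.Series(a),
                   '2': pd.Series(b),
                   '3': pd.Series(c),
                   '4': pd.Series(d),
                   '5': pd.Series(e)})

# transpose so that each dictionary becomes a row with 'lat' and 'lon' columns
df = df.transpose()

# sort by the 'lat' column in ascending order, as requested in the question
# (sort_values returns a new frame, so the result has to be assigned)
df = df.sort_values('lat')
print(df)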
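If the dictionaries already sit in a DataFrame column, as in the question, a similar result can be reached without building each Series by hand. The following is a minimal sketch, assuming every cell in a column named "position" is a dict with the same keys; the names coords and flat are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["john", "rick", "jenny", "mick", "peter"],
    "position": [
        {"lat": 50.7392927, "lon": 7.0950485},
        {"lat": 51.423369, "lon": 7.1495216},
        {"lat": 50.7385629, "lon": 7.0938597},
        {"lat": 50.7394781, "lon": 7.1001448},
        {"lat": 52.2092612, "lon": 8.7446132},
    ],
})

# Expand the dicts into separate 'lat' and 'lon' columns, keeping the row index.
coords = pd.DataFrame(df["position"].tolist(), index=df.index)

# Replace the dict column with the expanded columns and sort by latitude.
flat = df.drop(columns="position").join(coords).sort_values("lat")
print(flat)
```

With plain numeric columns, later sorting or filtering no longer needs a key function.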
