Remove redundant items from a list of dicts

Time:02-01

I have a list of dicts built from user selections in a GUI (the data that Plotly returns). When a user clicks a data point (or a group of data points), the data point(s) are added to the list.

However, if the user clicks the same data point again (or selects a group of data points that includes one already selected), duplicate dictionaries for those data points appear in the list.

For example:

[
  {  
    "clicked": true,
    "selected": true,
    "hovered": false,
    "x": 0,
    "y": 71100.0988957607,
    "selected_xcol": "injection_id",
    "xvalue": "e54112f9-4497-4a7e-91cd-e26842a4092f",
    "selected_ycol": "peak_area",
    "yvalue": 71100.0988957607,
    "injection_id": "e54112f9-4497-4a7e-91cd-e26842a4092f"
  },
  {
    "clicked": true,
    "selected": true,
    "hovered": false,
    "x": 0,
    "y": 75283.2386064552,
    "selected_xcol": "injection_id",
    "xvalue": "e54112f9-4497-4a7e-91cd-e26842a4092f",
    "selected_ycol": "peak_area",
    "yvalue": 75283.2386064552,
    "injection_id": "e54112f9-4497-4a7e-91cd-e26842a4092f"
  },
  {  # Redundant, same as first item
    "clicked": true,
    "selected": true,
    "hovered": false,
    "x": 0,
    "y": 71100.0988957607,
    "selected_xcol": "injection_id",
    "xvalue": "e54112f9-4497-4a7e-91cd-e26842a4092f",
    "selected_ycol": "peak_area",
    "yvalue": 71100.0988957607,
    "injection_id": "e54112f9-4497-4a7e-91cd-e26842a4092f"
  }
]

Because users can select one or multiple datapoints in one GUI stroke, and the code doesn't know which, I simply add the returned list to the cumulative list like so...

LOCAL["selected_data"] = selectable_data_chart(
    LOCAL["df"],
    key="st_react_plotly_control_main_chart",
    custom_data_columns=custom_data_columns,
    hovertemplate=hovertemplate,
    svgfilename=svgfilename,
)

I have tried filtering out the redundant items with ...

LOCAL["selected_data"] = list(set(LOCAL["selected_data"]))

...but it raises an error...

TypeError: unhashable type: 'dict'

I have also tried...

result = []
LOCAL["selected_data"] = [result.append(d) for d in LOCAL["selected_data"] if d not in result]  

...but it returns null no matter what.

[
  null,
  null
] 

CodePudding user response:

You can't add a dict to a set (or use it as a dictionary key) because dicts are mutable and therefore unhashable. What if, after adding an item to a set, you changed its values so that it became identical to another set member? That would break the uniqueness guarantee provided by the set data type.
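
The restriction is easy to see in isolation. The sketch below uses a made-up two-key dict (not the question's data) and shows that hashing the dict fails, while a tuple of its items can be hashed as long as every value is itself hashable:

```python
# Hypothetical sample, not taken from the question's data.
point = {"x": 0, "y": 71100.0988957607}

try:
    hash(point)
    dict_is_hashable = True
except TypeError:
    # dicts are mutable, so Python refuses to hash them
    dict_is_hashable = False

# A tuple of (key, value) pairs is immutable and hashable,
# as long as every value is hashable too.
frozen = tuple(point.items())
unique = {frozen, tuple(point.items())}  # same content added twice

print(dict_is_hashable)  # False
print(len(unique))       # 1
```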

One possible solution is to transform your dictionaries into a structured type. For example, using the dataclasses module, we could write (assuming that your sample data is contained in the file data.json):

import json
import dataclasses


@dataclasses.dataclass(frozen=True)
class Event:
    clicked: bool
    selected: bool
    hovered: bool
    x: float
    y: float
    selected_xcol: str
    xvalue: str
    selected_ycol: str
    yvalue: float
    injection_id: str


with open("data.json") as fd:
    data = json.load(fd)

events = set(Event(**item) for item in data)

As @lemon pointed out in a comment, this won't actually work for the sample data in your question, because the third item in the list is not identical to the first item (in the first item, x=0, but in the third item, x="e54112f9-4497-4a7e-91cd-e26842a4092f"). If this was just a typo when entering your question, the solution here will work just fine.
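
If you need the result back as a list of dicts, dataclasses.asdict can undo the conversion. The sketch below is self-contained and makes a few assumptions: it uses a shortened, hypothetical Point class and inline sample data with abbreviated ids instead of data.json, and it deduplicates with dict.fromkeys rather than set so that first-seen order is preserved (a plain set does not guarantee order).

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Point:  # hypothetical, shortened stand-in for the Event class
    x: int
    y: float
    injection_id: str


# Inline sample data (assumed, ids abbreviated); the third item repeats the first.
data = [
    {"x": 0, "y": 71100.0988957607, "injection_id": "e54112f9"},
    {"x": 0, "y": 75283.2386064552, "injection_id": "e54112f9"},
    {"x": 0, "y": 71100.0988957607, "injection_id": "e54112f9"},
]

# frozen=True (with the default eq=True) makes instances hashable, so they
# can be dict keys; dict.fromkeys keeps only the first occurrence of each.
unique_points = list(dict.fromkeys(Point(**item) for item in data))

# Convert the dataclass instances back into plain dicts.
unique_dicts = [dataclasses.asdict(p) for p in unique_points]

print(len(unique_dicts))  # 2
```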


A less structured solution would be to transform each dictionary into a list of tuples using the items() method, turn that into a tuple, and then add those to your "unique" set:

import json

with open("data.json") as fd:
    data = json.load(fd)

events = set(tuple(item.items()) for item in data)

In this case, events is a set of tuples; you could transform it back into a list of dictionaries like this:

dict_events = [dict(item) for item in events]
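
Two caveats apply to the tuple approach: a set does not preserve the order in which the points were selected, and tuple(item.items()) depends on key insertion order, so two dicts with the same content but differently ordered keys would count as different. Here is a minimal sketch that addresses both, assuming all values are hashable and all keys are strings (dedupe_dicts is a hypothetical helper name, not part of any library):

```python
def dedupe_dicts(items):
    """Return the dicts from items without duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        # Sorting the (key, value) pairs makes the fingerprint independent
        # of key insertion order. Keys are unique, so values are never compared.
        fingerprint = tuple(sorted(item.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(item)
    return result


# Hypothetical sample data; the third dict equals the first with keys reordered.
sample = [
    {"x": 0, "y": 1.5},
    {"x": 0, "y": 2.5},
    {"y": 1.5, "x": 0},
]

deduped = dedupe_dicts(sample)
print(deduped)  # [{'x': 0, 'y': 1.5}, {'x': 0, 'y': 2.5}]
```

In the question's terms this would be LOCAL["selected_data"] = dedupe_dicts(LOCAL["selected_data"]). It is also what the loop in the question was aiming for: that version stored the return value of result.append(d), which is always None, instead of using result itself.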