How to remove a dict from a list of dicts based on whether a value is more than X months old

Time:06-23

I have the following Python code, which removes a dict from a list of dicts when the value of its Date key is more than 180 days old (ideally the cutoff should be 6 months):

from datetime import datetime
from dateutil import parser

gp_clinicals_meds_repeat = session["gp_clinicals_meds_repeat"]
for i in range(len(gp_clinicals_meds_repeat["GpRepeatMedicationsList"])):
    date_object = parser.parse(gp_clinicals_meds_repeat["GpRepeatMedicationsList"][i]["Date"])
    months_between = datetime.now() - date_object
    if months_between.days > 180:
        del gp_clinicals_meds_repeat["GpRepeatMedicationsList"][i]

An example of my JSON is below (just has one entry but could have hundreds):

{
    "GpRepeatMedicationsList": [{
        "Constituent": "",
        "Date": "2021-07-15T00:00:00",
        "Dosage": "0.6ml To Be Taken Each Day",
        "LastIssuedDate": "2021-07-15T00:00:00",
        "MixtureId": "",
        "Quantity": "50",
        "ReadCode": "DADR8795BRIDL",
        "Rubric": "Dalivit oral drops (Dendron Brands Ltd)",
        "TenancyDescription": "Orglinks",
        "Units": "ml"
    }],
    "TotalItemCount": 1
}

I was thinking of using a list comprehension, but I'm not sure how to parse the string as a date inside it.

My code does not work correctly when two consecutive elements need to be removed: i is incremented even after the element at index i has just been deleted, so the element that shifts into that position is never checked. The loop also runs up to the original length of the list, so once any element has been removed it ends with an IndexError, because gp_clinicals_meds_repeat["GpRepeatMedicationsList"][i] no longer exists for the later values of i.
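
Both problems can be reproduced on a plain list of integers. The sketch below is only an illustration: the list values and the threshold of 10 are made up and stand in for the real entries and the date check. It also shows one possible workaround, iterating over the indices in reverse.

```python
# Hypothetical data: remove every value greater than 10.
values = [1, 20, 30, 4]

buggy = list(values)
loop_error = None
try:
    for i in range(len(buggy)):   # the range is fixed at the original length (4)
        if buggy[i] > 10:
            del buggy[i]          # later items shift left, but i still advances
except IndexError as exc:
    loop_error = exc              # raised once i runs past the shortened list

# Walking the indices backwards avoids both problems, because deleting
# an item never moves the items that have not been visited yet.
fixed = list(values)
for i in reversed(range(len(fixed))):
    if fixed[i] > 10:
        del fixed[i]

print(buggy, type(loop_error).__name__, fixed)
```

In the buggy loop, 30 survives because it slides into index 1 right after 20 is deleted, and the loop then fails on index 3.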

Any suggestions?

Answer:

You can do this with a list comprehension that has an if clause. I moved the criterion into a separate function, since it is a bit too involved to write inline. I also recommend using pandas.Timestamp to handle dates, as it is very robust:

import pandas as pd

def is_recent(entry):
    date_object = pd.to_datetime(entry["Date"])
    days_between = pd.Timestamp.today() - date_object
    return days_between < pd.Timedelta(days=180)

original_clinicals = gp_clinicals_meds_repeat["GpRepeatMedicationsList"]
recent_clinicals = [entry for entry in original_clinicals if is_recent(entry)]
gp_clinicals_meds_repeat["GpRepeatMedicationsList"] = recent_clinicals  # Replace the original list

To get 6 months instead of 180 days, you can use dateutil.relativedelta. The is_recent function can be changed as follows (you could also add a parameter to make the number of months configurable):

import pandas as pd
import dateutil.relativedelta as relativedelta

def is_recent(entry):
    limit_time = pd.Timestamp.today() - relativedelta.relativedelta(months=6)
    return pd.to_datetime(entry["Date"]) > limit_time

original_clinicals = gp_clinicals_meds_repeat["GpRepeatMedicationsList"]
recent_clinicals = [entry for entry in original_clinicals if is_recent(entry)]
gp_clinicals_meds_repeat["GpRepeatMedicationsList"] = recent_clinicals  # Replace the original list
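
The configurable number of months mentioned above could look something like the following. This is only a sketch under a few assumptions: it uses the standard library instead of pandas and dateutil, it assumes every "Date" value is an ISO 8601 string like the one in the sample JSON, and the helper months_before as well as the months and now parameters are names invented for this illustration.

```python
import calendar
from datetime import datetime

def months_before(moment, months):
    """Shift `moment` back by whole calendar months (hypothetical helper).

    The day is clamped to the end of the target month, so 31 August
    minus 6 months becomes 28 (or 29) February.
    """
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))

def is_recent(entry, months=6, now=None):
    """Return True if entry["Date"] is newer than `months` months before `now`."""
    if now is None:
        now = datetime.now()
    # Assumes an ISO 8601 string such as "2021-07-15T00:00:00".
    return datetime.fromisoformat(entry["Date"]) > months_before(now, months)

# Made-up sample data in the same shape as the JSON in the question.
data = {
    "GpRepeatMedicationsList": [
        {"Date": "2021-07-15T00:00:00"},
        {"Date": "2021-12-01T00:00:00"},
    ]
}
reference = datetime(2022, 1, 20)  # fixed "now" so the example is reproducible
data["GpRepeatMedicationsList"] = [
    entry for entry in data["GpRepeatMedicationsList"]
    if is_recent(entry, months=6, now=reference)
]
```

Passing now explicitly is optional; it is there mainly so the cutoff can be pinned in tests. If the dates can arrive in other formats, parser.parse or pd.to_datetime is probably the safer choice for the parsing step.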