Filter records to retain the most recent record in each group

Time:04-06

I would like to retain only the most recent episode for each id, based on the dates. For example, for s001 I would like to retain the record dated 2022-04-05, since it is more recent than the other one.

import pandas as pd
import datetime

record = ['s001', 's002', 's003', 's002', 's004', 's003',
          's004', 's001', 's004', 's003', 's002', 's005']

base = datetime.date.today()
date_list = [base - datetime.timedelta(days=x) for x in range(len(record))]

df = pd.DataFrame({
    "id": record,
    "date_visited": date_list
})

print(df.sort_values('id'))

CodePudding user response:

df.sort_values('date_visited').drop_duplicates(['id'], keep='last')
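
To spell out why this works: sorting by date_visited puts each id's newest row last, and drop_duplicates with keep='last' then keeps only that row per id. Below is a minimal sketch of the same idea on the question's data. The fixed base date (2022-04-05, chosen to match the example in the question) and the names latest, df_ts and latest_alt are assumptions added for illustration. It also shows one possible alternative using groupby with idxmax, which needs the column converted to a real datetime dtype first.

```python
import datetime

import pandas as pd

record = ['s001', 's002', 's003', 's002', 's004', 's003',
          's004', 's001', 's004', 's003', 's002', 's005']

# Assumption: a fixed base date instead of date.today(), so the result is reproducible.
base = datetime.date(2022, 4, 5)
date_list = [base - datetime.timedelta(days=x) for x in range(len(record))]

df = pd.DataFrame({"id": record, "date_visited": date_list})

# Sort oldest to newest, then keep the last (newest) row seen for each id.
latest = (
    df.sort_values("date_visited")
      .drop_duplicates(["id"], keep="last")
      .sort_values("id")
      .reset_index(drop=True)
)
print(latest)

# Alternative sketch: pick the row holding the maximum date within each id group.
# The column is converted to datetime64 first so that idxmax works on it.
df_ts = df.assign(date_visited=pd.to_datetime(df["date_visited"]))
latest_alt = (
    df_ts.loc[df_ts.groupby("id")["date_visited"].idxmax()]
         .sort_values("id")
         .reset_index(drop=True)
)
print(latest_alt)
```

Both approaches return one row per id. The drop_duplicates version keeps the original column dtype, while the groupby version leaves date_visited as timestamps.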

