How can I calculate the smallest difference in days between date in dictionaries that have the same

Time:05-21

I have dictionaries like this:

tr1 = {'label': 'name1', 'date': '2021-09-29'}
tr2 = {'label': 'name1', 'date': '2021-08-30'}
tr3 = {'label': 'name1', 'date': '2021-09-30'}
tr4 = {'label': 'name2', 'date': '2021-06-30'}
tr5 = {'label': 'name2', 'date': '2021-05-30'}
tr6 = {'label': 'name3', 'date': '2021-06-30'}

And I want to get a list like this:

[1, 1, 1, 30, 0]

This list holds, for each dictionary, the minimum gap in days between the dates of the dictionaries that share its label, or 0 if no other dictionary has the same label. I tried a DataFrame with groupby and .transform, but it doesn't work:

import datetime
import pandas as pd

df_day = pd.DataFrame(sample_transaction)
df_day.date = df_day.date.apply(lambda x : 
    int(datetime.datetime.timestamp(
        datetime.datetime.strptime(x, "%Y-%m-%d"))))

group_day = df_day[['label', 'date']].groupby(['label'])
group_day.transform(
    lambda x: min([abs(a - b) if a != b else 0.0 for a in x for b in x]))

sample_transaction is just the list containing the dictionaries. I converted the dates to seconds with timestamp and tried to compute the gap with transform and a lambda, but I just get a list of 0.0.
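A likely reason for the list of 0.0: the comprehension pairs every value with itself, and each of those pairs contributes 0.0, so the minimum is always 0.0. A minimal sketch of that effect in plain Python, using made-up timestamps (the name stamps and its values are assumptions for illustration only):

```python
# Hypothetical timestamps (in seconds) for one label; the values are made up.
stamps = [1_000, 90_000, 200_000]

# Same shape as the lambda in the question: every value is also paired with itself,
# and each self-pair yields 0.0.
with_self_pairs = min(abs(a - b) if a != b else 0.0 for a in stamps for b in stamps)

# Comparing by position skips only the self-pairs.
distinct_pairs = min(
    abs(a - b)
    for i, a in enumerate(stamps)
    for j, b in enumerate(stamps)
    if i != j
)

print(with_self_pairs)  # 0.0
print(distinct_pairs)   # 89000
```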

CodePudding user response:

IIUC, you can sort the dates per group and get the min diff:

import pandas as pd

l = [tr1, tr2, tr3, tr4, tr5, tr6]

(pd.DataFrame(l)
   .assign(date=lambda d: pd.to_datetime(d['date']))
   .groupby('label')['date']
   .transform(lambda s: s.sort_values().diff().min())
 )

Output:

0     1 days 00:00:00
1     1 days 00:00:00
2     1 days 00:00:00
3    31 days 00:00:00
4    31 days 00:00:00
5                 NaT
Name: date, dtype: object
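The 31 days for name2 (rather than the 30 shown in the question) can be checked with the standard library alone. This is only a quick sanity check on the two name2 dates:

```python
from datetime import date

# The two name2 dates from the question.
gap = date(2021, 6, 30) - date(2021, 5, 30)

print(gap.days)  # 31, because May has 31 days
```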

To get the format requested in the question (a plain list of integer days, with 0 for labels that appear only once):


(pd.DataFrame(l)
   .assign(date=lambda d: pd.to_datetime(d['date']))
   .groupby('label')['date']
   .transform(lambda s: s.sort_values().diff().min().days)
   .fillna(0, downcast='infer')
   .to_list()
 )

Output:

[1, 1, 1, 31, 31, 0]
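For comparison, the same sort-then-diff idea can be sketched without pandas. This is not part of the answer above, just a standard-library version under the assumption that the dates are ISO strings as in the question; the function name min_gap_days and the variable records are made up for the example:

```python
from collections import defaultdict
from datetime import date

def min_gap_days(records):
    """Return, per record, the smallest gap in days within its label (0 if alone)."""
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec["label"]].append(date.fromisoformat(rec["date"]))

    smallest = {}
    for label, dates in by_label.items():
        dates.sort()
        # After sorting, the smallest gap is always between two neighbours.
        gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
        smallest[label] = min(gaps, default=0)  # a single date has no gaps

    return [smallest[rec["label"]] for rec in records]

records = [
    {"label": "name1", "date": "2021-09-29"},
    {"label": "name1", "date": "2021-08-30"},
    {"label": "name1", "date": "2021-09-30"},
    {"label": "name2", "date": "2021-06-30"},
    {"label": "name2", "date": "2021-05-30"},
    {"label": "name3", "date": "2021-06-30"},
]

print(min_gap_days(records))  # [1, 1, 1, 31, 31, 0]
```

Like the pandas version, this gives the smallest gap within each label, repeated for every row of that label, not the distance from each row to its own nearest neighbour.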