I would like to count the daily number of cases with the fixed 2 month inverval (e.g., Jan-Feb, Mar-Apr, May-Jun, Jul-Aug, etc.). For instance,
import pandas as pd
d1 = pd.DataFrame({'ID': ["A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "D", "D"],
"date": ["2010-12-30", "2010-02-27", "2010-02-26", "2012-01-01", "2012-01-03",
"2011-01-01", "2011-01-02", "2011-01-08", "2014-02-21", "2010-08-31", "2010-08-30", "2010-09-01"]})
and the result that I would like to produce is as follows:
ID date count
0 A 2010-01_02 2
1 A 2010-11_12 1
2 B 2012-01_02 2
3 C 2011-01_02 3
4 C 2014-01_02 1
5 D 2010-07_08 2
6 D 2010_09_10 1
Do you have any ideas about how to do this? Calculating the monthly number of cases is rather stratighforward, but this issue is difficult for me. Thanks in advance!
CodePudding user response:
Use Grouper
by frequency 2 months:
d1['date'] = pd.to_datetime(d1['date'])
df = (d1.groupby(['ID', pd.Grouper(freq='2m', key='date')])
.size()
.reset_index(name='count'))
m = df['date'].dt.month
df['date'] = (df['date'].dt.year.astype(str) '-'
m.sub(1).astype(str).str.zfill(2) '_'
m.astype(str).str.zfill(2))
print (df)
ID date count
0 A 2010-01_02 2
1 A 2010-11_12 1
2 B 2012-01_02 2
3 C 2011-01_02 3
4 C 2014-01_02 1
5 D 2010-07_08 2
6 D 2010-09_10 1
CodePudding user response:
def solve(intervals): if not intervals: return 0
intervals.sort(key=lambda x: (x[0], -x[1]))
end_mx = float("-inf") ans = 0
for start, end in intervals: if end <= end_mx: ans = 1
end_mx = max(end_mx, end)
return ans
intervals = [[2, 6],[3, 4],[4, 7],[5, 5]] print(solve(intervals))