I have a DataFrame with the daily quantity levels of various articles. I want a DataFrame with the quantity level of each article on the last day of every month.
Original df:
item | Date | Quantity |
---|---|---|
apple | 23/09/21 | 2143 |
bat | 21/09/2021 | 2444 |
cola | 15/09/21 | 1512 |
apple | 21/10/21 | 2906 |
bat | 4/10/21 | 2730 |
cola | 16/10/21 | 2449 |
cola | 31/12/2021 | 0 |
apple | 27/12/2021 | 1086 |
bat | 25/12/2021 | 1186 |
apple | 26/12/2021 | 1377 |
Target df:
item | Date | Quantity |
---|---|---|
cola | 31/12/2021 | 0 |
apple | 27/12/2021 | 1086 |
bat | 25/12/2021 | 1186 |
Is there any way to obtain it?
I tried grouping by item and date and then calling tail(), but it didn't work.
CodePudding user response:
IIUC, you need the last values per year; use Grouper with GroupBy.tail:
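import pandas as pd

# Parse the day-first date strings, then keep the last row of each (item, year) group.
# 'Y' is the year-end alias (spelled 'YE' in pandas 2.2+).
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df = df.groupby(['item', pd.Grouper(freq='Y', key='Date')]).tail(1)
print(df)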
item Date Quantity
6 cola 2021-12-31 0
8 bat 2021-12-25 1186
9 apple 2021-12-26 1377
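Note that tail(1) keeps the last row of each group in the frame's current order, so the output above has apple at 2021-12-26 (1377) while the Target df has 27/12/2021 (1086). A minimal sketch of one way to get the latest date instead is to sort by Date before grouping. The sample frame below is retyped from the question with four-digit years throughout, which is an assumption made so that a single explicit format parses every row, and the names df_sorted and last_per_year are only illustrative.

```python
import pandas as pd

# Sample data retyped from the question; years normalised to four digits (assumption).
df = pd.DataFrame({
    "item": ["apple", "bat", "cola", "apple", "bat",
             "cola", "cola", "apple", "bat", "apple"],
    "Date": ["23/09/2021", "21/09/2021", "15/09/2021", "21/10/2021", "04/10/2021",
             "16/10/2021", "31/12/2021", "27/12/2021", "25/12/2021", "26/12/2021"],
    "Quantity": [2143, 2444, 1512, 2906, 2730, 2449, 0, 1086, 1186, 1377],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Sort chronologically so that "last row of the group" means "latest date".
df_sorted = df.sort_values("Date")

# Group by item and calendar year, then keep the final (latest) row of each group.
last_per_year = df_sorted.groupby(["item", df_sorted["Date"].dt.year]).tail(1)
print(last_per_year)
```

With this ordering, apple resolves to the 27 December row, which matches the Target df.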
because grouping per month gives a different output:
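# Starting again from the original df: keep the last row of each (item, month) group.
# 'M' is the month-end alias (spelled 'ME' in pandas 2.2+).
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df = df.groupby(['item', pd.Grouper(freq='M', key='Date')]).tail(1)
print(df)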
item Date Quantity
0 apple 2021-09-23 2143
1 bat 2021-09-21 2444
2 cola 2021-09-15 1512
3 apple 2021-10-21 2906
4 bat 2021-10-04 2730
5 cola 2021-10-16 2449
6 cola 2021-12-31 0
8 bat 2021-12-25 1186
9 apple 2021-12-26 1377
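If the goal is the latest available row per item in each month regardless of row order, one possible alternative is to pick the index of the maximum Date in each group with idxmax. This is only a sketch: the sample frame is retyped from the question with four-digit years (an assumption, so one explicit format parses every row), and the names month, idx and last_per_month are illustrative.

```python
import pandas as pd

# Sample data retyped from the question; years normalised to four digits (assumption).
df = pd.DataFrame({
    "item": ["apple", "bat", "cola", "apple", "bat",
             "cola", "cola", "apple", "bat", "apple"],
    "Date": ["23/09/2021", "21/09/2021", "15/09/2021", "21/10/2021", "04/10/2021",
             "16/10/2021", "31/12/2021", "27/12/2021", "25/12/2021", "26/12/2021"],
    "Quantity": [2143, 2444, 1512, 2906, 2730, 2449, 0, 1086, 1186, 1377],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Monthly period label for each row, renamed to avoid clashing with the Date column.
month = df["Date"].dt.to_period("M").rename("month")

# For every (item, month) group, find the index label of the row with the latest date.
idx = df.groupby(["item", month])["Date"].idxmax()

# Select those rows and order them for readability.
last_per_month = df.loc[idx].sort_values(["Date", "item"]).reset_index(drop=True)
print(last_per_month)
```

Because idxmax compares the dates themselves, the December row for apple is the one dated 27/12/2021 even though it appears before the 26/12/2021 row in the input.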