I have a pandas dataset that is looking at the number of Covid cases over time.
I have sorted the dataset in ascending order from the first recorded date and have created a new column called 'change'.
I am unsure however how to take the data from the 'cases' column and map it onto the 'change' column such that each cell in the 'change' column represents the difference from the previous day.
For example, if on day 334 there were 14000 cases and on day 335 there were 14500 cases, in that corresponding 'change' cell I would want it to say '500'.
I have been trying things out for the past couple of hours but to no avail so have come here for some help.
I know this is wordier than I would like, but if you need any clarification let me know.
CodePudding user response:
import pandas as pd
df = pd.DataFrame({
'date': [1,2,3,4,5,6,7,8,9,10],
'cases': [100, 120, 129, 231, 243, 212, 375, 412, 440, 1]
})
df['change'] = df.cases.diff()
OUTPUT
date cases change
0 1 100 NaN
1 2 120 20.0
2 3 129 9.0
3 4 231 102.0
4 5 243 12.0
5 6 212 -31.0
6 7 375 163.0
7 8 412 37.0
8 9 440 28.0
9 10 1 -439.0