I have a pandas dataset that tracks the number of cases, n, of an instance over time.
I have sorted the dataset in ascending order from the first recorded date and have created a new column called 'change'.
However, I am unsure how to take the data from column n and map it onto the 'change' column so that each cell in 'change' holds the difference from the previous day.
For example, if there were n = 14000 cases on day 334 and n = 14500 cases on day 335, I would want the 'change' cell for day 335 to say '500'.
I have been trying things out for the past couple of hours to no avail, so I have come here for some help.
I know this is wordier than I would like, but if you need any clarification let me know.
CodePudding user response:
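import pandas as pd

# Sample data: one row per day, already sorted by date
df = pd.DataFrame({
    'date': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'cases': [100, 120, 129, 231, 243, 212, 375, 412, 440, 1]
})

# diff() gives each row minus the previous row; the first row becomes NaN
df['change'] = df['cases'].diff()
print(df)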
OUTPUT
   date  cases  change
0     1    100     NaN
1     2    120    20.0
2     3    129     9.0
3     4    231   102.0
4     5    243    12.0
5     6    212   -31.0
6     7    375   163.0
7     8    412    37.0
8     9    440    28.0
9    10      1  -439.0
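
Applied to the question's own column name, the same idea might look like the sketch below. It assumes the count column is called 'n' and the date column is called 'day' (a hypothetical name), and the day-333 value is invented purely for illustration. Because diff() compares each row with the row directly above it, the frame is sorted first. Filling the first row's NaN with 0 is a judgment call that lets the column be stored as integers, so the day-335 cell reads 500 rather than 500.0.

```python
import pandas as pd

# Hypothetical frame: 'n' is the question's column name, while the 'day'
# column name and the day-333 value are assumptions made for illustration.
df = pd.DataFrame({
    'day': [335, 333, 334],
    'n': [14500, 13800, 14000],
})

# diff() works on row order, so sort by day before taking differences.
df = df.sort_values('day').reset_index(drop=True)

# The first row has no previous day, so diff() yields NaN there.
# Replacing it with 0 (an assumption) allows an integer column.
df['change'] = df['n'].diff().fillna(0).astype(int)

# Equivalent formulation: subtract the previous row's value explicitly.
manual = (df['n'] - df['n'].shift(1)).fillna(0).astype(int)

print(df)
```

If the first day's change should stay unknown rather than 0, dropping the fillna(0).astype(int) step keeps the NaN, at the cost of the column being stored as floats.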