Hi all, this may be a simple task, but I cannot figure out how to write it. I have the following dataframe:
import pandas as pd

df = pd.read_json('https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province.json',
                  convert_dates=['data'])
df.index = df['data']
df.index = df.index.normalize()
# keep only the rows for the province of Verona; copy to avoid SettingWithCopyWarning
df = df[df["sigla_provincia"] == 'VR'].copy()
df['totale_casi'] = df['totale_casi'] + 1
ts = df[['totale_casi']].dropna()
sts = ts.totale_casi
I understand that writing "df['totale_casi'] = df['totale_casi'] + 1" simply adds 1 to every value in the 'totale_casi' column; that part is simple.
But if you look at the URL GITHUBLINK, you can see that for every province of Italy it has, for every day, the TOTAL number of COVID cases (the target province is Verona, by the way). That is good, but I want to build a dataframe that contains, for every day, the difference between today's 'totale_casi' and yesterday's 'totale_casi', something like this (pseudocode):
df['totale_casi'] = df['totale_casi'][today] - df['totale_casi'][yesterday]
for each day in the JSON. How can I solve this? Many thanks in advance.
CodePudding user response:
df['totale_casi'] = df['totale_casi'].diff(periods=1)
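To show what that line does, here is a minimal sketch on made-up numbers (the `toy` frame and its values are invented for illustration, not taken from the real feed). It assumes one row per day, already in date order:

```python
import pandas as pd

# Made-up cumulative totals for four consecutive days (not real data).
toy = pd.DataFrame(
    {"totale_casi": [10, 15, 15, 22]},
    index=pd.date_range("2020-03-01", periods=4, freq="D"),
)

# diff(periods=1) subtracts the previous row's value from each row.
daily = toy["totale_casi"].diff(periods=1)

# The same calculation written out by hand with shift().
by_hand = toy["totale_casi"] - toy["totale_casi"].shift(1)

print(daily)
```

The first day has no previous row to subtract, so its result is NaN.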
CodePudding user response:
From what I can see, the dataset is already sorted by date, so I think the simplest solution is:
df['totale_casi'] = df['totale_casi'].diff()
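If you are not sure the rows are in date order, or you would rather keep the cumulative total instead of overwriting it, something along these lines might work. This is only a sketch on made-up rows; `nuovi_casi` is a column name I invented for the example, not one that exists in the dataset:

```python
import pandas as pd

# Made-up rows, deliberately out of date order (not real data).
toy = pd.DataFrame({
    "data": pd.to_datetime(["2020-03-03", "2020-03-01", "2020-03-02"]),
    "totale_casi": [18, 10, 14],
})

# Sort by date so each row is compared with the previous day.
toy = toy.sort_values("data").set_index("data")

# 'nuovi_casi' is a hypothetical column name; 'totale_casi' stays untouched.
toy["nuovi_casi"] = toy["totale_casi"].diff()

# The first day has no previous value, so diff() leaves NaN there; drop it.
new_cases = toy["nuovi_casi"].dropna()

print(new_cases)
```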
I hope I have helped you!