PANDAS: math divide in dataframe between the rows that are separated by 7 spaces

Time:10-31

I have the following code that loads data from GitHub:

import pandas as pd

df = pd.read_json('https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-regioni.json', convert_dates=['data'])
df.index = df['data']
df.index = df.index.normalize()
df = df[df["denominazione_regione"] == 'Veneto']

After that (and dropping some other columns), the df looks like this ("data" means "date" and "totale_positivi" means "total positive"):

data              totale_positivi
2021-09-18        2
2021-09-19        5
2021-09-20        10
2021-09-21        20
2021-09-22        30
2021-09-23        40
2021-09-24        50   
2021-09-25        60 
2021-09-27        80
2021-09-28        100

Now I need to transform this dataframe into another one that holds, for every date, the ratio between that date's value and the value 7 rows (days) before it, working back from the LATEST value as shown below. If a row has no earlier value to divide by, its ratio should simply be set to 1:

data              totale_positivi
2021-09-18        1
2021-09-19        1   <--- this has no value to do the ratio, so =1 by default
2021-09-20        1   <--- this has no value to do the ratio, so =1 by default
2021-09-21        1   <--- this has no value to do the ratio, so =1 by default
2021-09-22        1   <--- this has no value to do the ratio, so =1 by default
2021-09-23        1   <--- this has no value to do the ratio, so =1 by default
2021-09-24        1   <--- this has no value to do the ratio, so =1 by default
2021-09-25        30   <--- this is 60/2 (2 is 7 days before 60)
2021-09-27        16   <--- this is 80/5 (5 is 7 days before 80)
2021-09-28        10  <--- this is 100/10 (10 is 7 days before 100)

I tried this:

cera = len(list(df['totale_positivi']))
for i in range(0, cera):
    while (i > 7):
        df.loc['totale_positivi'] = df.loc[cera-i] / df.loc[cera-i-7]

But it doesn't work. I also tried this:

df['totale_positivi']=df['totale_positivi'].div(periods=7)

But that doesn't work either.

How can I solve this? Thanks.

CodePudding user response:

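Use shift(7) to line each row up with the value 7 rows earlier, divide, and fill the missing ratios with 1: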

df['totale_positivi'] = (df['totale_positivi'] / df['totale_positivi'].shift(7)).fillna(1)
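
To check the one-liner against the numbers in the question, here is a minimal sketch that rebuilds the sample table by hand instead of downloading the JSON. The "ratio" column name is only an illustrative assumption; the answer above overwrites "totale_positivi" in place.

```python
import pandas as pd

# Sample data copied from the question (note that 2021-09-26 is missing).
dates = pd.to_datetime([
    "2021-09-18", "2021-09-19", "2021-09-20", "2021-09-21", "2021-09-22",
    "2021-09-23", "2021-09-24", "2021-09-25", "2021-09-27", "2021-09-28",
])
df = pd.DataFrame(
    {"totale_positivi": [2, 5, 10, 20, 30, 40, 50, 60, 80, 100]},
    index=dates,
)
df.index.name = "data"

# shift(7) moves every value down by 7 rows, so each row is divided by
# the value 7 rows above it; the first 7 rows have no partner and become NaN.
ratio = df["totale_positivi"] / df["totale_positivi"].shift(7)

# Replace the NaN ratios with the default value 1.
# "ratio" is a hypothetical column name used here to keep the input visible.
df["ratio"] = ratio.fillna(1)

print(df)
```

Keep in mind that shift(7) counts rows, not calendar days. Because 2021-09-26 is missing, the row for 2021-09-27 is divided by the value from 2021-09-19, which is what the expected output in the question asks for. If a strict 7-day calendar lag were needed, a different approach would be required, for example shifting the index by a frequency.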