Python: Selecting days in-between with input date by user

Time:12-14

I am trying to pull some values from a Covid database. The following code works as I want, but I have a question after it:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def main():

  pd.set_option('display.max_rows', None)          
  df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

  df=df[df["Country/Region"]=="Italy"]

  df=df.drop(columns=["Province/State","Lat","Long","Country/Region"])
  # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
  df = pd.concat([df.columns.to_frame().T, df], ignore_index=True)
  df.columns = range(len(df.columns))
  df=df.T    
  df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
  df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
  df = df[(df['date'] > '11/26/21') & (df['date'] <= '12/8/21')]

  print(df)

  dati_giornalieri=list(df.nuovi_casi)
  sommatoriaitalia=(sum(dati_giornalieri)/1390000000)*100
  
  print(sommatoriaitalia)
  print(dati_giornalieri)

Now I want to extend the code so that it asks the user for the start date and the end date:

    def main():
      
      start_date=str(input("Enter starting date in format mm/dd/yy"))          
      end_date=str(input("Enter ending date in format mm/dd/yy"))             

      pd.set_option('display.max_rows', None)          
      df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

      df=df[df["Country/Region"]=="Italy"]
      df=df.drop(columns=["Province/State","Lat","Long","Country/Region"])
      # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
      df = pd.concat([df.columns.to_frame().T, df], ignore_index=True)
      df.columns = range(len(df.columns))
      df=df.T    
      df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
      df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
      df = df[(df['date'] > start_date) & (df['date'] <= end_date)]

but the line df = df[(df['date'] > start_date) & (df['date'] <= end_date)] raises an error, because it cannot compare a date to a string. I also tried parsing the input with datetime:

start_date = datetime.strptime(input('Enter Start date in the format m/d/y'), '%m/%d/%y')

but the result was still wrong: for some reason the filter seems to consider only one day per month, or something similar. Either way, the output is not what I want.

How can I solve this and select only the days in between? Thanks.

CodePudding user response:

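Convert both the user input and the date column to datetime before comparing. between(..., inclusive="right") (string values need pandas 1.3 or newer) then keeps the rows with start_date < date <= end_date, which matches the original filter: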

start_date = pd.to_datetime(start_date, format="%m/%d/%y")
end_date = pd.to_datetime(end_date, format="%m/%d/%y")
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%y")

df = df[df["date"].between(start_date, end_date, inclusive="right")]
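As a fuller illustration of the same approach, here is a minimal sketch that runs without downloading anything. The `sample` frame and its numbers are made up for illustration and only mimic the wide layout of the CSV in the question; `daily_new_cases` and `select_range` are hypothetical helper names, not part of pandas. It also shows why comparing the raw strings misbehaves: strings are compared character by character, so "12/10/21" sorts before "12/2/21".

```python
import pandas as pd

# Made-up numbers in the same wide layout as the CSV from the question
# (one row per region, one column per date). Assumption: illustration only.
sample = pd.DataFrame({
    "Province/State": [None, None],
    "Country/Region": ["Italy", "Spain"],
    "Lat": [41.87, 40.46],
    "Long": [12.56, -3.75],
    "11/30/21": [100, 200],
    "12/1/21": [110, 220],
    "12/2/21": [125, 230],
    "12/10/21": [150, 260],
})


def daily_new_cases(wide, country):
    """Hypothetical helper: one row per date with the daily increase."""
    rows = wide[wide["Country/Region"] == country]
    # Drop the metadata columns and sum over the country's rows, which
    # leaves a Series indexed by the date strings from the header.
    totals = rows.drop(
        columns=["Province/State", "Country/Region", "Lat", "Long"]
    ).sum()
    out = totals.rename("total").rename_axis("date").reset_index()
    # Parse the header strings into real datetimes before any comparison.
    out["date"] = pd.to_datetime(out["date"], format="%m/%d/%y")
    # Cumulative totals -> daily increase (the first row has no previous day).
    out["nuovi_casi"] = out["total"].diff()
    return out[["date", "nuovi_casi"]]


def select_range(df, start, end):
    """Hypothetical helper: keep rows with start < date <= end."""
    start = pd.to_datetime(start, format="%m/%d/%y")
    end = pd.to_datetime(end, format="%m/%d/%y")
    # Same rows as df["date"].between(start, end, inclusive="right").
    return df[(df["date"] > start) & (df["date"] <= end)]


# As plain strings the comparison is lexicographic, so this is True even
# though 10 December comes after 2 December.
strings_sort_wrongly = "12/10/21" <= "12/2/21"

daily = daily_new_cases(sample, "Italy")
result = select_range(daily, "11/30/21", "12/2/21")
print(result)
```

In the real script, the two strings passed to `select_range` would come from input() as in the question. A date that does not match the format makes pd.to_datetime raise a ValueError, which can be caught to ask again.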