Home > database >  Take user input for number of months back and filter data based on that date
Take user input for number of months back and filter data based on that date

Time:08-31

I am working on a problem where I have to take user input which is an integer indicating the number of months I have to look back at. For example if I want to look at the data 3 months back I must take input from user as 3. Based on this integer user input I have to filter my dataset. For example today's date is 8/30/2022 so 3 months back will be 5/30/2022. Now I want to filter my dataframe to include only those rows for this date which is 3 months back i.e. 5/30/2022

I tried using datetime and relativetime libraries but nothing seems to work for me.

Below is an example of my dataframe:

id text1 text2 date 1 Ram patient 5/30/2022 10:22:00 2 John patient 5/30/2022 11:45:08 3 Rich child 5/28/2022 10:45:13

so I want output to be rows corresponding to 1 and 2

enter image description here

CodePudding user response:

Please try to show your attempt in future, this feels like I'm doing your homework, but hopefully, this gives you an idea.

import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3], 'text1': ['Ram', 'John', 'Rich'], 'text2': ['patient', 'patient', 'child'], 'date': ['5/30/2021 10:22:00', '5/30/2022 11:45:08', '5/28/2022 10:45:13']})

user_input = int(input("Enter the number of months to look back: "))
#convert the date column to datetime
df['date'] = pd.to_datetime(df['date'])
#set the date column as index
df = df.set_index('date')
#sort the index
df = df.sort_index()
#filter the dataframe to get the rows which are within the last x months
df = df[df.index >= pd.Timestamp.today() - pd.DateOffset(months=user_input)]
#convert the index back to a column
df.reset_index(inplace=True)
#print the filtered dataframe
print(df)

Input:

Enter the number of months to look back: 12

Output:

                 date  id text1    text2
0 2022-05-28 10:45:13   3  Rich    child
1 2022-05-30 11:45:08   2  John  patient

CodePudding user response:

You can use the DateOffset function in Pandas. See the documentation for more details.

An example of this is below which assumes you have a data frame with a date column:

num_months = int(input('Please enter the number of months to look back: '))

df['date'] = pd.to_datetime(df['date'])
past_date = pd.to_datetime('today') - pd.DateOffset(months=num_months)
df = df[df['date'] >= past_date]

print(df)

The above will filter the data frame on the date column leaving only those rows where the date is on or after the calculated date i.e. today's date minus the specified number of months.

  • Related