Home > Back-end >  Count rows in a dataframe where date is in the past 7 days
Count rows in a dataframe where date is in the past 7 days

Time:12-03

I have this dataframe with a "date" column in it. I want to count the rows where the date is in the past 7 days. What's the best way to do this? I feel like using an If and a counter isn't very pandas-esque.

Also, I'm importing the data from a SQL db. Should I just load it already filtered with a query? What's the most efficient way?

CodePudding user response:

My suggestion:

  1. First, calculate the interval datetimes: today and past 7 days.
import datetime

today = datetime.date.today()
past7 = today - datetime.timedelta(days=7)
  1. Use them to filter your dataframe:
df_filtered = df[(df['date'] >= past7) & (df['date'] <= today)]
  1. Get the df_filtered length:
print(len(df_filtered))

CodePudding user response:

Consider your dataframe is something like that:

df = pd.DataFrame({'date': ['2021-12-03', '2021-12-02', '2021-12-01', '2021-11-30'], 'data': [1, 2, 3, 4]})
   date           data
0  2021-12-03     1
1  2021-12-02     2
2  2021-12-01     3
3  2021-11-30     4

if you want to filter the data between dates 2021-11-30 and 2021-12-02, you can use the following command:

df_filtered = df.set_index('date').loc['2021-12-02':'2021-11-30'].reset_index()
   date           data
0  2021-12-02     2
1  2021-12-01     3
2  2021-11-30     4

In the first step, you set the date to the index and after that use .loc method to filter your desired dates. In the final step, you can count the number of rows by using the len(df_filtered)

CodePudding user response:

try:

len(df[df['date'] > datetime.datetime.now() - pd.to_timedelta("7day")])
  • Related