I currently have a table called Sales. The Sales table has a column called sale_date, which is in the form YYYY-MM-DD, and I want to extract rows where the month is within a range.
| seller_id | product_id | buyer_id | sale_date | quantity | price |
|-----------|------------|----------|--------------|----------|-------|
| 7 | 11 | 49 | '2019-01-21' | 5 | 3330 |
| 13 | 32 | 6 | '2019-02-10' | 9 | 1089 |
| 50 | 47 | 4 | '2019-01-06' | 1 | 1343 |
I've tried something like:
>>> df.loc[df['sale_date'].str.split('-').isin([1, 2, 3])]
>>> df.loc[[int(x[1]) for x in df['sale_date'].str.split('-')][1] in [1, 2, 3]]
but these raise a TypeError and a KeyError, respectively.
Is there any way that I can extract just the month from the sale_date
column and check whether it's in a range? Thanks.
CodePudding user response:
You can convert values to datetimes and then extract months:
df.loc[pd.to_datetime(df['sale_date']).dt.month.isin([1, 2, 3])]
Or adapt your own approach: select the second element of each split list by indexing with str[1], then cast it to integers:
df.loc[df['sale_date'].str.split('-').str[1].astype(int).isin([1, 2, 3])]
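For reference, here is a minimal runnable sketch of both approaches on the three sample rows from the question. It assumes sale_date holds plain strings without the surrounding quotes shown in the table, and the names months_wanted, by_datetime, by_string and in_range are made up for illustration. The last line uses Series.between as one possible way to test a contiguous month range instead of listing every month.

```python
import pandas as pd

# Sample rows copied from the question's table
# (assumption: the dates are plain strings without quote characters)
df = pd.DataFrame({
    "seller_id": [7, 13, 50],
    "product_id": [11, 32, 47],
    "buyer_id": [49, 6, 4],
    "sale_date": ["2019-01-21", "2019-02-10", "2019-01-06"],
    "quantity": [5, 9, 1],
    "price": [3330, 1089, 1343],
})

months_wanted = [1]  # hypothetical filter: January only

# Approach 1: parse the strings to datetimes, then read the month number
by_datetime = df.loc[pd.to_datetime(df["sale_date"]).dt.month.isin(months_wanted)]

# Approach 2: split on '-', take the second piece of each list, cast to int
by_string = df.loc[
    df["sale_date"].str.split("-").str[1].astype(int).isin(months_wanted)
]

# Contiguous range: between() is inclusive on both ends by default
in_range = df.loc[pd.to_datetime(df["sale_date"]).dt.month.between(1, 3)]

print(by_datetime)
```

The original attempts fail because str.split returns a whole list per row, so isin is comparing lists rather than month numbers, and the list comprehension followed by [1] in [...] collapses to a single boolean, which df.loc then treats as a label.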