How to change a string into a date using .query() in DataFrame?

Time:08-04

I have a DataFrame with a "date" column in datetime format. Is it possible to query the DataFrame for a specific date given as a string?

I'm trying this:

df.query("date == '2018-12-07'.strftime("%Y-%m-%d"))

But it doesn't seem to work.

Thank you in advance for your help.

CodePudding user response:

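import numpy as np
import pandas as pd

rng = np.random.default_rng()  # random generator for the sample values

df = pd.DataFrame({
    'date': pd.date_range(start='2022-01-01', periods=20, freq='D'),
    'value': rng.integers(10, 100, 20)
})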
df
###
         date  value
0  2022-01-01     24
1  2022-01-02     78
2  2022-01-03     73
3  2022-01-04     41
4  2022-01-05     16
5  2022-01-06     97
6  2022-01-07     50
7  2022-01-08     90
8  2022-01-09     71
9  2022-01-10     80
10 2022-01-11     78
11 2022-01-12     27
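12 2022-01-13     42
13 2022-01-14     52
14 2022-01-15     54
15 2022-01-16     13
16 2022-01-17     59
17 2022-01-18     23
18 2022-01-19     76
19 2022-01-20     71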
df.dtypes
###
date     datetime64[ns]
value             int64
dtype: object

Because the column is already datetime64[ns], query() can compare it directly against a date string; there is no need to specify a format.

df.query('date >= "2022-01-10"')
###
         date  value
9  2022-01-10     80
10 2022-01-11     78
11 2022-01-12     27
12 2022-01-13     42
13 2022-01-14     52
14 2022-01-15     54
15 2022-01-16     13
16 2022-01-17     59
17 2022-01-18     23
18 2022-01-19     76
19 2022-01-20     71
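
For the exact-date case from the question, the same idea should carry over to `==`. The following is a minimal sketch, not the answerer's code: the frame is a stand-in with made-up values, and the names `exact`, `exact_via_var`, `target` and `window` are hypothetical.

```python
import pandas as pd

# Hypothetical sample frame mirroring the one above (values are made up).
df = pd.DataFrame({
    "date": pd.date_range(start="2022-01-01", periods=20, freq="D"),
    "value": range(20),
})

# Exact match: the string literal is compared against the datetime column.
exact = df.query("date == '2022-01-10'")

# Same match using a local variable, referenced inside the query with @.
target = pd.Timestamp("2022-01-10")
exact_via_var = df.query("date == @target")

# A date window, inclusive on both ends.
window = df.query("date >= '2022-01-10' and date <= '2022-01-12'")
```

The `@` form is handy when the date comes from a variable rather than being typed into the query string.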

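If the dates are stored as strings (object dtype), the same comparisons still work, because ISO-formatted strings (YYYY-MM-DD) sort in chronological order: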

df2 = pd.DataFrame({
    'date':['2022-01-10','2022-01-11','2022-01-12', '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16', '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20'],
    'value':[10,20,30,40,50,60,70,80,90,100,110]
})
df2
###
          date  value
0   2022-01-10     10
1   2022-01-11     20
2   2022-01-12     30
3   2022-01-13     40
4   2022-01-14     50
5   2022-01-15     60
6   2022-01-16     70
7   2022-01-17     80
8   2022-01-18     90
9   2022-01-19    100
10  2022-01-20    110
df2.dtypes
###
date     object
value     int64
dtype: object
df2.query('date >= "2022-01-13"')
###
          date  value
3   2022-01-13     40
4   2022-01-14     50
5   2022-01-15     60
6   2022-01-16     70
7   2022-01-17     80
8   2022-01-18     90
9   2022-01-19    100
10  2022-01-20    110
df2.query('date == "2022-01-13"')
###
         date  value
3  2022-01-13     40
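# str.format() with no placeholders returns the string unchanged,
# so this is equivalent to the plain comparison above
df2.query('date == "2022-01-15".format("%Y-%m-%d")')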
###
         date  value
5  2022-01-15     60
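
One caveat with string columns: the comparison is done on text, so `>=` only matches chronological order when the strings are in year-month-day form. A small sketch of what can go wrong otherwise, assuming a hypothetical frame `df3` with day-first strings:

```python
import pandas as pd

# Hypothetical frame with day-first date strings (not ISO ordered).
df3 = pd.DataFrame({
    "date": ["09-02-2022", "10-01-2022", "11-01-2022"],
    "value": [1, 2, 3],
})

# Text comparison goes character by character, so "09-02-2022" sorts before
# "10-01-2022" even though 9 February is later than 10 January.
as_text = df3.query('date >= "10-01-2022"')

# Parsing to datetime first makes the comparison chronological.
df3["date"] = pd.to_datetime(df3["date"], format="%d-%m-%Y")
as_dates = df3.query('date >= "2022-01-10"')
```

Here `as_text` misses the February row, while `as_dates` keeps all three rows.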

CodePudding user response:

Given the comments, I suggest using pd.to_datetime(), which converts the column to datetime dtype. For example:

df['date'] = pd.to_datetime(df['date'])  # infer_datetime_format is deprecated since pandas 2.0; the format is inferred by default
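
Put together with query(), the conversion might look like the sketch below. The frame and the name `match` are hypothetical, and passing an explicit `format` is a choice made here for illustration rather than something the answer requires.

```python
import pandas as pd

# Hypothetical frame whose dates start out as strings.
df2 = pd.DataFrame({
    "date": ["2022-01-13", "2022-01-14", "2022-01-15"],
    "value": [40, 50, 60],
})

# Convert once; an explicit format avoids any guessing by the parser.
df2["date"] = pd.to_datetime(df2["date"], format="%Y-%m-%d")

# The column now has a datetime dtype, so the query compares real dates.
match = df2.query("date == '2022-01-15'")
```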