Is there a way to know which values are missing in a given incomplete date when using dateutil parse?

I am working on a feature that generates a period (a start and an end datetime) once the user specifies a date, which may be incomplete.

If it is a complete date (xxxx-xx-xx), the period should be xxxx-xx-xx 0:00 to xxxx-xx-xx 23:59.
If it is an incomplete date with only year and month (xxxx-xx), the period should be xxxx-xx-01 0:00 to xxxx-xx-31 23:59.
If it is an incomplete date with only the year (xxxx), the period should be xxxx-01-01 0:00 to xxxx-12-31 23:59.
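
To make the three rules above concrete, here is a minimal sketch using only the standard library. The function name period_for and the use of None for a missing field are assumptions made for illustration, not part of the question. It also takes the month's last day from calendar.monthrange rather than hard-coding 31, since not every month has 31 days.

```python
import calendar
from datetime import datetime
from typing import Optional, Tuple


def period_for(year: int, month: Optional[int] = None,
               day: Optional[int] = None) -> Tuple[datetime, datetime]:
    """Return (start, end) covering a whole year, month, or single day."""
    if day is not None and month is None:
        raise ValueError("a day without a month is ambiguous")

    # A missing month widens the period to January..December.
    first_month = month if month is not None else 1
    last_month = month if month is not None else 12

    if day is not None:
        first_day = last_day = day
    else:
        first_day = 1
        # monthrange returns (weekday of the 1st, number of days in the month).
        last_day = calendar.monthrange(year, last_month)[1]

    start = datetime(year, first_month, first_day, 0, 0)
    end = datetime(year, last_month, last_day, 23, 59)
    return start, end


print(period_for(2021))
print(period_for(2021, 2))
print(period_for(2021, 9, 17))
```

With this shape, the remaining problem is only to find out which of year, month, and day the user actually supplied.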

User input can be in any format.

I am using dateutil's parser.parse, which captures date information smoothly. It also accepts incomplete dates and fills in whatever is missing from a default value. For example:

import datetime
import dateutil.parser as dparser

out_date = dparser.parse("2021", default=datetime.datetime(2021, 1, 1))
#out  2021-01-01 00:00:00
out_date = dparser.parse("2021 june", default=datetime.datetime(2021, 1, 1))
#out  2021-06-01 00:00:00
out_date = dparser.parse("2021 sep 17", default=datetime.datetime(2021, 1, 1))
#out  2021-09-17 00:00:00

Because I need to return a period, I want to distinguish between the inputs "2021" and "2021-01-01", which produce the same output with the method above.

Basically, I want to capture which values are missing so that I can handle them as I wish. For example:

input = "2021" > output = (2021, false, false)
input = "2021 jan" > output = (2021, 01, false)
input = "2021 jan 15" > output = (2021, 01, 15)

Any help would be highly appreciated.

PS: inputs can come in other formats, such as "2021/01/01", "January 15, 2021", "January 2021"

CodePudding user response：

If this is what you need:

input = "2021" > output = (2021, false, false)
input = "2021 jan" > output = (2021, 01, false)
input = "2021 jan 15" > output = (2021, 01, 15)

Then this should work:

from datetime import datetime

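def process_date_str(date_str: str) -> tuple:
    # Assumes whitespace-separated input in "year [month [day]]" order,
    # with an abbreviated month name such as "jan".
    tokens = date_str.split()

    if len(tokens) == 1:
        year = tokens[0]
        return year, False, False

    if len(tokens) == 2:
        year, month = tokens
        # Convert the month abbreviation to a zero-padded number ("jan" -> "01").
        month = str(datetime.strptime(month, "%b").month).zfill(2)
        return year, month, False

    if len(tokens) == 3:
        year, month, day = tokens
        month = str(datetime.strptime(month, "%b").month).zfill(2)
        return year, month, day

    # Anything else is outside the formats this function handles.
    raise ValueError(f"Unsupported date string: {date_str!r}")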


if __name__ == "__main__":
    year_only = "2021"
    year_month = "2021 jan"
    year_month_day = "2021 jan 15"

    date_strs = [year_only, year_month, year_month_day]

    for date_str in date_strs:
        print(process_date_str(date_str))

That gives me:

(.venv) ➜  date-parser python main.py
('2021', False, False)
('2021', '01', False)
('2021', '01', '15')

I left everything as a str, but you could convert if need be. It was hard to tell the desired data type from your example.

CodePudding user response：

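The following trick worked for me: parse the input twice, with two defaults that differ in year, month, and day. Any field that is the same in both results must have come from the input; any field that differs was filled in from the default. I hope this helps anyone in the same situation.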

import datetime
import dateutil.parser as dparser

def date_checker(date_input):
    # One slot each for year, month, day; 'false' marks a missing value.
    out_list = ['false', 'false', 'false']
    # The two defaults differ in every field, so a field that matches in
    # both results must have been supplied by the input itself.
    out_check_a = dparser.parse(date_input, default=datetime.datetime(2020, 1, 1))
    out_check_b = dparser.parse(date_input, default=datetime.datetime(2021, 12, 31))
    if out_check_a.year == out_check_b.year:
        out_list[0] = out_check_a.year
    if out_check_a.month == out_check_b.month:
        out_list[1] = out_check_a.month
    if out_check_a.day == out_check_b.day:
        out_list[2] = out_check_a.day
    return out_list


print(date_checker('2021'))  # output [2021, 'false', 'false']
print(date_checker('2021 jan'))  # output [2021, 1, 'false']
print(date_checker('2021 jan 15'))  # output [2021, 1, 15]
print(date_checker('2021/01/01'))  # output [2021, 1, 1]
print(date_checker('January 15, 2021'))  # output [2021, 1, 15]
print(date_checker('January 2021'))  # output [2021, 1, 'false']
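
If the end goal is the period itself rather than the list of missing fields, the same two-parse idea might be pushed one step further. The sketch below is a variation, not the method above: it parses once with the earliest possible default and once with the latest. It assumes that dateutil clamps a default day of 31 to the real last day of the parsed month, which recent releases appear to do but is worth verifying on your installed version. The name period_from_text and the ref_year parameter are hypothetical; ref_year only matters when the input contains no year.

```python
from datetime import datetime

from dateutil import parser as dparser


def period_from_text(text, ref_year=2021):
    """Return (start, end) for a possibly incomplete date string."""
    # Missing fields fall back to the earliest possible values...
    start = dparser.parse(text, default=datetime(ref_year, 1, 1, 0, 0))
    # ...and to the latest possible values. Day 31 is assumed to be
    # clamped by dateutil to the last valid day of the resulting month.
    end = dparser.parse(text, default=datetime(ref_year, 12, 31, 23, 59))
    return start, end


for text in ("2021", "2021 feb", "2021 sep 17"):
    start, end = period_from_text(text)
    print(f"{text!r}: {start:%Y-%m-%d %H:%M} to {end:%Y-%m-%d %H:%M}")
```

Note that an input which already carries a time of day would keep that time in both results, so this sketch only fits date-only inputs like the ones in the question.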