Is there a way to know which values are missing in a given incomplete date when using dateutil parse?

Time:11-21

I am working on a feature that generates a time period from a date specified by the user. The date can be incomplete.

  1. If it is a complete date (xxxx-xx-xx), the period should be xxxx-xx-xx 0:00 to xxxx-xx-xx 23:59.
  2. If it is an incomplete date with only year and month (xxxx-xx), the period should be xxxx-xx-01 0:00 to the last day of that month at 23:59 (for example xxxx-xx-31 23:59).
  3. If it is an incomplete date with only the year (xxxx), the period should be xxxx-01-01 0:00 to xxxx-12-31 23:59.

User input can be in any format.

I am using the super cool dateutil parser (dateutil.parser.parse), which captures date information smoothly. It also accepts incomplete dates and fills in whatever is missing from a default value. For example:

import datetime
import dateutil.parser as dparser

out_date = dparser.parse("2021", default=datetime.datetime(2021, 1, 1))
# out: 2021-01-01 00:00:00
out_date = dparser.parse("2021 june", default=datetime.datetime(2021, 1, 1))
# out: 2021-06-01 00:00:00
out_date = dparser.parse("2021 sep 17", default=datetime.datetime(2021, 1, 1))
# out: 2021-09-17 00:00:00

Since I need to return a period, I want to distinguish between inputs such as "2021" and "2021-01-01", which produce the same output with the method above.

Basically, I want to capture which values are missing so that I can handle them as I wish. For example:

input = "2021" > output = (2021, false, false)
input = "2021 jan" > output = (2021, 01, false)
input = "2021 jan 15" > output = (2021, 01, 15)

Any help would be highly appreciated.

PS: inputs do not follow a fixed structure; they can look like "2021/01/01", "January 15, 2021", or "January 2021".

CodePudding user response:

If this is what you need:

input = "2021" > output = (2021, false, false)
input = "2021 jan" > output = (2021, 01, false)
input = "2021 jan 15" > output = (2021, 01, 15)

Then this should work:

from datetime import datetime

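def process_date_str(date_str: str) -> tuple:
    # Assumes whitespace-separated input in "year [month [day]]" order,
    # with the month given as an abbreviated name such as "jan".
    tokens = date_str.split()

    if len(tokens) == 1:
        year = tokens[0]
        return year, False, False

    if len(tokens) == 2:
        year, month = tokens
        month = str(datetime.strptime(month, "%b").month).zfill(2)
        return year, month, False

    if len(tokens) == 3:
        year, month, day = tokens
        month = str(datetime.strptime(month, "%b").month).zfill(2)
        return year, month, day

    # Any other shape (empty string, more than three tokens) is unsupported.
    raise ValueError(f"unsupported date string: {date_str!r}")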


if __name__ == "__main__":
    year_only = "2021"
    year_month = "2021 jan"
    year_month_day = "2021 jan 15"

    date_strs = [year_only, year_month, year_month_day]

    for date_str in date_strs:
        print(process_date_str(date_str))

That gives me:

(.venv) ➜  date-parser python main.py
('2021', False, False)
('2021', '01', False)
('2021', '01', '15')

I left everything as a str, but you could convert if need be. It was hard to tell the desired data type from your example.

CodePudding user response:

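The following trick worked for me: parse the input twice with two defaults that differ in year, month and day. Any field that comes out the same in both results must have come from the input rather than from the default. I hope this helps anyone facing a similar scenario.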

import datetime
import dateutil.parser as dparser

def date_checker(date_input):
    # 'false' marks a field that was not present in the input.
    out = ['false', 'false', 'false']
    # Parse twice with defaults that differ in year, month and day.
    out_check_a = dparser.parse(date_input, default=datetime.datetime(2020, 1, 1))
    out_check_b = dparser.parse(date_input, default=datetime.datetime(2021, 12, 31))
    # A field that matches in both results came from the input, not the default.
    if out_check_a.year == out_check_b.year:
        out[0] = out_check_a.year
    if out_check_a.month == out_check_b.month:
        out[1] = out_check_a.month
    if out_check_a.day == out_check_b.day:
        out[2] = out_check_a.day
    return out


print(date_checker('2021')) #output [2021, 'false', 'false']
print(date_checker('2021 jan')) #output [2021, 1, 'false']
print(date_checker('2021 jan 15')) #output [2021, 1, 15]
print(date_checker('2021/01/01')) #output [2021, 1, 1]
print(date_checker('January 15, 2021')) #output [2021, 1, 15]
print(date_checker('January 2021')) #output [2021, 1, 'false']
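
To get from this list to the period asked for in the question, something like the following could work. It is only a sketch: to_period and MISSING are names made up here for illustration, it expects the [year, month, day] list in the shape date_checker returns (with 'false' for a missing field), and it assumes that a missing month means "year only" even if a day happens to be present. The last day of the month comes from calendar.monthrange, so February and 30-day months are handled.

```python
import calendar
import datetime

# Placeholder that date_checker uses for a field missing from the input.
MISSING = 'false'


def to_period(parts):
    """Turn [year, month, day] into a (start, end) pair of datetimes."""
    year, month, day = parts
    if year == MISSING:
        raise ValueError("a year is required to build a period")

    if month == MISSING:
        # Year only: the whole calendar year.
        first = datetime.date(year, 1, 1)
        last = datetime.date(year, 12, 31)
    elif day == MISSING:
        # Year and month: first to last day of that month.
        last_day = calendar.monthrange(year, month)[1]
        first = datetime.date(year, month, 1)
        last = datetime.date(year, month, last_day)
    else:
        # Complete date: a single day.
        first = last = datetime.date(year, month, day)

    start = datetime.datetime.combine(first, datetime.time(0, 0))
    end = datetime.datetime.combine(last, datetime.time(23, 59))
    return start, end


print(to_period([2021, 'false', 'false']))
print(to_period([2021, 2, 'false']))
print(to_period([2021, 9, 17]))
```

You would call it as to_period(date_checker(user_input)). If you switch the placeholder to None, change MISSING to match.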