How to convert a binary datetime from SQL Server to a human-readable date and time with Python

I have a value in a raw binary file (part of a database) and I want to convert it in Python to a format that a human being can read. This is part of a forensic carving procedure. I can convert the 8-byte values using this SQL statement (you will see one date in GMT+2 and one in GMT)

SELECT CAST(0x0000ae9401039c4a AS datetime), CAST(0x0000ae9400e2a6ca AS datetime)

which returns

2022-05-13 15:45:12.780 2022-05-13 13:45:12.780

I have tried to convert the binary value with DCODE v5.5 (

CodePudding user response：

Short Version

The legacy datetime type stores dates as a 64-bit floating point offset from 1900-01-01

import datetime
import struct

# raw_bytes: the 8 bytes read from the file
floatValue = struct.unpack('<d', raw_bytes)[0]
OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)

Newer types don't use that format though.

Excel-handling libraries such as openpyxl offer functions that convert OA/serial dates, for example openpyxl.utils.datetime.from_excel.

Long Explanation

The legacy datetime type in SQL Server uses the OLE Automation Date storage format that was also used in Excel, Visual Basic and all desktop applications that supported OLE/COM Automation in the late 1990s and early 2000s, before macro viruses. This is a 64-bit floating point number (called a Serial date in Excel) whose integral part is an offset since 1899-12-30 and fractional part is the time of day. Except when it's 1899-12-31 or 1900-01-01.

Back when Excel was released, Lotus 1-2-3 was the most popular spreadsheet and a de facto standard, and it incorrectly considered 1900 a leap year. To ensure compatibility, Excel adopted the same bug. VBA tried both to fix the bug and to ensure formulas produced the same results as Excel and Lotus, so it uses 1899-12-30 as its base.
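As a rough sketch of the 1899-12-30 base described above, this is how an OLE Automation (serial) date could be turned into a Python datetime. The function name from_ole_date is hypothetical, and the sketch only considers non-negative serial values:

```python
from datetime import datetime, timedelta

# Base date used by OLE Automation / VBA serial dates
OLE_AUTOMATION_EPOCH = datetime(1899, 12, 30)

def from_ole_date(serial):
    """Convert a non-negative OLE Automation date (days as a float) to a datetime."""
    # The integral part is whole days, the fractional part is the time of day
    return OLE_AUTOMATION_EPOCH + timedelta(days=serial)

print(from_ole_date(2.0))      # 1900-01-01 00:00:00
print(from_ole_date(44694.5))  # 2022-05-13 12:00:00
```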

The SQL Server team didn't care about the bug and used the logical 1900-01-01 instead.

Essentially, this value is a timedelta. In Python, you can convert this float to a timedelta by passing it as the days parameter to timedelta, and add it to the base 1900-01-01:

import datetime

OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)

To get a 64-bit float from an array of bytes you can use struct.unpack with the appropriate format string. A 64-bit float is actually a double:

import struct
floatValue = struct.unpack('<d', raw_bytes)[0]  # raw_bytes: the 8 bytes read from the file

WARNING

datetime is a legacy type. The types introduced in SQL Server 2008 (date, time, datetime2 and datetimeoffset) have a different storage format. datetime2 and datetimeoffset have variable precision and variable size.

CodePudding user response：

For future reference, I was finally able to find the data I needed in this post: https://www.faqcode4u.com/faq/108331/what-is-the-internal-representation-of-datetime-in-sql-server

The details are supposedly opaque, but most resources (1), (2) that I've found on the web state the following:

The first 4 bytes store the number of days since SQL Server's epoch (1st Jan 1900), and the second 4 bytes store the number of ticks after midnight, where a "tick" is 1/300 of a second (about 3.33 milliseconds).

The first four bytes are signed (can be positive or negative), which explains why dates earlier than the epoch can be represented.
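To make that layout concrete, here is a small sketch that splits the hex literal from the question, as it is written in T-SQL, into its two 4-byte fields. The helper name split_datetime_literal is hypothetical, and the sketch assumes the literal is always a full 8 bytes:

```python
def split_datetime_literal(hex_literal):
    """Split a T-SQL datetime hex literal into (days, ticks)."""
    # int() accepts the leading "0x"; to_bytes restores the 8 bytes as written
    raw = int(hex_literal, 16).to_bytes(8, byteorder="big")
    # First 4 bytes: signed day count since 1900-01-01
    days = int.from_bytes(raw[:4], byteorder="big", signed=True)
    # Last 4 bytes: ticks of 1/300 second since midnight
    ticks = int.from_bytes(raw[4:], byteorder="big", signed=False)
    return days, ticks

days, ticks = split_datetime_literal("0x0000ae9401039c4a")
print(days, ticks)  # 44692 17013834
```

17013834 ticks divided by 300 is 56712.78 seconds after midnight, which is 15:45:12.780, and 44692 days after 1900-01-01 is 2022-05-13, matching the first value returned by the SQL query in the question.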

https://docs.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?redirectedfrom=MSDN&view=sql-server-ver16

Range: 1753-01-01 through 9999-12-31

Accuracy: 0.00333 second

This function will do the conversion:

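from datetime import datetime, timedelta

def extr_datetime(raw):
    # raw: the 8 bytes as stored in the file (little-endian)
    # Bytes 4-7 hold the signed number of days since 1900-01-01
    days_off = int.from_bytes(raw[4:8], byteorder='little', signed=True)
    # Bytes 0-3 hold the ticks since midnight; 300 ticks make one second
    ticks_off = int.from_bytes(raw[0:4], byteorder='little', signed=True) / 300.0
    epoch = '1900/01/01 00:00:00'
    epoch_obj = datetime.strptime(epoch, '%Y/%m/%d %H:%M:%S')
    d = epoch_obj + timedelta(days=days_off) + timedelta(seconds=ticks_off)
    return d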
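As a quick check against the values in the question, the sketch below decodes the same layout with struct.unpack instead of int.from_bytes. The names decode_raw_datetime and literal_to_raw are hypothetical. It also assumes that the bytes in the file are the byte-reversed form of the hex literal shown by SQL Server, which is the little-endian layout the function above expects:

```python
import struct
from datetime import datetime, timedelta

SQL_SERVER_EPOCH = datetime(1900, 1, 1)

def decode_raw_datetime(raw):
    """Decode 8 little-endian bytes: ticks (1/300 s) first, then days."""
    ticks, days = struct.unpack("<ii", raw)
    # Dividing a timedelta by an int avoids float rounding of the tick count
    return SQL_SERVER_EPOCH + timedelta(days=days) + timedelta(seconds=ticks) / 300

def literal_to_raw(hex_digits):
    """Assumption: the on-disk bytes are the T-SQL hex literal reversed."""
    return bytes.fromhex(hex_digits)[::-1]

print(decode_raw_datetime(literal_to_raw("0000ae9401039c4a")))  # 2022-05-13 15:45:12.780000
print(decode_raw_datetime(literal_to_raw("0000ae9400e2a6ca")))  # 2022-05-13 13:45:12.780000
```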