I have a value in a raw binary file (part of a database) and I want to convert it in Python to a format that a human being can read. This is part of a forensic carving procedure. I can convert 8-byte values using this SQL statement (the result shows one date in GMT+2 and one in GMT):
SELECT CAST(0x0000ae9401039c4a AS datetime), CAST(0x0000ae9400e2a6ca AS datetime)
which returns
2022-05-13 15:45:12.780 2022-05-13 13:45:12.780
I have tried to convert the binary value with DCODE v5.5 (
CodePudding user response:
Short Version
The legacy datetime type stores dates as a 64-bit floating point offset from 1900-01-01:
import datetime
import struct
floatValue = struct.unpack('<d', bytes)[0]
OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)
Newer types don't use that format though.
Excel handling libraries like openpyxl offer functions that convert OA/Serial dates, such as openpyxl.utils.datetime.from_excel.
Long Explanation
The legacy datetime type in SQL Server uses the OLE Automation Date storage format that was also used in Excel, Visual Basic and all desktop applications that supported OLE/COM Automation in the late 1990s and early 2000s, before macro viruses. This is a 64-bit floating point number (called a Serial date in Excel) whose integral part is an offset since 1899-12-30 and whose fractional part is the time of day. Except when it's 1899-12-31 or 1900-01-01.
Back when Excel was released, Lotus 1-2-3 was the most popular spreadsheet and a de facto standard, and it incorrectly considered 1900 a leap year. To ensure compatibility, Excel adopted the same bug. VBA tried to both fix the bug and ensure formulas produced the same results as Excel and Lotus, so it uses 1899-12-30 as a base.
The SQL Server team didn't care about the bug and used the logical 1900-01-01 instead.
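To make the two base dates concrete, here is a minimal sketch that converts the same serial value against each of them, using only the standard library. The function name `serial_to_datetime` and the sample value 44694.5 are illustrative assumptions, and the sketch only covers non-negative serial values.

```python
from datetime import datetime, timedelta

OA_BASE = datetime(1899, 12, 30)   # OLE Automation / VBA base described above
SQL_BASE = datetime(1900, 1, 1)    # base this answer attributes to SQL Server

def serial_to_datetime(serial, base):
    """Integral part = days after the base, fractional part = time of day."""
    return base + timedelta(days=serial)

oa = serial_to_datetime(44694.5, OA_BASE)
sql = serial_to_datetime(44694.5, SQL_BASE)
print(oa)        # 2022-05-13 12:00:00
print(sql)       # 2022-05-15 12:00:00
print(sql - oa)  # 2 days, 0:00:00
```

The same number lands two days apart depending on the base, so picking the wrong epoch shows up as a constant two-day shift in every decoded value.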
Essentially, this value is a timedelta. In Python, you can convert this float to a timedelta by passing it as the days parameter to timedelta, and adding it to the base 1900-01-01:
OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)
To get a 64-bit float from an array of bytes you can use struct.unpack with the appropriate format string. A 64-bit float is actually a double:
floatValue=struct.unpack('<d',bytes)[0]
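Putting the two steps together, a minimal sketch might look like the following. It assumes the field really holds a little-endian IEEE 754 double, which is worth checking against a known value before relying on it. The helper name `float_days_to_datetime` and the packed sample value are made up for illustration and are not carved data.

```python
import datetime
import struct

OLE_TIME_ZERO = datetime.datetime(1900, 1, 1)

def float_days_to_datetime(raw):
    # Hypothetical helper: interpret 8 bytes as a little-endian double
    # holding a number of days, then add it to the base date.
    (float_value,) = struct.unpack('<d', raw)
    return OLE_TIME_ZERO + datetime.timedelta(days=float_value)

# Synthetic input: 44692.5 days packed as a double (not real carved data)
sample = struct.pack('<d', 44692.5)
print(float_days_to_datetime(sample))  # 2022-05-13 12:00:00
```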
WARNING
datetime is a legacy type. The types introduced in SQL Server 2008 (date, time, datetime2 and datetimeoffset) have a different storage format. datetime2 and datetimeoffset have variable precision and variable size.
CodePudding user response:
For future reference, I was finally able to find the data I needed in this post: https://www.faqcode4u.com/faq/108331/what-is-the-internal-representation-of-datetime-in-sql-server
- https://www.faqcode4u.com/faq/108331/what-is-the-internal-representation-of-datetime-in-sql-server
- The details are supposedly opaque, but most resources (1), (2) that I've found on the web state the following:
- The first 4 bytes store the number of days since SQL Server's epoch (1st Jan 1900), and
- the second 4 bytes store the number of ticks after midnight, where a "tick" is 1/300 of a second (about 3.33 milliseconds).
- The first four bytes are signed (can be positive or negative), which explains why dates earlier than the epoch can be represented.
- https://docs.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?redirectedfrom=MSDN&view=sql-server-ver16
- Range: 1753-01-01 through 9999-12-31
- Accuracy: 0.00333 second
This function will do the conversion:
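from datetime import datetime, timedelta

def extr_datetime(raw):
    # Bytes 4-7 (little-endian, signed): days since 1900-01-01
    days_off = int.from_bytes(raw[4:8], byteorder='little', signed=True)
    # Bytes 0-3 (little-endian): ticks after midnight; 300 ticks per second
    ticks_off = int.from_bytes(raw[0:4], byteorder='little', signed=True) / 300.0
    epoch = '1900/01/01 00:00:00'
    epoch_obj = datetime.strptime(epoch, '%Y/%m/%d %H:%M:%S')
    d = epoch_obj + timedelta(days=days_off) + timedelta(seconds=ticks_off)
    return d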
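As a sanity check, the two values from the question can be decoded with the same layout. This is a sketch under one assumption: that the hex literal passed to CAST lists the 8 bytes in the reverse of the order the function above reads them, so the literal is reversed before decoding. The names `decode_sql_datetime` and `from_sql_literal` are illustrative. The ticks are converted with integer arithmetic to avoid floating-point rounding.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1900, 1, 1)

def decode_sql_datetime(raw):
    # Bytes 0-3: ticks (1/300 s) after midnight; bytes 4-7: days since 1900-01-01.
    ticks = int.from_bytes(raw[0:4], byteorder='little', signed=True)
    days = int.from_bytes(raw[4:8], byteorder='little', signed=True)
    seconds, rem = divmod(ticks, 300)
    # The leftover ticks are turned into whole microseconds (1 tick = 10000/3 us).
    return EPOCH + timedelta(days=days, seconds=seconds,
                             microseconds=rem * 10000 // 3)

def from_sql_literal(hex_literal):
    # Assumption: the literal is the byte-reversed form of the layout read above.
    return decode_sql_datetime(bytes.fromhex(hex_literal)[::-1])

print(from_sql_literal('0000ae9401039c4a'))  # 2022-05-13 15:45:12.780000
print(from_sql_literal('0000ae9400e2a6ca'))  # 2022-05-13 13:45:12.780000
```

Both results match the output of the SQL statement in the question, which supports the days-plus-ticks layout for these values.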