Home > database >  Can sqlite3_step stream results?
Can sqlite3_step stream results?

Time:12-25

Can sqlite3_step stream the results of a query from disk in some cases, without loading them all into memory at once?

For example

SELECT * FROM my_table ORDER BY rowid

would this load all the results into memory before the first call to sqlite3_step has returned?

Even if no, are there other cases when all results would be loaded into memory at once? How can this be predicted ahead of time?

My aim here is to understand how SQLite uses memory internally, with a view to using it in a memory constrained environment

CodePudding user response:

After some testing, I think the answer is yes, it can stream results.

In https://github.com/michalc/sqlite-s3-query/, sqlite3_step is used under the hood to fetch results from a SQLite file via range HTTP requests to S3. In https://github.com/michalc/sqlite-s3-query/pull/25/files a custom HTTP client is setup to store how many rows of a result are fetched when each HTTP request is made.

In the above, the query

SELECT my_col_a FROM my_table

Allows the assertion to pass

self.assertEqual(rows_yielded_at_request, [
    (0, None),
    (0, 'bytes=0-99'),
    (0, 'bytes=0-4095'),
    (0, 'bytes=24-39'),
    (0, 'bytes=4096-8191'),
    (0, 'bytes=8192-12287'),
    (140, 'bytes=12288-16383'),
    (276, 'bytes=16384-20479'),
    (412, 'bytes=20480-24575'),
])

shows that yielding of rows to client code is intermingled with the fetching of the raw data from the SQLite file.

And in https://github.com/michalc/sqlite-s3-query/pull/27/files, the query

SELECT my_col_a FROM my_table ORDER BY my_col_a

makes the assertion pass

self.assertEqual(rows_yielded_at_request, [
    (0, None),
    (0, 'bytes=0-99'),
    (0, 'bytes=0-4095'),
    (0, 'bytes=24-39'),
    (0, 'bytes=4096-8191'),
    (0, 'bytes=8192-12287'),
    (0, 'bytes=12288-16383'),
    (0, 'bytes=16384-20479'),
    (0, 'bytes=20480-24575'),
])

showing that in this case it's not streaming. In this case, there is no index on the column my_col_a, so SQLite has to fetch all the rows before knowing which one could be first.

  • Related