Not able to calculate (DATA_LENGTH + INDEX_LENGTH) correctly in MySQL InnoDB


I'm using MySQL InnoDB for storing data, and I need to restrict a particular table to a certain size. So after inserting data into the table, I check (DATA_LENGTH + INDEX_LENGTH) to see whether it exceeds the limit. If it does, I delete some old data until (DATA_LENGTH + INDEX_LENGTH) drops below the limit again.

But I'm getting unexpected (DATA_LENGTH + INDEX_LENGTH) values, so I'm not able to proceed. (I have explained the situation below.)

Initially I have a table tableTime with (DATA_LENGTH + INDEX_LENGTH) at around 15GB and DATA_FREE at 0GB. It has 2,000,000 rows.

DELETE FROM tableTime
LIMIT 1000000;

I deleted 1,000,000 rows (using the query above), then waited for mysqld.exe to finish its disk I/O (watching it in Task Manager) and also rebooted the system.

ANALYZE TABLE tableTime;
SELECT ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS SIZE_MB
FROM information_schema.TABLES
WHERE TABLE_NAME = 'tableTime';

Now when I checked (DATA_LENGTH + INDEX_LENGTH) again (using the query above), I found it at ~13GB, with DATA_FREE at ~2GB.

OPTIMIZE TABLE tableTime;
SELECT ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS SIZE_MB
FROM information_schema.TABLES
WHERE TABLE_NAME = 'tableTime';

Then I tried OPTIMIZE and got the expected value of around 8-9GB for (DATA_LENGTH + INDEX_LENGTH).

But since the tables are big, I can't run OPTIMIZE every time just to get the table size. Why is ANALYZE this inaccurate?

CodePudding user response:

Data_length, index_length, and data_free are not measuring what you think they're measuring.

InnoDB stores data and indexes in pages, which are 16KB by default and uniform in size. Pages are grouped into extents of 1MB, i.e. 64 pages at the default page size.
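You can confirm the page size on your server (a quick check; innodb_page_size is a read-only variable fixed when the instance is initialized):

SELECT @@innodb_page_size;  -- 16384 (16KB) by default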

When you insert or update data in a tablespace, InnoDB may expand the tablespace. The first few expansions are small; after that, the tablespace grows 4MB at a time (4 extents).

Each page stores at least 2 rows, and as many rows as fit in the page (long VARCHAR/TEXT/BLOB values are stored on separate overflow pages).
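As a rough illustration (a sketch only: AVG_ROW_LENGTH is itself an estimate derived from the same sampled statistics, and 16384 assumes the default page size):

-- Approximate how many rows fit in one 16KB page on average
SELECT 16384 DIV AVG_ROW_LENGTH AS approx_rows_per_page
FROM information_schema.TABLES
WHERE TABLE_NAME = 'tableTime';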

As you update or delete rows, gaps of unused space are left behind in the pages. Later inserts and updates attempt to reuse these gaps where possible, but in practice the pages tend to become fragmented, with lots of small pockets of wasted space.

Here's the thing you must understand: data_free counts only whole free extents (64 contiguous pages) of unused space. It does not include the small pockets of wasted space within pages. It doesn't even include empty pages, unless 64 of them are contiguous. You might notice that data_free is always a multiple of 1MB.
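You can check that yourself (a quick sanity check that DATA_FREE is reported in whole extents):

SELECT DATA_FREE,
       DATA_FREE % (1024 * 1024) AS remainder_bytes  -- always 0: only whole 1MB extents are counted
FROM information_schema.TABLES
WHERE TABLE_NAME = 'tableTime';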

The data_length and index_length values measure the total size of the pages that are at least partially filled, not the size of the data within them.

This means that the actual size of data might be significantly less than what is reported by data_length + index_length.

So how can you get a precise measure of the actual space occupied by data and indexes?

There's no command to measure precisely the size of data in an InnoDB table.
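The closest SQL gets you is an allocation-level view of the tablespace file (this assumes innodb_file_per_table and MySQL 8.0; on 5.7 the table is named INNODB_SYS_TABLESPACES):

-- FILE_SIZE is the apparent size of the .ibd file; ALLOCATED_SIZE is what
-- the filesystem has actually allocated. Neither is the exact size of your rows.
SELECT NAME, FILE_SIZE, ALLOCATED_SIZE
FROM information_schema.INNODB_TABLESPACES
WHERE NAME LIKE '%tableTime%';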

It's a difficult value to pin down anyway, because of transactions and multi-versioning. If you delete a row but haven't committed that transaction yet, should you count the space? Likewise, if you've inserted a row but not committed, does that space count? What about multiple versions of a given row? What about rows that exist only in modified pages in the buffer pool, but haven't been flushed to disk yet?

Waiting for I/O to settle isn't reliable either, because InnoDB continues to do gradual updates, things like purging old row versions and merging the change buffer into secondary indexes. This can even continue after a reboot.

CodePudding user response:

Notice how you started with 15GB, and the first attempt left it at 15GB (13 + 2). This is because DELETE does not free up any disk space, or at least does not give it back to the OS, hence Data_length, etc. show no difference in the total.

OPTIMIZE will rebuild the table. So, if innodb_file_per_table = ON when the table was [re]built, the space will be freed up (given back to the OS). Otherwise, as Bill points out, it will simply be left for future INSERTs.
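For InnoDB, OPTIMIZE TABLE is mapped to a rebuild (the server even notes "Table does not support optimize, doing recreate + analyze instead"). The explicit equivalent is:

ALTER TABLE tableTime FORCE;  -- rebuilds the table, reclaiming fragmented space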

As you probably noticed, DELETEing a million rows is terribly slow. You might want to consider one of several less-invasive approaches, which I have compiled here; one common pattern is sketched below.
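For example (a sketch only, not taken from that list; it assumes an indexed created_at column that identifies the oldest rows):

-- Delete in small batches so each transaction stays short; repeat the
-- statement until ROW_COUNT() reports 0 rows affected.
DELETE FROM tableTime
ORDER BY created_at
LIMIT 1000;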
