Too Many Afterthoughts The greatest ideas are often far too late


Partitioning InnoDB tables by time-based pseudo-sequential UUIDs

Time Twister Abstract Concept Illustration with Twisted Vintage Clock

Partitioning has multiple uses – spreading load onto multiple disks, cold storage of older data on cheaper disks, and probably others. Most importantly though, partitions are not for performance.

The main use case I’m going to explain is time-based partitioning as a tool for limiting the scope of stored data. This could be required to comply with a data retention policy or simply to save money on disk space.


Choose the right primary key to save a large amount of disk I/O

Cardboard boxes

Imagine you’re working in a large book warehouse and in charge of new arrivals. There’s a separate, digital system for metadata like authors, categories, etc., so the only information you’ll use during storage is the inventory number. Each book is identified by a unique number with many digits and all books must be findable by their number. To make handling quicker, books are packed in boxes, ordered by the inventory number. When looking for a book, the box must be identified first. Therefore, each box is labeled by the lowest inventory number it contains and the first number that’s in the next box.

Your job in the arrivals room is to pick up books-to-be-stored one by one, assign them a new inventory number in the metadata system, label them by number, and put them in a box as mentioned before. Now, the room is quite small and if you run out of space, you’ll need to move the filled boxes into the basement, which might be two floors down.

Too Many Afterthoughts The greatest ideas are often far too late