Over the last few months, I've been involved designing a highly scalable system on top of MySQL's SKIP LOCKED feature. There turned out to be a few gotchas which I wanted to share.
Skip locked??
SKIP LOCKED
is part of SELECT
expression that lets you grab the rows from table that have not been locked by other transactions. You'd most often use that to build some sort of a ticketing system.
Let's consider the following schema:
CREATE TABLE `available_tickets` (
`id` bigint NOT NULL AUTO_INCREMENT,
`event_id` bigint NOT NULL,
PRIMARY KEY (`id`),
INDEX (event_id)
) ENGINE=InnoDB;
CREATE TABLE `reserved_tickets` (
`id` bigint NOT NULL AUTO_INCREMENT,
`quantity` int NOT NULL,
`cart_token` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
You then fill available_tickets
with however many rows you'd like, matching the number of tickets you're willing to sell.
SKIP LOCKED
lets you run many concurrent transactions that do the following:
BEGIN;
SELECT id FROM available_tickets ORDER BY id LIMIT 10 FOR UPDATE SKIP LOCKED;
INSERT INTO reserved_tickets (quantity, cart_token) VALUES (10, `some-uuid`);
DELETE FROM available_tickets WHERE id IN (...)
COMMIT;
Before the introduction of SKIP LOCKED
in MySQL 8, this is not something you could achieve in a high concurrency fashion, without difference transactions waiting for shared locks.
Let's review the nuances of scaling this.
Tenancy
With a ticketing system like that, you'd most likely want to issue them by some sort of event_id
. In that case the query becomes:
BEGIN;
SELECT id FROM available_tickets
WHERE event_id=? # grab rows for a specific event
ORDER BY id LIMIT 10 FOR UPDATE SKIP LOCKED;
INSERT INTO reserved_tickets (quantity, cart_token) VALUES (10, `some-uuid`);
DELETE FROM available_tickets WHERE id IN (...)
COMMIT;
Let's see the side-effects of introducing that.
InnoDB locks
performance_schema.data_locks
is an internal table in InnoDB that lets you view live locks acquired by transactions. It's extremely useful for understanding what's happening and we even built a way to expose those in production to all developers as an observability tool.
As a refresher, after introducing event_id
our schema looks like this:
CREATE TABLE `available_tickets` (
`id` bigint NOT NULL AUTO_INCREMENT,
`event_id` bigint NOT NULL,
PRIMARY KEY (`id`),
KEY `event_id` (`event_id`)
) ENGINE=InnoDB
Let's inspect what locks happen in the middle of that transaction with skip locked:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select id from available_tickets where event_id=1 ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
+----+
| id |
+----+
| 2 |
+----+
1 row in set (0.01 sec)
mysql> SELECT thread_id, object_name, index_name, lock_type, lock_mode, lock_data FROM performance_schema.data_locks order by object_name;
+-----------+-------------------+------------+-----------+---------------+-----------+
| thread_id | object_name | index_name | lock_type | lock_mode | lock_data |
+-----------+-------------------+------------+-----------+---------------+-----------+
| 51 | available_tickets | NULL | TABLE | IX | NULL |
| 51 | available_tickets | event_id | RECORD | X | 1, 2 |
| 51 | available_tickets | PRIMARY | RECORD | X,REC_NOT_GAP | 2 |
+-----------+-------------------+------------+-----------+---------------+-----------+
3 rows in set (0.02 sec)
There's as many as 3 locks that it has caused! Actually, the table lock can be ignored (it prevents that table from being dropped or modified during the transaction). It's only the 2 locks that matter.
Because our SELECT
statement includes where event_id=1
, InnoDB has to track locks on both event_id
index and on the primary key.
I recommend reading A Comprehensive (and Animated) Guide to InnoDB Locking by my colleague Jahfer for a deeper dive into locks. Planetscale's post on B+trees is another great resource.
There's a way to reduce those two locks into just one that we're going to look into below.
Composite Primary Keys
Our table uses PRIMARY KEY (id)
, which means that whenever you do WHERE event_id=1
, InnoDB needs to do two operations:
- Look up
event_id
index, find1
on the B+tree. - Find that the first
id
under thatevent_id
is2
and look up that primary key
If you remove where event_id=1
, you may notice it becomes a single operation and a single lock.
By making our primary key a composite of (event_id, id)
, we could make that a single operation even with a where
condition.
Our schema becomes:
CREATE TABLE `available_tickets` (
`id` bigint NOT NULL AUTO_INCREMENT,
`event_id` bigint NOT NULL,
PRIMARY KEY (`event_id`, `id`),
KEY (`id`)
) ENGINE=InnoDB;
Let's look up the locks again:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select id from available_tickets where event_id=1 ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
+----+
| id |
+----+
| 2 |
+----+
1 row in set (0.00 sec)
mysql> SELECT thread_id, object_name, index_name, lock_type, lock_mode, lock_data FROM performance_schema.data_locks order by object_name;
+-----------+-------------------+------------+-----------+-----------+-----------+
| thread_id | object_name | index_name | lock_type | lock_mode | lock_data |
+-----------+-------------------+------------+-----------+-----------+-----------+
| 59 | available_tickets | NULL | TABLE | IX | NULL |
| 59 | available_tickets | PRIMARY | RECORD | X | 1, 2 |
+-----------+-------------------+------------+-----------+-----------+-----------+
2 rows in set (0.01 sec)
We've collapsed two locks into a single lock 🎉, and less locks means more performance.
Transaction isolation level
InnoDB's default transaction isolation level is REPEATABLE READ (source). Let's review what happens if you use that default and your tickets get sold out.
mysql> SELECT @@GLOBAL.transaction_isolation;
+--------------------------------+
| @@GLOBAL.transaction_isolation |
+--------------------------------+
| REPEATABLE-READ |
+--------------------------------+
1 row in set (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select id from available_tickets where event_id=1 ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
Empty set (0.01 sec)
mysql> SELECT thread_id, object_name, index_name, lock_type, lock_mode, lock_data FROM performance_schema.data_locks order by object_name;
+-----------+-------------------+------------+-----------+-----------+------------------------+
| thread_id | object_name | index_name | lock_type | lock_mode | lock_data |
+-----------+-------------------+------------+-----------+-----------+------------------------+
| 59 | available_tickets | NULL | TABLE | IX | NULL |
| 59 | available_tickets | PRIMARY | RECORD | X | supremum pseudo-record |
+-----------+-------------------+------------+-----------+-----------+------------------------+
2 rows in set (0.01 sec)
That supremum pseudo-record
indicates there's a lock for the fact that there's no rows with event_id=1
.
Another way to think of it is that with REPEATABLE READ
, InnoDB has to guarantee that the current transaction will continue to get zero rows for the same query.
So it has to hold a lock on that empty range of the table.
That lock will prevent any other transaction from inserting more rows into available_tickets
, say if you realized you want to sell few more tickets.
The solution is to make sure you run with READ COMMITTED
.
mysql> set global TRANSACTION isolation level READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select id from available_tickets where event_id=1 ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
Empty set (0.01 sec)
mysql> SELECT thread_id, object_name, index_name, lock_type, lock_mode, lock_data FROM performance_schema.data_locks order by object_name;
+-----------+-------------------+------------+-----------+-----------+-----------+
| thread_id | object_name | index_name | lock_type | lock_mode | lock_data |
+-----------+-------------------+------------+-----------+-----------+-----------+
| 60 | available_tickets | NULL | TABLE | IX | NULL |
+-----------+-------------------+------------+-----------+-----------+-----------+
1 row in set (0.00 sec)
No more extra supremum pseudo-record
locks!
I recommend checking InnoDB Locking Explained with Stick Figures by Bill Karwin if you want to dive deeper into how supremum pseudo-record
works.
Wrap up
I hope the examples above serve as a good illustration how looking at performance_schema.data_locks
can help you design a system with the least amount of locks, and can surface unexpected bits.
Employing those optimizations, I've seen a single MySQL server to easily handle 2M+ ticket reservation transactions per minute, and average latency of that SELECT ... FOR UPDATE SKIP LOCKED
query to stay under 400μS.
Further reading: