Common MySQL Interview Questions
Storage Engine Comparison
- In InnoDB, primary key index and data are stored together, with the primary key index B+ tree containing complete data rows; in MyISAM, index and data are stored separately, with the B tree storing index values and physical addresses of rows
- InnoDB supports ACID transactions, MyISAM does not
- InnoDB supports row-level locking, MyISAM does not
What is Page Split
When inserting new data into a full index page (usually 16KB), the storage engine will split the current page into two pages to accommodate the new data
This process involves: allocating a new page, moving part of the data to the new page, and updating pointers of previous and next pages
Page splits cause performance degradation because they are relatively expensive operations
For pure auto-increment primary key sequential writes, InnoDB indeed prioritizes allocating new pages rather than splitting existing pages, which is an important reason for the performance advantage of auto-increment primary keys. However, in actual production environments, due to factors like concurrency, a small amount of page splitting may still occur.
Why Does It Happen?
Each node (page) must maintain order. When the primary key is non-auto-increment, we must ensure that all primary key values in the next data page are greater than the maximum primary key value in the previous data page.
Fill Factor
Defines the initial fill rate of pages (e.g., innodb_fill_factor=80 means pages are only filled to 80%).
Remaining space is used for subsequent insertions, reducing the probability of page splits.
Solution
| |
InnoDB’s Repeatable Read Isolation Level Solves Phantom Reads
For snapshot reads (normal SELECT): Through MVCC, at repeatable read isolation level, ReadView is created at the first query of the transaction
For current reads (SELECT…FOR UPDATE/LOCK IN SHARE MODE/INSERT/UPDATE/DELETE): Phantom reads are avoided through gap locks and Next-Key Locks.
SELECT * FROM t WHERE id > 100 FOR UPDATE will lock all gaps where id>100
Other Questions
There is an indexed field b. Which has higher execution efficiency: b = 1 or b = 2 vs b in (1, 2)?
b in (1, 2) is better
When the query optimizer processes b = 1 OR b = 2, it needs to split this query condition into two independent conditions, handling b = 1 and b = 2 separately, then merge these two results.
A field is of string type with an index. Now in the where condition, using an int type number to compare with the field, can it use the index? What if the field is int and the query condition is string?
No; Yes