MySQL 8.0.30 introduces the “Generated Invisible Primary Key” (GIPK) feature. This blog provides an introduction to this feature, discusses its limitations and restrictions, and describes how backup and restore operations work with this feature.Let’s get started!
Background
InnoDB storage engine incorporates an internal feature known as the Implicit Primary Key. This feature automatically generates a hidden clustered index named GEN_CLUST_INDEX on a generated column called DB_ROW_ID. Implicit Primary Key is generated when a table is created without a PRIMARY KEY or UNIQUE KEY, where all key columns are defined as NOT NULL. The DB_ROW_ID column itself is a 6-byte field that increases monotonically as new rows are inserted into the table. The Implicit Primary Key serves as an mechanism to ensure to address limitations associated with table locking primary key. However, alongside its advantages, there are some downsides to consider when dealing with the Implicit Primary Key,
- Scalability Issue: All tables with the Implicit Primary Key utilize a sequence counter common for all tables to generate unique 6-byte values for the DB_ROW_ID column. However, this counter is protected by a mutex, which can introduce a scalability concern. Concurrent insertions into tables with the Implicit Primary Key may lead to a performance issue due to contention for the mutex.
- Not suitable for replication setups (Both Classic and High Availability):The use of Implicit Primary Key presents challenges in replication environments due to its reliance on a common sequence counter to generate unique values. For the same row of a table, it is not possible to generate the same value for DB_ROW_ID across multiple replication nodes. As a result, Implicit Primary Key cannot be used in replication setups. This limitation means that tables with an implicit primary key are not supported in high-availability setups, and classic replication does not perform efficiently with such tables.
Downsides of InnoDB Implicit Primary Key motivates to implement new GIPK feature in MySQL. The GIPK feature offers significant convenience for developers and database administrators as it automatically generates a primary key on an invisible column. This eliminates the need to explicitly define a primary key in certain scenarios, while retaining the advantages of having a primary key for ensuring data consistency and optimizing queries. Moreover, the GIPK feature helps address limitations associated with tables lacking a primary key. For instance, Group Replication, a high-availability solution in MySQL, does not support tables without a primary key. Additionally, classic replication may not perform optimally or efficiently with tables that lack a primary key. Hence, the GIPK feature enables developers to overcome these limitations and ensure the smooth operation of their MySQL databases.
GIPK Introduction
When a user attempts to create an InnoDB table without a primary key, MySQL automatically generates a primary key for the table when this feature is enabled. By default, this feature is disabled. To enable it, set the “sql_generate_invisible_primary_key” variable to “ON”.
Here’s an example to demonstrate the difference in table creation in sans-GIPK mode (GIPK feature disabled) and GIPK mode (GIPK feature enabled):
In sans-GIPK mode
In this mode, the existing InnoDB mechanism to create a hidden Implicit Primary Key is used to generate the primary key for tables
In GIPK mode
In GIPK mode, MySQL automatically adds an invisible column named “my_row_id” of type “bigint unsigned NOT NULL auto_increment” as the first column in the table’s column list. Additionally, a primary key is added on the “my_row_id” column.
When a primary key is not explicitly defined for an InnoDB table in a MySQL, the database considers a UNIQUE KEY with all key columns defined as NOT NULL as an implicit primary key. When GIPK is enabled, a primary key is generated even for tables that are created with an implicit primary key.
Here is an example to demonstrate this:
Adding GIPK to an already existing tables
In GIPK mode, a primary key is automatically added to the table only when the table is created. Altering a table does not automagically add a primary key for a table without a primary key. Invisible Primary Key can be added to existing table as below,
- In a stand-alone DB system user can add GIPK to a table as below,
ALTER TABLE users ADD COLUMN my_row_id BIGINT UNSIGNED NOT NULL INVISIBLE AUTO_INCREMENT PRIMARY KEY FIRST;
|
- In a replication setup, user can follow this multi-phase process to add GIPK to a table at source,
After adding the Generated Invisible Primary Key (GIPK) to an existing table, InnoDB’s Implicit Primary Key on the hidden column DB_ROW_ID and the column DB_ROW_ID itself are abandoned.
Find GIPK information in SHOW and INFORMATION_SCHEMA tables
Users can view the Generated Invisible Primary Key (GIPK) information in the output of the SHOW CREATE TABLE, SHOW COLUMNS, and SHOW INDEX statements. Additionally, the GIPK information is visible in the Information Schema COLUMNS and STATISTICS tables.
To hide the Generated Invisible Primary Key (GIPK) information in the SHOW and information schema COLUMNS and STATISTICS result sets, users can set the system variable “show_gipk_in_create_table_and_information_schema” to “OFF”. By default, this variable is set to “ON”. Here is an example to demonstrate this:
Backup and restore:
Users can create a logical backup of their database using the mysqldump and mysqlpump tools. By default, these tools include the Generated Invisible Primary Key (GIPK) along with the column definitions in the table definition, as well as the values of generated columns in the logical dump. This means that when restoring a database from the logical dump, the tables will be created with the GIPK and the corresponding values for the generated columns.
To provide more flexibility, both mysqldump and mysqlpump offer a new option called “–skip-generated-invisible-primary-key“. By using this option, users can create a logical dump without including the GIPK information. This can be useful if the user wants to exclude the generated primary keys from the dump. However, it’s important to note that if the logical dump is restored in GIPK mode, then a primary key will be generated for any tables that do not have a primary key defined.
On the other hand, physical backups always include the GIPK information for the tables. When restoring from a physical backup, the GIPK mode is not applicable, as the backup contains the actual table data along with the generated primary keys.
Overall, the mysqldump and mysqlpump tools provide users with options to include or exclude the GIPK information in their logical backups, while physical backups always preserve the GIPK for the tables.
Limitations and Restrictions of GIPK:
- GIPK feature is applicable to InnoDB tables only. For tables created in other storage engines, the primary key is not generated.
- If the table being created already has a column with the name “my_row_id”, then the primary key is not generated.
- This feature is not supported for partitioned tables at the moment. If a partitioned table is created without a primary key, then an error is reported.
- The workaround is to explicitly create a primary key on such tables. If there is no suitable natural primary key, a primary key consisting of partitioning columns and an invisible column with a UUID as the default value can be used. However, when adding such a primary key to existing tables with binary log enabled, there may be an issue with ALTER TABLE statements adding an invisible column with a UUID-based default. To work around this issue, a multi-phase process needs to be followed.
ALTERTABLEt1ADDCOLUMNmy_row_id VARBINARY(16)ALTERTABLEt1 CHANGECOLUMNmy_row_idSETDEFAULT(UUID_TO_BIN(UUID())LOCK TABLES t1 WRITE;-- This statement will be binlogged using ROW format!UPDATETABLEt1SETrow_id = UUID_TO_BIN(UUID())WHEREmy_row_idISNULL;ALTERTABLEt1ADDPRIMARYKEY(part_col1, ... part_colN, my_row_id);UNLOCK TABLES;
- The workaround is to explicitly create a primary key on such tables. If there is no suitable natural primary key, a primary key consisting of partitioning columns and an invisible column with a UUID as the default value can be used. However, when adding such a primary key to existing tables with binary log enabled, there may be an issue with ALTER TABLE statements adding an invisible column with a UUID-based default. To work around this issue, a multi-phase process needs to be followed.
- If the table being created already has an AUTO_INCREMENT column, then the primary key is not added.
- A possible workaround is to manually create a UUID-based invisible primary key using the multi-phase process described for the partitioned tables.
- If no primary key is specified in the CREATE TABLE … SELECT statement, a primary key is generated. However, if the SELECT statement includes a column with the name “my_row_id”, a primary key is not generated, and an error is reported.
- The workaround for this case is to explicitly add a primary key on the “my_row_id” column within the same CREATE TABLE … SELECT statement, as shown in the example below:
- The workaround for this case is to explicitly add a primary key on the “my_row_id” column within the same CREATE TABLE … SELECT statement, as shown in the example below:
- When using statement based binary logging, replicating CREATE TABLE … SELECT with generated invisible primary key is not safe, so an error is reported in this case.
- If source table of CREATE TABLE … LIKE does not have primary key, then the primary key is not generated for the target table. CREATE TABLE … LIKE is often used in the situation where a target table definition must be like a source table definition. So to keep the same behavior, the primary key is not added to the target table.
- ALTER TABLE operation to CHANGE/MODIFY/ALTER/RENAME “my_row_id” column is not allowed. An error is generated in this case.
- Changing VISIBILITY of the “my_row_id” is an exception. This operation is allowed on the “my_row_id”. Please note, changing visibility of “my_row_id” column effects the properties of Generated Invisible Primary Key too.
- ALTER TABLE … DROP PRIMARY KEY to drop Generated Invisible Primary Key, without dropping “my_row_id” column in the same statement, is not allowed.
- In GIPK mode, ALTER TABLE operations which results in a table without primary key are not allowed. An error is reported in this case.
References
- https://dev.mysql.com/doc/refman/8.0/en/create-table-gipks.html
- https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_generate_invisible_primary_key
- https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html
- https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_show_gipk_in_create_table_and_information_schema
- https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html#option_mysqldump_skip-generated-invisible-primary-key
- https://dev.mysql.com/doc/refman/8.0/en/mysqlpump.html#option_mysqlpump_skip-generated-invisible-primary-key
- https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html
Please check out this new feature and let us know your feedback.
Thanks for using MySQL!
