Mask and De-identify Data in MySQL Enterprise Edition

September 6, 2024 | 19 minute read
Nicolas De Rico
MySQL Solutions Engineer
Text Size 100%:

MySQL Enterprise Edition has for years featured methods for masking and de-identifying data.  Let's explore, with examples, how to use them!

 

Agenda of the article

 

 

The Data Masking and De-Identification extensions of MySQL Enterprise Edition

 

Data masking refers to masking portions or all of the data by replacing some of its content with a masking character.  For example, the SSN 123-45-6789 can be thus masked: xxx-xx-6789, to only display the last 4 digits.  Data de-identification changes a value to a different value in a way that it is still useful to the application, for example by replacing a unique value with another unique value, but makes it impossible to know the original value.  As the name implies, the purpose de-identification is to prevent identification of private data.

Data masking and data de-identification functionality is provided as an extension of MySQL Enterprise Edition in the form of both a plugin and a component.  Plugins are the original mechanism for extending MySQL functionality.  They provide an API to extend various functions of the server.  For example, there is a plugin API type for authentication methods, one plugin API type for storage engines, and so on.  MySQL components are the new kid-on-the-block, introduced in MySQL 8.0.  The component framework extends the server functionality like plugins but offers much more functionality to the developers as well as it allows individual components to interact with each other, which plugins are unable to do.

The reason I bring this up is because MySQL Enterprise Edition has both a data masking and de-identification plugin and component installed.  And they both provide similar but different functionality!  The table below contains a list of differences between the functionality provided by the plugin and that provided by the component.  The database administrator must be careful to use only either the plugin or the component as only one of them can be enabled at a time.  Of the two, the component should be favored as it provides enhanced functionality and is actively developed, whereas the plugin is subject to removal in a future release (due and timely notice will be provided).

Function Plugin Component
Interface Loadable functions Service functions, loadable functions
Support for multibyte character sets No Yes for general-purpose masking functions
General-purpose masking functions mask_inner(), mask_outer() mask_inner(), mask_outer
Masking of specific types PAN, SSN PAN, SSN, IBAN, UUID, Canada SIN, UK NIN
Random generation of specific types Email, US phone, PAN, SSN Email, US phone, PAN, SSN, IBAN, UUID, Canada SIN, UK NIN
Random generation of integer from given ranges Yes Yes
Persisting substitution dictionaries File-based In database
Privilege to manage dictionaries FILE Privilege specific to component (MASKING_DICTIONARIES_ADMIN)

Automated loadable function registration and deregistration

during installation and uninstallation

No Yes
Enhancements to existing functions N/A More arguments to the function gen_rnd_email()

 

The steps to install the component are the following (note: it comes as two component files):

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
INSTALL COMPONENT 'file://component_masking';
INSTALL COMPONENT 'file://component_masking_functions';

 

To confirm that the component is installed:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT * FROM `component`;
+--------------+--------------------+------------------------------------+
| component_id | component_group_id | component_urn                      |
+--------------+--------------------+------------------------------------+
| ...          | ...                | ...                                |
|            2 |                  2 | file://component_masking           |
|            3 |                  3 | file://component_masking_functions |
+--------------+--------------------+------------------------------------+

 

Example Dataset

 

This article contains several practical examples that use the following dataset:

First Name Last Name DOB SSN Phone Number Card Number
George Washington 1732-02-22 NULL (804)555-0101 4716958491933491
John Adams 1735-10-30 756-16-0134 (508)555-0102 5548026033569479
Thomas Jefferson 1743-04-13 647-42-4787 NULL 341605099040689
James Madison 1751-03-16 939-87-6230 (276)555-0104 6223050740560951
James Monroe 1758-04-28 615-50-9245 (571)555-0105 4916285199808116
John Adams 1767-07-11 677-54-5716 (617)555-0106 NULL
Andrew Jackson 1767-03-15 302-80-6308 (803)555-0107 370325552763685
Martin Van Buren 1782-12-05 133-85-9284 (329)555-0108 4024007112991004
William Harrison 1773-02-09 397-35-5070 (757)555-0109 5535714026624691
John Tyler 1790-03-29 037-67-9069 (804)555-0110 374077354201234

 

Here is the SQL to provision the above dataset:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CREATE SCHEMA IF NOT EXISTS `test`;
CREATE TABLE `test`.`persons`
(
  `id` INT PRIMARY KEY AUTO_INCREMENT,
  `First Name` VARCHAR(32) NOT NULL,
  `Last Name` VARCHAR(32) NULL,
  `DOB` DATE NOT NULL,
  `SSN` CHAR(11) NULL,
  `Phone Number` CHAR(13) NULL,
  `Card Number` VARCHAR(16) NULL
);

INSERT INTO `test`.`persons`
  ( `First Name`, `Last Name`,  `DOB`,        `SSN`,         `Phone Number`,  `Card Number`      )
VALUES
  ( 'George',     'Washington', '1732-02-22', NULL,          '(804)555-0101', '4716958491933491' ),
  ( 'John',       'Adams',      '1735-10-30', '756-16-0134', '(508)555-0102', '5548026033569479' ),
  ( 'Thomas',     'Jefferson',  '1743-04-13', '647-42-4787', NULL,            '341605099040689'  ),
  ( 'James',      'Madison',    '1751-03-16', '939-87-6230', '(276)555-0104', '6223050740560951' ),
  ( 'James',      'Monroe',     '1758-04-28', '615-50-9245', '(571)555-0105', '4916285199808116' ),
  ( 'John',       'Adams',      '1767-07-11', '677-54-5716', '(617)555-0106', NULL               ),
  ( 'Andrew',     'Jackson',    '1767-03-15', '302-80-6308', '(803)555-0107', '370325552763685'  ),
  ( 'Martin',     'Van Buren',  '1782-12-05', '133-85-9284', '(329)555-0108', '4024007112991004' ),
  ( 'William',    'Harrison',   '1773-02-09', '397-35-5070', '(757)555-0109', '5535714026624691' ),
  ( 'John',       'Tyler',      '1790-03-29', '037-67-9069', '(804)555-0110', '374077354201234'  );

 

Data Masking and De-Identification in SELECT statements

 

Below is an example using the data masking and de-identification functions.  These functions can be used anywhere that SQL functions may normally be used such as within WHERE clauses, projection lists, and elsewhere.  The documentation for these functions can be found here.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT
    ->   `First Name`,
    ->   `Last Name`,
    ->   `DOB`,
    ->   mask_ssn(`SSN`,'x') AS `SSN`,
    ->   mask_inner(`Phone Number`,5,5,'*') AS `Phone Number`,
    ->   `Card Number`,
    ->   gen_rnd_pan(16) AS `Random Number`
    -> FROM
    ->   `test`.`persons`;
+------------+------------+------------+-------------+---------------+------------------+------------------+
| First Name | Last Name  | DOB        | SSN         | Phone         | Card Number      | Random Number    |
+------------+------------+------------+-------------+---------------+------------------+------------------+
| George     | Washington | 1732-02-22 | NULL        | (804)***-0101 | 4716958491933491 | 4701543887067000 |
| John       | Adams      | 1735-10-30 | xxx-xx-0134 | (508)***-0102 | 5548026033569479 | 4812011388774058 |
| Thomas     | Jefferson  | 1743-04-13 | xxx-xx-4787 | NULL          | 341605099040689  | 8206380111725769 |
| James      | Madison    | 1751-03-16 | xxx-xx-6230 | (276)***-0104 | 6223050740560951 | 0102261140601415 |
| James      | Monroe     | 1758-04-28 | xxx-xx-9245 | (571)***-0105 | 4916285199808116 | 6123363455567462 |
| John       | Adams      | 1767-07-11 | xxx-xx-5716 | (617)***-0106 | NULL             | 8357707102536143 |
| Andrew     | Jackson    | 1767-03-15 | xxx-xx-6308 | (803)***-0107 | 370325552763685  | 1202083866771074 |
| Martin     | Van Buren  | 1782-12-05 | xxx-xx-9284 | (329)***-0108 | 4024007112991004 | 0847031686483639 |
| William    | Harrison   | 1773-02-09 | xxx-xx-5070 | (757)***-0109 | 5535714026624691 | 6107678513862674 |
| John       | Tyler      | 1790-03-29 | xxx-xx-9069 | (804)***-0110 | 374077354201234  | 4335451458208116 |
+------------+------------+------------+-------------+---------------+------------------+------------------+
10 rows in set (0.00 sec)

The fields SSN and Phone Number are masked with specialized data masking functions, and random card numbers are generated that pass validation (Luhn).  Note how the original NULL values remain NULL when masked.  The randomly generated card number values, which are presented in a seperate column only to show that they are different from the original values, do not have NULL values because the generation function does not take into account the original values.  To account for them would require the use of a "CASE" clause in the projection list in the style of:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CASE
  WHEN `Card Number` IS NULL THEN NULL
  ELSE gen_rnd_pan(16)
END AS `Random Card`

 

SELECT INTO and CREATE TABLE AS SELECT

 

The SELECT ... INTO and CREATE TABLE ... AS SELECT statements work similarly as SELECT statements.  The data masking and de-identification functions can be used anywhere SQL functions may be used.

Here is an example of a CREATE TABLE ... AS SELECT statement.  To add value to the example, I have included the "CASE" clause that was mentioned above.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CREATE TABLE `test`.`persons2`
(
  `id` INT PRIMARY KEY AUTO_INCREMENT,
  `First Name` VARCHAR(32) NOT NULL,
  `Last Name` VARCHAR(32) NULL,
  `DOB` DATE NOT NULL,
  `SSN` CHAR(11) NULL,
  `Phone Number` CHAR(13) NULL,
  `Card Number` VARCHAR(16) NULL
) AS
SELECT
  `First Name`,
  `Last Name`,
  `DOB`,
  mask_ssn(`SSN`,'x') AS `SSN`,
  mask_inner(`Phone Number`,5,5,'*') AS `Phone Number`,
  CASE
    WHEN `Card Number` IS NULL THEN NULL
    ELSE gen_rnd_pan(16)
  END AS `Card Number`
FROM
  `test`.`persons`;

To confirm that the statement worked as expected:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT
    ->   `Original`.`First Name`,
    ->   `Original`.`Last Name`,
    ->   `Original`.`DOB`,
    ->   `Original`.`SSN`,
    ->   `New`.`SSN` AS `New SSN`,
    ->   `Original`.`Phone Number`,
    ->   `New`.`Phone Number` AS `New Phone Number`,
    ->   `Original`.`Card Number`,
    ->   `New`.`Card Number` AS `New Card Number`
    -> FROM
    ->     `test`.`persons` `Original`
    ->   INNER JOIN
    ->     `test`.`persons2` `New`
    ->   ON
    ->     `Original`.`First Name` = `New`.`First Name` AND
    ->     `Original`.`Last Name` = `New`.`Last Name` AND
    ->     `Original`.`DOB` = `New`.`DOB`;
+------------+------------+------------+-------------+-------------+---------------+------------------+------------------+------------------+
| First Name | Last Name  | DOB        | SSN         | New SSN     | Phone Number  | New Phone Number | Card Number      | New Card Number  |
+------------+------------+------------+-------------+-------------+---------------+------------------+------------------+------------------+
| George     | Washington | 1732-02-22 | NULL        | NULL        | (804)555-0101 | (804)***-0101    | 4716958491933491 | 5343770563021230 |
| John       | Adams      | 1735-10-30 | 756-16-0134 | xxx-xx-0134 | (508)555-0102 | (508)***-0102    | 5548026033569479 | 1202657013173118 |
| Thomas     | Jefferson  | 1743-04-13 | 647-42-4787 | xxx-xx-4787 | NULL          | NULL             | 341605099040689  | 3776752473350728 |
| James      | Madison    | 1751-03-16 | 939-87-6230 | xxx-xx-6230 | (276)555-0104 | (276)***-0104    | 6223050740560951 | 5875320742028874 |
| James      | Monroe     | 1758-04-28 | 615-50-9245 | xxx-xx-9245 | (571)555-0105 | (571)***-0105    | 4916285199808116 | 3337533857432262 |
| John       | Adams      | 1767-07-11 | 677-54-5716 | xxx-xx-5716 | (617)555-0106 | (617)***-0106    | NULL             | NULL             |
| Andrew     | Jackson    | 1767-03-15 | 302-80-6308 | xxx-xx-6308 | (803)555-0107 | (803)***-0107    | 370325552763685  | 4201660821245511 |
| Martin     | Van Buren  | 1782-12-05 | 133-85-9284 | xxx-xx-9284 | (329)555-0108 | (329)***-0108    | 4024007112991004 | 7325744713583041 |
| William    | Harrison   | 1773-02-09 | 397-35-5070 | xxx-xx-5070 | (757)555-0109 | (757)***-0109    | 5535714026624691 | 5775203611674070 |
| John       | Tyler      | 1790-03-29 | 037-67-9069 | xxx-xx-9069 | (804)555-0110 | (804)***-0110    | 374077354201234  | 4040166113407848 |
+------------+------------+------------+-------------+-------------+---------------+------------------+------------------+------------------+
10 rows in set (0.00 sec)

 

Data Masking and De-Identification in INSERT statements

 

The MySQL data masking and de-identification functions work perfectly fine with INSERT statements, as seen below:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
INSERT INTO `test`.`persons`
(
  `First Name`,
  `Last Name`,
  `DOB`,
  `SSN`,
  `Phone Number`,
  `Card Number`
)
VALUES
( 'James',
  'Polk',
  '1795-11-02',
  mask_ssn('196-15-3788'),
  '(919)555-0111',
  '5355367287235827'
);

To test this:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT * FROM `test`.`persons` WHERE `First Name` = 'James' AND `Last Name` = 'Polk';
+----+------------+-----------+------------+-------------+---------------+------------------+
| id | First Name | Last Name | DOB        | SSN         | Phone Number  | Card Number      |
+----+------------+-----------+------------+-------------+---------------+------------------+
| 11 | James      | Polk      | 1795-11-02 | XXX-XX-3788 | (919)555-0111 | 5355367287235827 |
+----+------------+-----------+------------+-------------+---------------+------------------+
1 row in set (0.00 sec)

 

Data Masking and De-Identification in UPDATE statements

 

Now, an example of an UPDATE statement that modifies the row inserted in the previous example:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
UPDATE
  `test`.`persons`
SET
  `Phone Number` = mask_inner(`Phone Number`,5,5,'*') AND
  `Card Number` = gen_rnd_pan(16)
WHERE
  `First Name` = 'James' AND
  `Last Name` = 'Polk';

And, we test it:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT * FROM `test`.`persons` WHERE `First Name` = 'James' AND `Last Name` = 'Polk';
+----+------------+-----------+------------+-------------+---------------+------------------+
| id | First Name | Last Name | DOB        | SSN         | Phone Number  | Card Number      |
+----+------------+-----------+------------+-------------+---------------+------------------+
| 11 | James      | Polk      | 1795-11-02 | XXX-XX-3788 | (919)***-0111 | 7610524457065262 |
+----+------------+-----------+------------+-------------+---------------+------------------+
1 row in set (0.00 sec)

 

Implementing Data Masking and De-Identification Transparently

 

Ideally, data masking and de-identification should be applied at the database level in a way that is transparent to the application.  Unfortunately, MySQL does not natively implement these features transparently.  It is up the database administrator and/or developer to figure out a way to implement it.  Below are examples of methods that can help.

 

Triggers

MySQL supports triggers for INSERT, UPDATE, and DELETE operations.  In our scenario, we probably don't need to implement DELETE triggers.  MySQL INSERT triggers are applicable to INSERT, LOAD DATA, and REPLACE statements.  As a side note, UPDATE triggers, which are applicable to UPDATE statements, do not apply to the REPLACE statement because this statement is a combination of INSERT and DELETE operations, and not a derivative of the UPDATE statement.

Embedding data masking and de-identification operations in triggers is a great way to implement the functionality transparently for an application.  Below are examples of INSERT and UPDATE triggers.  The UPDATE trigger uses IF clauses that verify whether the incoming data is already masked.   This validation is probably superflous in our very simple example where further masking already masked values would not make any difference, but I present it anyway because it is a good concern to have in mind when implemeting an UPDATE trigger that masks data.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
DELIMITER //

CREATE TRIGGER `test`.`persons_insert` BEFORE INSERT ON `test`.`persons` FOR EACH ROW
  BEGIN
    SET NEW.`SSN` = mask_ssn(NEW.`SSN`);
    SET NEW.`Phone Number` = mask_inner(NEW.`Phone Number`,5,5,'*');
    SET NEW.`Card Number` = gen_rnd_pan(16);
  END; //

CREATE TRIGGER `test`.`persons_update` BEFORE UPDATE ON `test`.`persons` FOR EACH ROW
  BEGIN
        IF LEFT(NEW.`SSN`,1) <> '*' THEN
          SET NEW.`SSN` = mask_ssn(NEW.`SSN`);
        END IF;
        IF SUBSTRING(NEW.`Phone Number`,5,3) <> '***' THEN
          SET NEW.`Phone Number` = mask_inner(NEW.`Phone Number`,5,5,'*');
        END IF;
        SET NEW.`Card Number` = gen_rnd_pan(16);
  END; //

DELIMITER ;

Now, some INSERT and UPDATE statements to test these triggers:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
INSERT INTO `test`.`persons`
(
  `First Name`,
  `Last Name`,
  `DOB`,
  `SSN`,
  `Phone Number`,
  `Card Number`
)
VALUES
(
  'Zachary',
  'Taylor',
  '1784-11-24',
  mask_ssn('948-15-3788'),
  '(919)555-0112',
  '1234567890ABCDEF'
);

UPDATE
  `test`.`persons`
SET
  `SSN` = NULL,
  `Phone Number` = '(123)456-7890',
  `Card Number` = '1234567890ABCDEF'
WHERE
  `First Name` = 'James' AND
  `Last Name` = 'Polk';

Verify the results:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT * FROM `test`.`persons` WHERE `Last Name` IN ('Polk', 'Taylor');
+----+------------+-----------+------------+-------------+---------------+------------------+
| id | First Name | Last Name | DOB        | SSN         | Phone Number  | Card Number      |
+----+------------+-----------+------------+-------------+---------------+------------------+
| 11 | James      | Polk      | 1795-11-02 | NULL        | (123)***-7890 | 4120428138834320 |
| 12 | Zachary    | Taylor    | 1784-11-24 | XXX-XX-3788 | (919)***-0112 | 8287051411720113 |
+----+------------+-----------+------------+-------------+---------------+------------------+
2 rows in set (0.00 sec)

 

SELECT statements

 

Transparently masking or de-identifying data for SELECT statements is a bigger challenge, which can be surmounted, albeit not in a bullet-proof way, by aid of the MySQL Query Rewriter.  The idea is to configure MySQL to modify incoming SELECT statements to use masking and de-identification functions in the projection list and elsewhere, if needed.

The MySQL Query Rewriter must first be installed as follows.  Command specifics may vary depending on the MySQL installation (especially the path).

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql -uroot -p < /usr/share/mysql-8.4/install_rewriter.sql

Once installed, the Query Rewriter must be enabled:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
SET GLOBAL rewriter_enabled = ON;

With the Query Rewriter now installed and enabled, a rule can thus be created to modify incoming INSERT statements on the table persons:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
INSERT INTO `query_rewrite`.`rewrite_rules`(`pattern`, `replacement`) VALUES
(
  'SELECT `First Name`, `Last Name`, `DOB`, `SSN`, `Phone Number`, `Card Number` FROM `test`.`persons`',
  'SELECT `First Name`, `Last Name`, `DOB`, mask_ssn(`SSN`) AS `SSN`,`Phone Number`, gen_rnd_pan(16) AS `Card Number` FROM `test`.`persons`'
);

The new rule(s) must be flushed for the query digests to be created and the rules to be enabled:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CALL query_rewrite.flush_rewrite_rules();

Now, to test our rule:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT `First Name`, `Last Name`, `DOB`, `SSN`, `Phone Number`, `Card Number` FROM `test`.`persons`;
+------------+------------+------------+-------------+---------------+------------------+
| First Name | Last Name  | DOB        | SSN         | Phone Number  | Card Number      |
+------------+------------+------------+-------------+---------------+------------------+
| George     | Washington | 1732-02-22 | NULL        | (804)555-0101 | 6762566676671220 |
| John       | Adams      | 1735-10-30 | XXX-XX-0134 | (508)555-0102 | 4505312582063302 |
| Thomas     | Jefferson  | 1743-04-13 | XXX-XX-4787 | NULL          | 6433845517288264 |
| James      | Madison    | 1751-03-16 | XXX-XX-6230 | (276)555-0104 | 3345803412061488 |
| James      | Monroe     | 1758-04-28 | XXX-XX-9245 | (571)555-0105 | 6668688581256346 |
| John       | Adams      | 1767-07-11 | XXX-XX-5716 | (617)555-0106 | 0357680107216358 |
| Andrew     | Jackson    | 1767-03-15 | XXX-XX-6308 | (803)555-0107 | 0540308164535431 |
| Martin     | Van Buren  | 1782-12-05 | XXX-XX-9284 | (329)555-0108 | 7864047134683383 |
| William    | Harrison   | 1773-02-09 | XXX-XX-5070 | (757)555-0109 | 0813006343624320 |
| John       | Tyler      | 1790-03-29 | XXX-XX-9069 | (804)555-0109 | 1131026814832510 |
| James      | Polk       | 1795-11-02 | NULL        | (123)***-7890 | 5346414414473731 |
| Zachary    | Taylor     | 1784-11-24 | XXX-XX-3788 | (919)***-0111 | 3865052427700231 |
+------------+------------+------------+-------------+---------------+------------------+
12 rows in set, 1 warning (0.00 sec)

The data returned is as desired for a query that did not originally contain any of the data masking and de-identification functions.  The SSN field is masked and the Card Number field is randomly generated, as expected.

However, I mentioned earlier that this method is not bullet-proof.  This is because the MySQL Query Rewriter works by comparing the digests of the patterns in the rules with the digests of incoming statements to find matches.  Any query that is not exactly structured the same way as are the patterns in the rules will have a different digest.  Therefore, the MySQL Query Rewriter, while great at transforming known queries as are those normally generated by enterprise applications, would not help against ad-hoc queries.  Here is an example in which I simply omit field First Name in the projection list.  We immediately observe that the SSN values are not masked.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
mysql> SELECT `Last Name`, `DOB`, `SSN`, `Phone Number`, `Card Number` FROM `test`.`persons`;
+------------+------------+-------------+---------------+------------------+
| Last Name  | DOB        | SSN         | Phone Number  | Card Number      |
+------------+------------+-------------+---------------+------------------+
| Washington | 1732-02-22 | NULL        | (804)555-0101 | 4716958491933491 |
| Adams      | 1735-10-30 | 756-16-0134 | (508)555-0102 | 5548026033569479 |
| Jefferson  | 1743-04-13 | 647-42-4787 | NULL          | 341605099040689  |
| Madison    | 1751-03-16 | 939-87-6230 | (276)555-0104 | 6223050740560951 |
| Monroe     | 1758-04-28 | 615-50-9245 | (571)555-0105 | 4916285199808116 |
| Adams      | 1767-07-11 | 677-54-5716 | (617)555-0106 | NULL             |
| Jackson    | 1767-03-15 | 302-80-6308 | (803)555-0107 | 370325552763685  |
| Van Buren  | 1782-12-05 | 133-85-9284 | (329)555-0108 | 4024007112991004 |
| Harrison   | 1773-02-09 | 397-35-5070 | (757)555-0109 | 5535714026624691 |
| Tyler      | 1790-03-29 | 037-67-9069 | (804)555-0109 | 374077354201234  |
| Polk       | 1795-11-02 | NULL        | (123)***-7890 | 4120428138834320 |
| Taylor     | 1784-11-24 | XXX-XX-3788 | (919)***-0111 | 6715143436862560 |
+------------+------------+-------------+---------------+------------------+
12 rows in set (0.00 sec)

 

Conclusion

 

We saw in this article how to implement data masking and de-identification with MySQL Enterprise Edition.  In my opinion, the best part is that these operations can be done transparently for the application, as was demonstrated in the article.

These techniques modify the data, usually for the purpose of privacy, in a way that the original data cannot be recovered.  MySQL Enterprise Edition also supports a data obfuscation technique, which modifies the data in a way that can be used for data de-identification and allows for the reconstruction of the original data.  We will explore this technique in another article.

As always, please let me know of any errors or glaring omissions.

 

References

 

 

 

 

Nicolas De Rico

MySQL Solutions Engineer

Nicolas is a MySQL solutions architect based in the San Francisco Bay Area since 2015.


Previous Post

Troubleshooting MySQL HeatWave Via The Error Log Table And Audit Log

Tony Darnell | 7 min read

Next Post


Oracle Unveils HeatWave Innovations Across Generative AI, Lakehouse, MySQL, AutoML, and Multicloud

Nipun Agarwal | 10 min read
Oracle Chatbot
Disconnected