MySQL Shellで行うマルチスレッド処理によるデータダンプ

この記事はMultithreaded Data Dumps With MySQL Shellの翻訳版です。

MySQL Shellは、MySQLインスタンスに接続して管理するのに非常に便利なコマンドラインインタフェースです。MySQL Shell のAdmin APIを使うと、特定のテーブル(またはテーブル内のデータのサブセット)のダンプや、1つまたは複数の完全なスキーマのダンプ、データベースインスタンス全体のダンプを取得し、必要に応じてこれらのダンプをリストアすることができます。さらにマルチスレッドで実行することで、処理効率を上げることができます。

注意: 本記事でご紹介するデモの実施には、MySQL Shellのインストールが必要です。

ダンプ/ロード方法

今までデータのダンプにはmysqldumpを使ってきた方が多いかと思いますが、MySQL Shellのオブジェクトのutilを使えば3通りの方法でデータをダンプすることができます。

util.dumpTables() ... 個々のテーブル、またはテーブル内のデータのサブセットのダンプ
util.dumpSchemas() ... 1つまたは複数の完全なスキーマのダンプ
util.dumpInstance() … データベースインスタンス全体のダンプ

テーブルデータをダンプする

より細かく指定してダンプを取得できるutil.dumpTables()を使ってみます。構文は下記のようになります。

util.dumpTables(schema, tables, outputUrl[, options])

schemaにはダンプしたいテーブルが存在するスキーマ、tablesにはダンプしたいテーブル名、outputUrlにはダンプファイルをエクスポートしたいパス、optionsにはダンプ取得に関するオプションを記述します。

util.dumpTables('sakila', ['actor'], '~/sakila-table-dump')

上記の例では、sakilaというスキーマのactorテーブルの全行をダンプし、ユーザのrootディレクトリのsakila-table-dumpという名前のディレクトリにエクスポートします。

実行すると、コンソールにダンプの情報が表示されます。

Acquiring global read lock
Global read lock acquired
Initializing - done
1 tables and 0 views will be dumped.
Gathering information - done
All transactions have been started
Locking instance for backup
Global read lock has been released
Writing global DDL files
Running data dump using 4 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing schema metadata - done      
Writing DDL - done      
Writing table metadata - done      
Starting data dump
100% (200 rows / ~200 rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:00s                                                           
Total duration: 00:00:00s                                                          
Schemas dumped: 1                                                                  
Tables dumped: 1                                                                   
Uncompressed data size: 7.40 KB                                                    
Compressed data size: 1.91 KB                                                      
Compression ratio: 3.9                                                             
Rows written: 200                                                                  
Bytes written: 1.91 KB                                                             
Average uncompressed throughput: 7.40 KB/s                                         
Average compressed throughput: 1.91 KB/s

ダンプをエクスポートしたディレクトリには、下記のようなファイルが格納されます。

デフォルトの設定では、MySQL Shellは4スレッドでダンプ取得を実行します。8スレッドで実行するには下記のように指定します。

util.dumpTables('sakila', ['actor'], '~/sakila-table-dump', {threads: 8})

util.dumpTablesでもう一つ便利なオプションは、ダンプ対象とするデータを指定することです。このフィルタリングには、JSONオブジェクトのwhereプロパティをオプションに使います。下の例では、filmテーブルからタイトルが'A'で始まる映画のみをダンプしています。

util.dumpTables('sakila', ['film'], '~/sakila-table-dump', {where: {"sakila.film":"title like 'A%'"}})

取得したダンプの格納先として指定する知れクトリは空である必要があります。もし空でなければ、下記のようなエラーが得られます。

Util.dumpTables: Cannot proceed with the dump, the specified directory '~/sakila-table-dump' already exists at the target location ~/sakila-table-dump and is not empty. (ArgumentError)

`テーブルデータをロードする`

util.dumpTables(), util.dumpSchemas(), util.dumpInstance() のいずれを使った場合でも、ダンプしたデータをデータベースにロードするにはutil.loadDump()を使用します。渡す引数は、使用したダンプ方法によって異なる場合があります。

先ほど取得したデータをロードするには、下記のようにします。

util.loadDump('~/sakila-table-dump', {schema: 'sakila-2', threads:8})

上記の例では、データをsakila-2という名前の新しいスキーマにロードする処理を、8スレッドで実行します。指定されたスキーマがない場合には、データのロード時に作成します。指定されたスキーマに同名のテーブルが存在する場合には、下記のようなメッセージが表示されます。

Loading DDL and Data from '~/sakila-table-dump' using 8 threads. Opening dump... Target is MySQL 8.0.33. Dump was produced from MySQL 8.0.33 Scanning metadata - done Checking for pre-existing objects... ERROR: Schema `sakila-2` already contains a table named film ERROR: One or more objects in the dump already exist in the destination database. You must either DROP these objects or exclude them from the load. Util.loadDump: While 'Scanning metadata': Duplicate objects found in destination database (MYSQLSH 53021)

`スキーマをダンプする`

util.dumpSchemas()を使うことで、一つ以上のスキーマを同時にダンプすることが可能です。引数は先ほど使用したのと同様です。

util.dumpSchemas(schemas, outputUrl[, options])

schemasにはダンプを取りたいスキーマ名を一つ以上、outpurUrlにはダンプを出力するディレクトリパス、optionsにはダンプ取得に関するオプションを記述します。

下記では、sakilaスキーマ全体をダンプします。

util.dumpSchemas(['sakila'], '~/sakila-schema-dump', {threads: 8})

上記のダンプ取得が完了すると、下記のようなメセージが表示されます。

Acquiring global read lock
Global read lock acquired
Initializing - done
1 schemas will be dumped and within them 16 tables, 7 views, 6 routines, 6 triggers.
Gathering information - done
All transactions have been started
Locking instance for backup
Global read lock has been released
Writing global DDL files
Running data dump using 8 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing schema metadata - done      
Writing DDL - done        
Writing table metadata - done        
Starting data dump
101% (47.27K rows / ~46.41K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:00s                                                                 
Total duration: 00:00:00s                                                                
Schemas dumped: 1                                                                        
Tables dumped: 16                                                                        
Uncompressed data size: 3.03 MB                                                          
Compressed data size: 715.34 KB                                                          
Compression ratio: 4.2                                                                   
Rows written: 47268                                                                      
Bytes written: 715.34 KB                                                                 
Average uncompressed throughput: 3.03 MB/s                                               
Average compressed throughput: 715.34 KB/s

`スキーマをロードする`

ダンプしたスキーマを新しいスキーマにロードするには、util.loadDump()を使用します。

util.loadDump('~/sakila-schema-dump', {schema: 'sakila-3', threads: 8})

上記の例では、元のスキーマと同じMySQLインスタンスにダンプをロードしています。別のインスタンスにロードするには、{schema：'sakila-3'}を省略することで元のスキーマと同じ名前のスキーマにデータをロードします。

`複数のスキーマをダンプする`

複数のスキーマをダンプするには、下記のようにします。

util.dumpSchemas(['sakila', 'test-schema'], '~/multi-schema-dump', {threasds: 8})

上記を実行すると、次のようなメッセージが表示されます。

Acquiring global read lock
Global read lock acquired
Initializing - done
2 schemas will be dumped and within them 18 tables, 7 views, 6 routines, 6 triggers.
Gathering information - done
All transactions have been started
Locking instance for backup
Global read lock has been released
Writing global DDL files
Running data dump using 8 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing schema metadata - done      
Writing DDL - done        
Writing table metadata - done        
Starting data dump
8 thds chunking, 0 dumping - 0% (0 rows / ~46.41K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s co6 thds chunking, 0 dumping - 101% (47.27K rows / ~46.41K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00                                                                                                        101% (47.27K rows / ~46.41K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:01s                                                                 
Total duration: 00:00:01s                                                                
Schemas dumped: 2                                                                        
Tables dumped: 18                                                                        
Uncompressed data size: 3.03 MB                                                          
Compressed data size: 715.89 KB                                                          
Compression ratio: 4.2                                                                   
Rows written: 47270                                                                      
Bytes written: 715.89 KB                                                                 
Average uncompressed throughput: 1.81 MB/s                                               
Average compressed throughput: 427.61 KB/s

`複数のスキーマをロードする`

ダンプした複数のスキーマをロードするには、下記のようにします。

util.loadDump('~/multi-schema-dump')

スキーマのオプションは、単一のスキーマからデータをロードする場合にのみ利用可能です。また、ロードしたいインスタンスに同じ名前のスキーマやスキーマ内のテーブルが無いことを確認してください。スキーマ名やテーブル名に重複がある場合、エラーメッセージが返されます。ロードが完了すると、下記のようなメッセージが表示されます。

Opening dump...
Target is MySQL 8.0.33. Dump was produced from MySQL 8.0.33
NOTE: Load progress file detected. Load will be resumed from where it was left, assuming no external updates were made.
You may enable the 'resetProgress' option to discard progress for this MySQL instance and force it to be completely reloaded.
Scanning metadata - done      
Executing common preamble SQL
NOTE: [Worker003] Error processing table `sakila`.`staff`, will retry after delay: MySQL Error 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Executing DDL - done        
Executing view DDL - done      
Starting data load
1 thds indexing / 100% (3.03 MB / 3.03 MB), 0.00 B/s, 63 / 63 tables and partitions done
Executing common postamble SQL                                                         
Recreating indexes - done      
63 chunks (47.27K rows, 3.03 MB) for 18 tables in 2 schemas were loaded in 1 sec (avg throughput 3.03 MB/s)
0 warnings were reported during the load.

`インスタンスをダンプする`

MySQLインスタンスにある全スキーマをダンプするには、util.dumpSchemas()で全てのスキーマを呼び出すことでも可能ですが、util.dumpInstance()を使用すればさらに簡単にできます。util.dumpInstance()はinformation_schemaやperformance_schemaなどのMySQLシステムテーブルを除く、インスタンス上のすべてのスキーマをダンプします。

インスタンスにある全スキーマをダンプするには、下記のようにします。

util.dumpInstance('~/instance-dump', {threads:8})

上記を実行すると下記のようなメッセージが表示されます。

Acquiring global read lock
Global read lock acquired
Initializing - done
9 out of 13 schemas will be dumped and within them 88 tables, 14 views, 14 routines, 12 triggers.
5 out of 8 users will be dumped.
Gathering information - done
All transactions have been started
Locking instance for backup
Global read lock has been released
Writing global DDL files
Writing users DDL
Running data dump using 8 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing schema metadata - done      
Writing DDL - done          
Writing table metadata - done        
Starting data dump
3 thds dumping - 78% (384.72K rows / ~488.91K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compre                                                                                                       
100% (493.67K rows / ~488.91K rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:03s                                                                   
Total duration: 00:00:03s                                                                  
Schemas dumped: 9                                                                          
Tables dumped: 88                                                                          
Uncompressed data size: 89.97 MB                                                           
Compressed data size: 12.54 MB                                                             
Compression ratio: 7.2                                                                     
Rows written: 493666                                                                       
Bytes written: 12.54 MB                                                                    
Average uncompressed throughput: 24.97 MB/s                                                
Average compressed throughput: 3.48 MB/s

上記で取得したダンプを他のMySQLインスタンスにロードするには、下記のコマンドを実行します。

util.loadDump('~/instance-dump', {threads: 8})

上記のコマンドを実行すると、次のようなメッセージが表示されます。

Loading DDL and Data from '~/instance-dump' using 8 threads.
Opening dump...
Target is MySQL 8.0.33. Dump was produced from MySQL 8.0.33
NOTE: Load progress file detected. Load will be resumed from where it was left, assuming no external updates were made.
You may enable the 'resetProgress' option to discard progress for this MySQL instance and force it to be completely reloaded.
Scanning metadata - done      
Executing common preamble SQL
NOTE: [Worker005] Error processing table `sakila-3`.`staff`, will retry after delay: MySQL Error 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Executing DDL - done          
Executing view DDL - done      
Starting data load
2 thds loading - 3 thds indexing / 99% (89.97 MB / 89.97 MB), 29.66 MB/s, 131 / 133 tables and partition
Executing common postamble SQL                                                             
100% (89.97 MB / 89.97 MB), 29.66 MB/s, 133 / 133 tables and partitions done               
Recreating indexes - done
136 chunks (493.67K rows, 89.97 MB) for 88 tables in 9 schemas were loaded in 4 sec (avg throughput 26.31 MB/s)
0 warnings were reported during the load.

`ダンプ/ロードオプション`

本記事でご紹介した以外にも、データのダンプとロードをさらに便利にするオプションがあります。例えば、Oracle Cloud Infrastructure（OCI）ストレージバケットや、Amazon AWS S3互換サービス、Microsoft Azure Blob Storageに直接データをダンプおよびロードできるオプションなどです。本記事でご紹介しきれなかったオプションに関しては、下記のWebマニュアルでご紹介しています。

`総括`

マルチスレッド処理によってMySQL Shellはmysqldumpよりも高速にデータをダンプおよびロードすることが可能です。データダンプユーティリティとロードユーティリティは、対象のデータをオプションによって細かく設定できます。これらのオプションにより、複数のクラウドプラットフォームへのデータダンプや、データロードが可能になります。

MySQL Shellで行うマルチスレッド処理によるデータダンプ

ダンプ/ロード方法

テーブルデータをダンプする

`テーブルデータをロードする`

`スキーマをダンプする`

`スキーマをロードする`

`複数のスキーマをダンプする`

`複数のスキーマをロードする`

`インスタンスをダンプする`

`ダンプ/ロードオプション`

`総括`

Scott Stroz

MySQL Developer Advocate

MySQL Shellを用いたAWSからOCIへのMySQLデータベースの移転

◆OPNパートナー様限定 MySQL HeatWave Lakehouse リリースイベント開催のご案内 2023年7月19日

MySQL Shellで行うマルチスレッド処理によるデータダンプ

ダンプ/ロード方法

テーブルデータをダンプする

テーブルデータをロードする

スキーマをダンプする

スキーマをロードする

複数のスキーマをダンプする

複数のスキーマをロードする

インスタンスをダンプする

ダンプ/ロードオプション

総括

Authors

Scott Stroz

MySQL Developer Advocate

MySQL Shellを用いたAWSからOCIへのMySQLデータベースの移転

◆OPNパートナー様限定 MySQL HeatWave Lakehouse リリースイベント開催のご案内 2023年7月19日

`テーブルデータをロードする`

`スキーマをダンプする`

`スキーマをロードする`

`複数のスキーマをダンプする`

`複数のスキーマをロードする`

`インスタンスをダンプする`

`ダンプ/ロードオプション`

`総括`