How much DataMemory+IndexMemory do you need for disk data?

One thing we were guessing at with Massimo yesterday is, if you store
large blobs as disk data, how much will they consume DataMemory and
IndexMemory (primary key, each "chunk" has a hidden primary key, first 25x bytes
of blob stored in memory...)?

My empirical test showed that about 2% of the total size of blobs is needed for RAM
(25% of that is IndexMemory).

IMHO this is close to negligible, but in many situations not negligible
at all (may have close to TB of disk data -> 20GB of RAM needed for
disk data).

Also note that this is a minimum figure. If you actually have something
else than the blob (like other indexes) you of course use much more RAM.

The test was:
CREATE TABLE `jpgtest` (
`id` int(11) NOT NULL,
`jpg` blob,
PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=ndbcluster;

and inserting 100k blobs into that table (7+ GB in total).

Details below.

PS: Note that Johan just posted several excellent posts on using MySQL Cluster disk based data:
http://johanandersson.blogspot.com/2008/12/disk-data-summary.html
http://johanandersson.blogspot.com/2008/12/disk-data-counters-more.html
http://johanandersson.blogspot.com/2008/11/disk-data-counters.html

\*\*\*\*\*\* Loading 100k files as blobs into an NDB disk data table. \*\*\*\*\*
(Simple test, one datafile, one insert thread, etc...)

GRANT ALL ON \*.\* TO 'root'@'';
-- tablespace and undo log
CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.dat'
ENGINE NDB;

CREATE TABLESPACE ts_1
ADD DATAFILE 'data_1.dat'
USE LOGFILE GROUP lg_1
INITIAL_SIZE 10G
ENGINE NDB;

use test;

CREATE TABLE `jpgtest` (
`id` int(11) NOT NULL,
`jpg` blob,
PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=ndbcluster;

-bash-3.2$ cat loadpics.pl
#!/usr/bin/perl

use DBI;

# MySQL CONFIG VARIABLES
$hostname = "ndb05";
$database = "test";
$tablename = "jpgtest";
$user = "root";
$pw = "";

$dsn = "DBI:mysql:database=$database;host=$hostname;port=3306";

$dbh = DBI->connect($dsn, $user, $pw);
$drh = DBI->install_driver("mysql");

$n = 100000;
open FH, 'fakepic.jpg';
$jpg = ;

$i=0;
while(true)
{
$i++;

$sth = $dbh->prepare("INSERT INTO jpgtest VALUES (?, ?);");
$sth->bind_param(1, $i, {TYPE => SQL_INTEGER});
$sth->bind_param(2, $jpg, {TYPE => SQL_BLOB});
$sth->execute;
print "$i\\n";
}

-bash-3.2$ ls -lFh
total 108K
-rw-r--r-- 1 hingo hingo 100K 2008-12-09 16:26 fakepic.jpg
-rw-r--r-- 1 hingo hingo 634 2008-12-09 20:48 loadpics.pl

\*\*\*\*\*\*\*\*\*\*\*\*\*\*
Load speed:
real 24m36.396s
user 1m24.002s
sys 0m13.382s

mysql> select count(\*) from jpgtest;
77831

77831 records
1476,4 seconds
52,72 rows/sec <<<<<<<<<<<
102404 bytes/row
5398420,02 bytes/sec
5,15 MB/sec <<<<<<<<<<

\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
Data usage:
7970205724 bytes total
7,42 GB total (data inserted) <<<<<<<<<<<

[root@ndb05 mysql]# ls -laiFh /data1/mysqlcluster/ndb_2_fs
total 11G
21200937 -rw-r--r-- 1 root sroot 11G 2008-12-10 12:09 data_1.dat
21200936 -rw-r--r-- 1 root sroot 128M 2008-12-10 12:09 undo_1.dat
(The above means nothing, the sizes are as specified when created. However, it is interesting to note that 10GB in MySQL becomes 11GB in the filesystem...)

(This query from Johan's blog)
mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
+--------------+---------------+
| free_extents | total_extents |
+--------------+---------------+
| 5284 | 10240 |
| 5284 | 10240 |
+--------------+---------------+
(Interesting... Why are more than half of my extents still free,
even if I inserted 7 GB into a 10 GB file? Something in this is not right...)

-bash-3.2$ tail ndb_1_cluster.log
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Data usage is 11%(3704 32K pages of total 32768)
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Index usage is 28%(4725 8K pages of total 16416)

DataMemory
3704 pages
32 KB/page
118528 KB
115,75 MB

IndexMemory
4725 pages
8 KB/page
37800 KB
36,91 MB

RAM vs datafile ratio:
7,42 GB of largish blobs will use about
152,66 MB of RAM (indexes, hidden "cunk" indexes, beginning of each blob is in RAM...)
Conclusion: Allocate
2,01 %
of the size of your disk data blobs for RAM!

24,18 % of that is IndexMemory

Comments:

Post a Comment:
Comments are closed for this entry.
About

The people of the MySQL Telecom team writes about developments around MySQL and MySQL Cluster and how these products are used by our Communitcations industry customers. (Image jasmic@Flickr)

Search

Archives
« July 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
  
       
Today