TechTip: Possible savings from SMP Aware Compression Utilities

Mapping possible Space and Time savings by using SMP aware (threaded) compression Utilities ?


Content

  1. Mapping possible Space and Time savings by using SMP aware (threaded) …
    1. Content
    2. Role of Archivers and Compression in our Testing
    3. Can we use power of SMP for Space and Time savings ?
      1. I use this summary archivers pages to find not SMP aware compressors for …
    4. Parallelized compressor implementations
    5. Well Known Compressor and Archiver implementations for comparison
    6. Speed and Size test results
      1. Test result from ltore (FLAR cpio based) backup / restore used for Solaris
      2. Test result from R2 (disk partitions based) backup / restore used for …
      3. Test result analysis
        1. Parallel compressions implementations
        2. PIPE and kernel clock optimizations in R2
      4. Contacts:

Role of Archivers and Compression in our Testing

During our QA testing we use several imaging methods (the ability to back up and restore an OS with installed programs such as JES and the relevant test tools).

We use various archivers and compressors to save space on our image library server, but new OS versions keep getting larger, so the question arises whether we can save more space on images and/or make the backup and restore process faster.

Backing up and restoring images occupies a significant share of time in our testing process, and any time saved can significantly improve testing performance and flexibility.

Can we use power of SMP for Space and Time savings ?

  • Space savings: both main compressors, gzip and bzip2, have a tunable compression level, but better compression takes longer and is limited to the power of one processor
  • Time savings: most of our HW consists of relatively powerful SMP dual-processor machines (the newer ones are also dual core) with at least 4GB of memory

To use the power of SMP machines we need compressor ports that are multiprocessor (SMP) aware, that is, ports with parallelism through thread support.

Some benchmarks were done on 20+ processor HW (SunFire and Itanium): Parallel BZIP2 benchmarks, PARALLEL DATA COMPRESSION WITH BZIP2 (PDF)

Ideally we can achieve a speedup of nearly 2x on a dual-processor machine, and nearly linear scaling on multiprocessor machines. This is very promising!

I used these archiver summary pages to find compressors that are not SMP aware, for comparison:

List of file archivers, Archives listing, Comparison of file archivers

MaximumCompression, a site with test results for various file types using different compression tools; Well-known Archive Comparison Test

Parallelized compressor implementations

LZOP lzop From Wikipedia the free encyclopedia

lzop's main advantages over gzip are much higher compression and decompression speed (at the cost of some compression ratio).

LZOPSMP The world's fastest general purpose compression program is now even faster on multiple-CPU computers.

Portability: C (GCC) HIGH (Linux, AIX, Sun, DEC/Compaq/HP Tru64)

GZIP gzip From Wikipedia the free encyclopedia

PIGZ Parallel implementation of gzip, a pipe-driven compressor for use on multiprocessor/multicore machines

pigz requires libz 1.2.3; it will not work with lower versions!

Portability: C (GCC) HIGH (Linux, Solaris OK)

MGZIP Parallel implementation of gzip, a pipe-driven compressor for use on multiprocessor/multicore machines

mgzip is a program that makes use of SMP machines and zlib to use as many processors as you have to quickly compress files into a gzip-compatible format.

mgzip dates from 1993; it requires the old libz 1.1.x headers to compile, but links fine against the latest, faster libz 1.2.x versions.

There is a bug in gzip 1.2.4 that you must fix before reliably uncompressing files made with mgzip. Using the latest gzip is recommended.

Portability: C (GCC) HIGH (Linux, Solaris OK)

No parallel GZIP decompression is available.

BZIP2 bzip2 From Wikipedia the free encyclopedia

BZIP2SMP Parallel implementation of bzip2, a pipe-driven compressor for use on multiprocessor/multicore machines

The bzip2smp program incorporates modified libbzip2 sources (part of bzip2). The sources had to be modified because it was not feasible to split the RLE compression, block sorting and bit-storing stages apart with the stock library design. This separation was merely hacked in -- to do it the clean way, the library has to be redesigned.

Portability: C (GCC) HIGH (Linux, Solaris OK)

PBZIP2 Parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines

Unfortunately, pbzip2 does not support compression from stdin (meaning no "tar | pbzip2").

It does not produce archives identical to those of the original bzip2 (they are compatible, but larger). On the other hand, bz2 archives produced with pbzip2 can also be decompressed in parallel.
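The general idea behind this kind of parallel compressor can be sketched in Python: cut the input into blocks, compress each block independently on its own thread, and concatenate the resulting bz2 streams. This is a minimal illustration of the technique, not pbzip2's actual code; the function name, block size and sample payload are assumptions made for the example. The standard bz2 module decompresses concatenated streams, which is why such output stays compatible with ordinary bzip2 tools.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2(data, workers=4, block_size=900_000, level=9):
    """Compress fixed-size blocks independently and concatenate the streams."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so streams stay in block order.
        streams = list(pool.map(lambda block: bz2.compress(block, level), blocks))
    return b"".join(streams)


# Hypothetical payload standing in for a disk image (about 4.5 MB).
payload = b"".join(
    b"record %08d of a hypothetical disk image\n" % i for i in range(100000)
)

packed = parallel_bz2(payload)
restored = bz2.decompress(packed)  # accepts several concatenated streams
print(len(payload), "->", len(packed), "bytes, round trip ok:", restored == payload)
```

Because every block is a complete stream of its own, the blocks could also be decompressed independently, which matches the parallel decompression noted above.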

Portability: C++ (GCC) GOOD (Linux, Solaris OK)

P7ZIP Command-line port of 7-Zip to POSIX-conforming operating systems (like Linux)

p7zip implements the bzip2 format with threading in compression mode

Portability: C++ (GCC) GOOD (Linux, Solaris OK)

7-ZIP 7-Zip from Wikipedia the free encyclopedia

P7ZIP Command-line port of 7-Zip to POSIX-conforming operating systems (like Linux)

7z LZMA was designed and implemented as threaded in compression mode from the start

Portability: C++ (GCC) GOOD (Linux, Solaris OK)

Well Known Compressor and Archiver implementations for comparison

COMPRESS compress from Wikipedia, the free encyclopedia

INFOZIP Info-ZIP from Wikipedia the free encyclopedia

Speed and Size test results

Test result from ltore (FLAR cpio based) backup / restore used for Solaris

To be able to use the standard JumpStart network install, the compress binary in the Solaris install image was replaced with the respective decompression binary.

Solaris 10 u1 SPARC OS image with updates, OS bundled AS to JES5

Note: Our Solaris 10 u1 SPARC installation uses 3 833 931K of disk space and has 166 673 files.

Backup times

Compressor       Size     Time
compress         2059M    08:11 min
gzip -1          1581M    07:24 min
pigz -1p4        1575M    06:34 min
gzip -9          1462M    41:37 min
pigz -9p4        1460M    24:04 min
bzip2 -1         1397M    31:24 min
bzip2smp -1p4    1397M    18:30 min
bzip2 -9         1310M    56:04 min
bzip2smp -9p4    1310M    30:33 min
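To make the backup times above easier to compare, the speedup of each parallel run over its single-threaded counterpart can be computed directly from the table. The short Python sketch below does only that arithmetic; the times are copied from the table and the helper name is our own.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' string from the table into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (serial tool, serial time, parallel tool, parallel time) from the table.
pairs = [
    ("gzip -1", "07:24", "pigz -1p4", "06:34"),
    ("gzip -9", "41:37", "pigz -9p4", "24:04"),
    ("bzip2 -1", "31:24", "bzip2smp -1p4", "18:30"),
    ("bzip2 -9", "56:04", "bzip2smp -9p4", "30:33"),
]

speedups = {
    parallel: to_seconds(serial_time) / to_seconds(parallel_time)
    for _, serial_time, parallel, parallel_time in pairs
}

for parallel, factor in speedups.items():
    print(f"{parallel}: {factor:.2f}x faster than the serial run")
```

On these numbers the ratios fall between roughly 1.1x and 1.8x, with the largest gains at the highest compression levels.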

Restore times

Decompressor   Compressor      Size       Method   Time        Restore OS
compress       compress        2058.80M   ltore    53.73 min   sol9u8
compress       compress        2058.80M   ltore    15.68 min   sol10u3
gzunzip        compress        2058.80M   ltore    16.07 min   sol10u3_flar_gz
gzunzip        pigz-1p4        1574.65M   ltore    16.73 min   sol10u3_flar_gz
gzunzip        pigz-9p4        1459.71M   ltore    16.40 min   sol10u3_flar_gz
bunzip2        bzip2smp-1p4    1397.13M   ltore    22.03 min   sol10u3_flar_bgiz2
bunzip2        bzip2smp-9p4    1309.80M   ltore    25.70 min   sol10u3_flar_bgiz2

Test result from R2 (disk partitions based) backup / restore used for Linux, MS Windows

MS Windows 2000 AS server clean OS image

Backup times

Compressor       Size     Time
gzip -1          1183M    29:10 min
pigz -1p4        1368M    18:22 min
bzip2 -1         1161M    44:11 min
bzip2smp -1p4    1160M    27:47 min

MS Windows 2003 EE 64bit server OS image with full JES4 (with coms)

Backup times

Compressor              Size      Time
compress                15072M    19:23 min
lzop -1                 7461M     14:56 min
lzopsmp -lT8            7452M     14:02 min
gzip bb -1              6659M     61:23 min
gzip gnu -1             6671M     38:51 min
pigz -1p4               6750M     24:50 min
pigz -1p4 OPT1 *        6750M     18:59 min
pigz -1p4 OPT2 *        6750M     17:45 min
mgzip -1 -t 4           6895M     14:11 min
bzip2 -1                6636M     92:21 min
bzip2smp -1p4           6636M     51:40 min
pbzip2 -1p4 *           6625M     local min
p7zip-bzip2 -x1 -mt4    6628M     51:32 min
infoZIP -1              6856M     20:24 min
p7zip-7z -x1 -mt1       6195M     66:26 min
p7zip-7z -x1 -mt4       6195M     65:35 min

Restore times

Decompressor   Compressor   Size      Method     Time
compress       compress     15072M    NET-NFS    17:59 min
lzop           lzop         7461M     NET-NFS    11:09 min
gzip bb        gzip bb      6659M     NET-NFS    13:54 min
gzip gnu       gzip gnu     6671M     NET-NFS    12:19 min
gzip bb        pigz         6750M     NET-NFS    13:48 min
pbzip2 -p4     pbzip2       6625M     NET-NFS    23:46 min
bzip2          bzip2        6636M     NET-NFS    28:06 min
infoZIP        infoZIP      6856M     NET-NFS    14:47 min
p7zip-7z       p7zip-7z     6195M     NET-NFS    23:20 min

* See comments in the result analysis section

Test result analysis

Parallel compressions implementations

PIPE and kernel clock optimizations in R2

Linear growth of compression speed is the ideal situation; when we use a pipe, the real speed depends on the level of pipe and network optimization.

OPT1: Pipe optimization made it possible to improve from "pigz -1p4 24:50 min" to "pigz -1p4 18:59 min".

The pipe in the R2 backup process is as follows:

dd bs=block_size_in_MB if=device | pv -B buffer_size_in_MB | pigz -p number_of_threads > path_image_on_nfs_srv

Best performance was achieved on x4100 when

  • dd block size is 1MB
  • pv buffer size is 100MB
  • pigz uses number of processors * 2 threads
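As a sketch of how these settings fit together, the small Python helper below only builds the command line for the pipe described above, using the best-performing values as defaults. The function name, device and image path are hypothetical, and sizes are passed as plain byte counts to avoid relying on suffix handling in any particular dd or pv version.

```python
import shlex

MIB = 1024 * 1024


def r2_backup_command(device, image_path, cpus,
                      dd_block=1 * MIB, pv_buffer=100 * MIB):
    """Build the dd | pv | pigz shell pipeline as a single string."""
    threads = cpus * 2  # number of processors * 2, as in the list above
    return (
        f"dd bs={dd_block} if={shlex.quote(device)}"
        f" | pv -B {pv_buffer}"
        f" | pigz -p {threads}"
        f" > {shlex.quote(image_path)}"
    )


# Hypothetical device and NFS path for a dual-processor machine.
command = r2_backup_command("/dev/sda", "/mnt/nfs/image.gz", cpus=2)
print(command)
```

The string is meant to be reviewed and then run by a shell; the helper itself does not touch any device.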

OPT2: Changing the R2 kernel clock from 250Hz to 1000Hz increased speed from "pigz -1p4 18:59 min" to "pigz -1p4 17:45 min".

OPT3: In the previous optimizations we also increased the NFS packet size to 8K for both read and write.
