Tuesday Mar 31, 2009

Intelligent Performance - Solaris optimized for Nehalem

Over the last two years, Sun and Intel have been working together - from design and architecture through implementation - to ensure that Solaris is optimized to unleash the power and capabilities of current and future Intel Xeon processors. The compelling results include:
  • Increased performance as the Solaris OS takes advantage of Intel multi-core processor capabilities and Intel Turbo Boost Technology.
  • Optimized power efficiency and utilization by enabling Solaris to take advantage of the performance-enhanced dynamic power management capabilities of the Intel Xeon processor 5500 series (aka Nehalem).
  • Extended predictive capabilities that improve reliability by incorporating Nehalem features into the Solaris Fault Management Architecture (FMA).
Now, let's talk in more depth about the innovations within Solaris in combination with Intel's Nehalem architecture.

Intelligent Performance

We have optimized the performance of Solaris for individual cores and the overall multi-core microarchitecture, which increases both single- or multi-threaded performance. Intel Turbo Boost Technology uses any available power headroom to deliver higher clock rates. In those situations where the application requires maximum processing power, the Intel Xeon processor 5500 series increases the frequency in the active core when conditions such as load, power consumption and temperature permit it.

The Solaris threading model delivers sophisticated performance, with specific optimizations for the new Nehalem architecture. Solaris also takes advantage of the new Intel QuickPath Interconnect (QPI) through an optimized scheduler and memory placement optimization (MPO), which has proven performance benefits on non-uniform memory access (NUMA) systems because it reduces memory latency. The Solaris NUMA implementation takes its topology information from the Advanced Configuration and Power Interface (ACPI) System Resource Affinity Table (SRAT) and System Locality Information Table (SLIT).

Modern processors provide the ability to observe the performance characteristics of applications using performance counters. Solaris provides the libcpc(3LIB) API to access these performance counters. The following utilities provide information from libcpc:
  • cpustat -h provides a listing of all the different events that are available on a given processor.
  • cputrack is used to analyze performance characteristics on a per-process or per-LWP basis.
  • Performance Analyzer tools like collect, analyzer and er_print.
The DTrace CPU Performance Counter provider (cpc provider) makes available probes associated with processor performance counter events.

Automated Energy Efficiency

Solaris takes advantage of many power efficiency features in the Nehalem architecture. For example, an innovative Power Aware Dispatcher (PAD) has been integrated into OpenSolaris, enabling finer granularity in power management (P-states). We have seen a substantial reduction in idle power consumption, lower power consumption at maximum CPU utilization, and improved performance when switching between power states.

The kernel dispatcher - the part of the kernel that decides where threads should run - is integrated with the power management subsystem of the Nehalem CPU. The kernel can therefore use those parts of the processor that are active, and avoid doing work on those parts that are powered down.

PowerTOP is a new command line tool that shows how effectively a system is taking advantage of the processor's power management features. The application observes the system at regular intervals and displays a summary of how long the processor spends (on average) in each state.
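
To make the idea of an interval summary concrete, here is a minimal sketch of how time per state could be turned into percentages. It is not how PowerTOP itself is implemented; the function name, the sample format and the state names in the example are assumptions made purely for illustration.

```python
from collections import defaultdict


def state_residency(samples):
    """Return the share of observed time spent in each state, in percent.

    `samples` is a list of (state_name, seconds) pairs collected during
    one observation interval (hypothetical format).
    """
    totals = defaultdict(float)
    for state, seconds in samples:
        totals[state] += seconds
    observed = sum(totals.values())
    if observed == 0:
        return {}
    return {state: 100.0 * t / observed for state, t in totals.items()}


# Hypothetical samples from one 5-second observation interval.
samples = [("C0", 0.5), ("C1", 1.5), ("C3", 3.0)]
summary = state_residency(samples)
for state in sorted(summary):
    print(f"{state}: {summary[state]:.1f}%")
```

The longer a mostly idle system stays in the deeper states, the better it is using the processor's power management features.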

Reliability and Availability

Sun and Intel are working together to extend the capabilities of the Solaris Fault Manager by also supporting the Machine Check Architecture (MCA). The Fault Manager automatically diagnoses the underlying problems and responds by off-lining faulty components.

The ability to fast reboot a system drastically reduces downtime and improves efficiency. Fast Reboot is a command-line feature that enables you to reboot an Intel Xeon processor 5500 series system quickly, bypassing the BIOS, the power-on self-test (POST), and the GRUB boot loader. Fast Reboot (reboot -f) implements an in-kernel boot loader that loads the kernel into memory and then switches to that kernel, so that the reboot process completes within seconds.

The ability to work around processor errata in the operating system by applying microcode updates is available in Solaris. This support alleviates the need to upgrade a system's BIOS every time a new microcode update is required.


We have made optimizations to the compiler tools and runtime libraries, including full support for Streaming SIMD Extensions (SSE 4.2). To enable the automatic use of SSE instructions, specify -xvector=simd in your compiler options.

To speed up serial application performance on multithreaded chips like the Intel Xeon 5500 series, you can use the compiler option -xautopar. The compiler then generates code that executes the body of a parallelizable loop with more than one thread at runtime.

OpenMP is the de-facto standard for writing multi-threaded applications to run on multi-threaded machines. The OpenMP specification version 3.0 defines a rich set of directives, runtime routines, and environment variables that allow the programmer to write multi-threaded applications in C, C++, and Fortran. The Sun Studio software facilitates OpenMP development. The main motivations for using OpenMP are performance, scalability, portability, and standardization. With a relatively small amount of coding effort, a programmer can write multi-threaded applications to run on multi-threaded machines.


This video from David Steward (Intel) also provides some insights about the improvements in Solaris for the Nehalem architecture:

Sun provides everything necessary to get the best out of the latest Intel Xeon 5500 series CPUs: the hardware, with our innovative X-Series servers; the Solaris operating system, with all its enhancements; and a complete compiler environment that leverages all of these features for your application.


Tuesday Feb 10, 2009

Distributed Computing

Today everybody is thinking about distributed computing, virtualization and data center optimization. So, over the next few blog entries I am going to talk about distributed computing.

Distributed Computing and its Market

Historically, distributed computing was mostly used in education and research. Over the last year, that trend has changed tremendously. Finance and insurance companies are increasingly moving away from big iron systems towards grid-based solutions and distributed computing. It is not only much cheaper, but also much more efficient. Typical applications for grid computing are risk analysis, instant credit calculations, 3D calculations and simulations of any kind, such as weather, earthquake or semiconductor simulations. In fact, distributed computing is great for every application that can run multiple processes in parallel on different systems.

What do I need for Distributed Computing

Basically, distributed computing is very simple, and you need just a few tools and systems to run your own Grid / Cluster / HPC. Below is a short overview of what you need to go distributed:
  • 2 + N Server Nodes with same software stack
  • Centralized Storage that can be accessed from any System Node
  • A Scheduling Software that manages the processes which are distributed to the System Nodes
  • A Monitoring Tool that gives you some insight into how efficiently your System Nodes are being used
  • A Network for the efficient communication between the System Nodes, preferably > 1Gb Ethernet. Optimal would be Infiniband!
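
To give a feeling for what the scheduling software in that list does, here is a toy sketch that places each job on whichever node currently has the least work. Real grid schedulers are far more capable; the function, the job names and the cost numbers are all made up for illustration.

```python
import heapq


def schedule(jobs, nodes):
    """Assign each (name, cost) job to the currently least-loaded node.

    Jobs are placed largest first, a common greedy heuristic.
    """
    heap = [(0, node) for node in nodes]  # (accumulated load, node name)
    heapq.heapify(heap)
    placement = {node: [] for node in nodes}
    for name, cost in sorted(jobs, key=lambda job: -job[1]):
        load, node = heapq.heappop(heap)
        placement[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return placement


# Hypothetical jobs with relative costs, spread over two nodes.
jobs = [("risk-a", 4), ("risk-b", 3), ("sim-c", 2), ("sim-d", 1)]
placement = schedule(jobs, ["node1", "node2"])
print(placement)
```

Both nodes end up with the same total cost, which is exactly what you want the scheduler to achieve on a grid.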


In the next few blog entries, I am going to talk about:
  • Mastering the Grid
  • Storage in the Grid
  • Monitoring the Grid
  • Into the Cloud!
So, have fun!

Wednesday Feb 04, 2009

The Screaming Fast Sun Modular Storage 6000 Family!

Did you know that both Sun StorageTek 6140 and 6540 disk arrays, which belong to our Modular Storage Line, are still leading the price/performance rankings in their class? Feel free to verify at StoragePerformance.org. The modular approach and the ability to upgrade from the smallest to the biggest system just by exchanging controllers is unique, and our customers love this investment protection!

The Uniqueness of the Sun Storage 6000 Family

Today, the 6000 modular storage portfolio looks as follows:
  • 6140-2 (up to 64 Disk Drives mixed FC and SATA)
  • 6140-4 (up to 112 Disk Drives mixed FC and SATA)
  • 6540 (up to 224 Disk Drives mixed FC and SATA)
  • 6580 (up to 256 Disk Drives mixed FC and SATA)
  • 6780 (up to 256/448* Disk Drives mixed FC and SATA)

All 6000 series arrays use an ASIC (Application Specific Integrated Circuit) to do RAID operations. This results in very low latency overhead and guaranteed performance. The published cache size is 100% dedicated to the ASIC and can't be accessed by the management CPU, which has its own memory (a separate 2GB of RAM in the case of the 6780). Across the complete family, you have upgrade protection.

You can start with a 6140-2 and seamlessly upgrade to a 6780 by just replacing the controllers! No configuration changes or exports are necessary, as the complete RAID configuration is distributed to each single disk in the array. You can also move a complete RAID group to a different array in the family. Of course, you should make sure that both arrays are running the same firmware level. ;-)

Sun StorageTek 6780 Array

Today, Sun announced its latest and greatest midrange disk array. It completes the modular line as the high end model of the 6000 series. The connectivity of the storage array and its features are very impressive and pretty much unique in the midrange segment!
  • Replaceable Host Interface Cards (two per Controller)
    • Up to 16x 4Gb or 8Gb* FC Host Channels
    • Up to 8x 10Gb* Ethernet Host Channels
  • 16x 4Gb FC Drive Channels
  • Up to 16x/28x* Drive Enclosures
  • Up to 32GB* dedicated RAID Cache
  • RAID 0,1,3,5,6,10
  • Up to 512 Domains = up to 512 servers with dedicated LUN mapping can be attached to the array
  • Enterprise Features:
    • Snapshot
    • Data Copy
    • Data Replication
Below are some insights into the architecture of the 6780 Array:


The internal flash storage lets the array survive long power outages without losing I/O that has not yet been written to disk. As you can see, each drive chip has access to all disk drives. Everything in the controller and drive enclosure has a redundancy factor of at least two. In some cases, like the drive chips, the redundancy factor is even higher.


The expansion trays are SBODs (Switched Bunch Of Disks) and therefore limit the impact of a drive failure. Most other vendors still use looped JBODs, where the whole loop is vulnerable if a drive fails. In the worst case, a complete tray could fail just because of a single failing drive. Looped BODs are also slower than switched BODs.


Due to the high number of drive channels, the maximum drive count per dual 4Gb FC loop is 64 (with 448 drives). With 256 drives, you will only have 32 drives per dual 4Gb FC loop. Because of this, and thanks to the dedicated ASICs for RAID calculations, the 6780 array can do up to 175,000 IOPS and 6.4GB/s throughput in disk read operations. This is for sure the top rank in the midrange segment!


By now, you should know that Sun is NOT a me-too manufacturer in the storage business. Our modular storage family uses leading-edge technology and delivers investment protection by providing an easy upgrade path to the next higher controller level.

*Will be available after initial release.

Sunday Feb 01, 2009

Traditional Arrays vs "The Open Storage Approach"

Why should I still use a traditional Array?

You may ask yourself why you should use a traditional array if Sun is pushing towards OpenStorage. Good question! Just as there isn't a cow that provides milk, coke and beer, there isn't a storage product that does everything for you ... today. So, while our OpenStorage family is perfect today for IP-network-oriented access like CIFS, NFS and iSCSI, it doesn't yet cover the FC block-attached community. And with all the respect that ZFS and OpenSolaris deserve, an ASIC, if you have the money and the skills to build one, will do faster RAID calculations. ASICs do not require an operating system underneath the RAID code, which results in far less latency in the calculations.

The Unanswered Question ...

There is one unanswered question that remains in the IT business. How long can companies afford to build ASICs that keep up with the performance increases in the volume business? ASICs, as the name states, are built for a certain purpose and are therefore manufactured in much lower volumes. This means they are simply much more expensive than general-purpose CPUs.

Another question might give you an impression of the future: who is still programming in assembler? Every programmer knows that if you write perfect assembler code, no C, C++ or Java program, and I really mean NO program, will ever run faster than your assembler program, right? But programming in assembler gets so complex that you can no longer manage your code. That's why we use abstraction layers to simplify your business.... Got a hint?

Now, there is also a huge design problem with a dedicated ASIC. You cannot extend its features by just upgrading the software, as it is hardware. An ASIC can do what it's built for, and is therefore very limited when it comes to extending features! From a manufacturing and design perspective, this can be very limiting. One little thing missing or wrong in an ASIC, and the complete product fails without a chance to fix or change it. Uhhh, you better make no mistake ...


So, depending on your requirements, you will have to choose the appropriate technology! If you can afford the no-compromise way of storage, the best solution is to have both, or maybe a combination of the two. :-)

From a long-term perspective, I only see one solution that survives. The combined approach of commodity hardware and software provides the key elements that will succeed. Namely, these are:
  • Great price/performance
  • Possibility to add features (in best case for free) with easy upgrades
If the software used is open source, you suddenly have the ability to add features to the subsystem yourself! One example is the COMSTAR project, which turns an OpenSolaris host into a SCSI target.


So, you better keep the OpenStorage Vision of Sun in your mind.

See how the L2ARC in OpenStorage works

Brendan Gregg from the Fishworks Engineering Team did a tremendous job in testing the behavior of the L2ARC in a well populated Sun Unified Storage Server 7410. It is a great introduction to how the L2ARC works in combination with SSD technology! To read more about it, click here.

Thursday Jan 15, 2009

OpenSolaris 2008.11, Closing the gap or leapfrogging Linux?

When OpenSolaris 2008.05 was released in May 2008, the feedback from several communities and magazines was generally positive, but with several hints that OpenSolaris still had a long way to go to close some gaps to Linux. They often mentioned the number of pre-compiled (packaged) applications available, the package management and the lack of drivers for certain devices.

With the release of OpenSolaris 2008.11 we can say that we have addressed many of the mentioned topics. We have also improved other areas that we believe are simply unique in the Linux/UNIX world! Very often people just talk about the front-end and the applications that run on your desktop, ignoring that OpenSolaris is not just another Linux, in fact it is a UNIX!

Since the release of OpenSolaris 2008.05 the installed base has doubled, and more than 80 OpenSolaris user groups are now spread around the world.

Let me now introduce some new features and key differentiators we have compared to any other Linux/UNIX:

If you look what made OpenSolaris more and more popular, it is certainly ZFS and DTrace.
  • Now DTrace is probably not the feature that you would use as a standard computer user, but it is essential for every developer to help him improve the quality of his application and discover bugs. DTrace allows you to debug your application on the fly while it is running in OpenSolaris, without special debugging code (binaries). No other operating system is capable of doing this, except FreeBSD and Mac OS X. The guys at FreeBSD are adopting and implementing a lot of cool features first introduced in Solaris and OpenSolaris, like ZFS.

  • Many people are switching to OpenSolaris because of ZFS. The very simple interface and the tremendous functionality of ZFS make this file system the best file system ever. This leads me to the newly added feature called Time Slider. Time Slider is a combination of features built on top of ZFS and integrated into the GNOME file browser. It basically allows you to slide the time of your file system back and recover files that you have, for example, accidentally deleted or modified. You can compare this feature with Time Machine on the Mac, with the exception that you don't need an external disk to do this!
This video shows how Time Slider looks and works:

We have tremendously improved the application stack including:
  • updated GNOME to release 2.24
  • latest version of Firefox including DTrace probes for debugging web applications and behaviour
  • integration of Songbird which is built on the Mozilla Framework and is simply a great music player with a touch/look-and-feel from iTunes ;-)
  • new fully featured AMP stack including Ruby and DTrace probes for easier troubleshooting
  • integration of OpenOffice 3.0
Further cool enhancements are:
  • more Open Source Software Support than ever before
  • tunings of OpenSolaris core components
  • support and performance optimizations for Intel Core Micro Architecture (upcoming Intel CPUs, Codename Nehalem)
  • introducing power efficiency optimizations into datacenters and not only notebooks
  • virtualization optimizations to optimize runtime environments
  • improved package management software
  • proper and clean sleep and resume functions for notebooks
See what Intel says about their new Core Micro Architecture and the relationship with OpenSolaris!

Intel and AMD are also helping to optimize the IOMMU, which manages DMA. Want to know what DMA and IOMMU are? Check this video for more information.

One great and unique feature of OpenSolaris is the simple and easy distribution upgrade. There is no reason to worry if you are not satisfied with the new release of the operating system. When upgrading to the next release, the system automatically creates a bootable ZFS snapshot of your old release, allowing you to boot your old environment whenever you want. Which Linux does this? There are also not many Linux distributions that allow easy major release upgrades. As far as I know, only the ones that are based on the Debian package manager do.

In further news, Sun and Toshiba announced pre-configured OpenSolaris notebooks that will be available soon, and Zmanda's backup solutions have been certified for OpenSolaris.

The combination of Zmanda’s robust backup solutions (Amanda) with Sun’s innovative OpenSolaris operating system, including the advanced ZFS file system, creates one of the most advanced backup-to-disk offerings available on the market today. Specifically, the snapshot capability of ZFS enables fast and scalable backups of today’s most demanding workloads. With the new features and record-breaking performance introduced in OpenSolaris 2008.11, we demonstrate our commitment to rapidly innovating on the open source ecosystem.

Amanda Enterprise is an enterprise-grade network backup solution based on Amanda, the world’s most popular open source data backup and recovery software. It is a powerful, low-cost, open source solution that protects OpenSolaris, Solaris, Linux, Windows, and Mac OS X environments using a single management console.

Did you know that our Sun Storage 7000 Series is also certified for Amanda Enterprise Backup?


Well, we might not have completely closed the gaps to the Linux community in the desktop area, but OpenSolaris is clearly leapfrogging Linux on the backend by using ZFS as the most flexible, scalable and innovative file system ever and by providing unique DTrace functionality to our open source developers. Today OpenSolaris is my personal choice for a Web 2.0 based environment!

Tuesday Nov 25, 2008

ZFS - A Smashing Hit

See how Jim Hughes demonstrates how ZFS compresses data and if failure occurs, the data is still there! This is a very practical video ;-) Have fun!

Thursday Nov 13, 2008

New Class Of Storage Systems - Sun Storage 7000 Unified Storage Systems

I have been blogging for a while about Open Storage, ZFS Hybrid Storage Pools and Solid State Disks. Now the products that combine all of those technologies are available and will disrupt the complete storage market!

You may ask why? Here are a few reasons:
  1. There is simply no price-competitive system in the high-end market
  2. Our system has no license fees - all inclusive, incl. future features
  3. There is no other system that has in-depth built-in analytics
  4. There is no other system that combines new SSD technology and traditional storage
  5. No other system has a rock-solid OS like Solaris with all its features like DTrace and the Fault Management Architecture (FMA)
I could mention dozens more reasons, but that should already be enough to seriously consider those systems for your business!

So, enough of the marketing stuff. I would like to go a bit deeper and give you a short introduction to the products and their extraordinary features!

Sun Storage 7000 Unified Storage Systems

We have announced three different Unified Storage Systems to begin with. The two smaller versions are single node systems, while the 7410 can be used in an active/active cluster (2 Nodes). All systems are fully licensed and run the same OS. There are no feature restrictions on the smaller systems, except the ones imposed by the hardware configuration.

Sun Storage 7110 Unified Storage System

The 7110 is the entry level Unified Storage System. It has the following hardware specifications:
  • 14x Usable Disk Drives
  • Quad-Core Opteron
  • 8GB RAM
  • 4x 1 Gigabit Ethernet Ports
  • 6x PCI-E Slots per Node
  • 1Gb-E and 10 Gb-E Network Interface Cards
  • FC/SCSI HBA Options for Backup/Restore
Today the system is equipped with 14x 146GB 10k RPM Disks. In the future you will be able to have it also equipped with 14x 500GB SATA Disks.

The 7110 is the only system that cannot be equipped with SSDs for now. The system is perfectly suited as workgroup storage and uses just 2u of rack space.

Sun Storage 7210 Unified Storage System

The 7210 is the ultimate dense Unified Storage System. It not only has a lot of disks, but also quite a lot of caching and CPU power. Here are some hardware specifications:
  • 44-46x Usable Disk Drives
  • 0-2x LogZilla 18GB SSDs
  • Dual Quad-Core Opteron
  • 32GB/64GB RAM
  • 4x 1 Gigabit Ethernet Ports
  • 3x PCI-E Slots per Node
  • 1Gb-E and 10 Gb-E Network Interface Cards
  • FC/SCSI HBA Options for Backup/Restore
The 7210 is the ultimate dense storage pod! In combination with the 44TB Storage and the LogZilla write acceleration the system can provide up to 780MB/sec throughput! All of this in only 4u Rack Space!

Sun Storage 7410 Unified Storage System

The 7410 is our highly available and performant Unified Storage System. The 7410 System supports two configurations, a single node and a 2-node cluster for high availability. Each configuration has three levels, an Entry, Mid and High level, where the main differences are in compute power. Here is an overview of the hardware specifications:
  • Up to 576x Usable Disk Drives
  • Up to four Quad-Core Opteron per Node
  • Up to 4x LogZilla 18GB SSDs per Node
  • Up to 6x ReadZilla 100GB SSDs per Node
  • Up to 128GB RAM per Node
  • 4x 1 Gigabit Ethernet Ports per Node
  • 6x PCI-E Slots per Node
  • 1Gb-E and 10 Gb-E Network Interface Cards
  • FC/SCSI HBA Options for Backup/Restore
Have you ever seen a storage system that had 128GB Cache per Controller? We go even further by adding 600GB L2ARC Cache! So in fact if you go for the big cluster, you will have 256GB L1ARC Cache and 600GB L2ARC Cache. Again, this is where we start today; imagine how much cache we will have in the future.

The 7410 is based on compute nodes (Heads) and storage nodes (JBODs). In terms of compute power, you have 16 cores per head to do all storage and file system related work. In a clustered configuration you will have 32 cores that can work in parallel (active/active)! Together, the heads therefore have a theoretical IO capability of more than 1.6 million IO/s!

A storage node is a 4u rack mountable chassis that can hold up to 24 disks. You can attach up to 24x storage nodes to this system which will give you a total of 576 disk drives.
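
The disk and core counts quoted for the 7410 can be checked with a few lines of arithmetic. The numbers below come straight from the text and the specification list; only the constant names are mine.

```python
# Figures taken from the 7410 description; names are illustrative.
DISKS_PER_STORAGE_NODE = 24
MAX_STORAGE_NODES = 24
QUAD_CORE_CPUS_PER_HEAD = 4
CORES_PER_CPU = 4
HEADS_IN_CLUSTER = 2
RAM_GB_PER_HEAD = 128

max_disks = DISKS_PER_STORAGE_NODE * MAX_STORAGE_NODES
cores_per_head = QUAD_CORE_CPUS_PER_HEAD * CORES_PER_CPU
cores_per_cluster = cores_per_head * HEADS_IN_CLUSTER
ram_gb_per_cluster = RAM_GB_PER_HEAD * HEADS_IN_CLUSTER

print(f"disks:           {max_disks}")
print(f"cores per head:  {cores_per_head}")
print(f"cores (cluster): {cores_per_cluster}")
print(f"RAM (cluster):   {ram_gb_per_cluster} GB")
```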

The Sun Storage 7410 implements a true ZFS Hybrid Storage Pool with support for Flash-memory devices for acceleration of Reads (100GB Read Flash Accelerator, aka Readzilla) and Writes (18GB Write Flash Accelerator, Logzilla). Multiple configurations are provided on both the node configuration and the expansion array to accommodate the most demanding customer application performance requirements. You can find more details about the SSD Integration below in the feature section.

Extraordinary Features

Now that we have seen what hardware features the three Unified Storage Systems have, I would like to go a bit deeper into the software features. These are in fact the features that make these products so unique and interesting!

SSD Integration / Hybrid Storage Pools

The Sun Storage 7000 system uses a Flash Hybrid Storage Pool design, which is composed of optional Flash-memory devices for acceleration of reads and writes, low-power and high-capacity enterprise-class SATA disks, and DRAM memory. All these components are managed transparently as a single data hierarchy, with automated data placement by the file system. In the Storage 7410 model, both Write Flash Accelerator (write-optimized SSD, aka LogZilla) and Read Flash Accelerator (read-optimized SSD, aka ReadZilla) are used to deliver superior performance and capacity at lower cost and energy consumption than competitive solutions. The Storage 7210 currently implements only write-optimized SSD, and the Storage 7110 does not currently implement this design.

ZFS provides two dimensions for adding flash memory to the file system stack to improve overall system performance: the L2ARC (Level 2 ARC) for random reads, and the ZIL (ZFS Intent Log) for writes. The L2ARC (ARC is the ZFS main memory cache in DRAM) sits in between memory cache and disk drives and extends the main memory cache to improve read performance. The ZFS Intent Log uses Write-Flash SSD disks as log devices to improve write performance.

The main reason why we have chosen different SSDs (ReadZilla, LogZilla) lies in the fact that flash-based SSDs are still quite expensive and have some limitations in how they write data. The LogZilla SSDs have a more complex controller chip that can handle thousands of write IO/s, a bigger DRAM cache and a capacitor that ensures that no IO gets lost between DRAM and the flash chips in case of a power outage. LogZilla SSDs are therefore optimized for writes, while the ReadZilla SSDs are optimized for read operations.
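
The read path through that hierarchy can be pictured with a toy two-tier cache: look in DRAM first, then in flash, and only then go to disk. This is a deliberately simplified sketch and not how ZFS is implemented. In particular, it uses plain LRU eviction and moves blocks to flash at eviction time, both of which are simplifying assumptions; the class and all names in it are hypothetical.

```python
from collections import OrderedDict


class TwoTierReadCache:
    """Toy model: small RAM cache, larger flash cache, slow disk behind."""

    def __init__(self, ram_slots, flash_slots, disk):
        self.ram = OrderedDict()    # stands in for the DRAM cache
        self.flash = OrderedDict()  # stands in for the flash read cache
        self.ram_slots = ram_slots
        self.flash_slots = flash_slots
        self.disk = disk            # dict acting as the disk pool
        self.hits = {"ram": 0, "flash": 0, "disk": 0}

    def _put_ram(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_slots:
            # Demote the least recently used block to flash.
            old_key, old_value = self.ram.popitem(last=False)
            self.flash[old_key] = old_value
            self.flash.move_to_end(old_key)
            if len(self.flash) > self.flash_slots:
                self.flash.popitem(last=False)

    def read(self, key):
        if key in self.ram:
            self.hits["ram"] += 1
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.flash:
            self.hits["flash"] += 1
            value = self.flash.pop(key)
        else:
            self.hits["disk"] += 1
            value = self.disk[key]
        self._put_ram(key, value)
        return value


disk = {f"block{i}": i for i in range(4)}
cache = TwoTierReadCache(ram_slots=1, flash_slots=2, disk=disk)
for key in ["block0", "block1", "block0", "block0"]:
    cache.read(key)
print(cache.hits)
```

In this run the third read is served from flash instead of disk, which is the whole point of putting a second cache level between memory and the disk drives.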

Realtime Analytics

Realtime Analytics is one of the coolest features in this product and was only possible because Solaris has DTrace built in. The Sun Storage 7000 Systems are equipped with DTrace Analytics, an advanced DTrace-based facility for server analytics. DTrace Analytics provides real-time analysis of the Storage 7000 System and of the enterprise network, from the storage system to the clients accessing the data. It is an advanced facility to graph a variety of statistics in real-time and record this data for later viewing. It has been designed for both long term monitoring and short term analysis. When needed, it makes use of DTrace to dynamically create custom statistics, which allows different layers of the operating system stack to be analyzed in detail.

Analytics has been designed around an effective performance analysis technique called drill-down analysis. This involves checking high level statistics first, and then focusing on finer details based on the findings so far. This quickly narrows the focus to the most likely areas.

So how does this work?

You may discover a throughput problem on your network. By selecting the interface that is causing you a headache, you can drill down by protocol and even deeper, to the NFS client that causes the high load. We don't stop here: you can drill down further to figure out what kind of files the NFS client is accessing, at what latency, etc. DTrace Analytics creates datasets as you are drilling down. These datasets can be stored and reused at a later time. The analytic data is not discarded - if an appliance has been running for two years, you can zoom down to by-second views for any time in the previous two years for your archived datasets. The data is stored on a compressed file system and can be easily monitored. You can destroy datasets on demand or export them as CSV.
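
The drill-down itself is easy to picture as repeated "break down by one more dimension" steps. The sketch below works on a handful of invented records and has nothing to do with how the appliance collects its data; the record fields, interface names and byte counts are all hypothetical.

```python
from collections import Counter


def breakdown(records, dimension, **filters):
    """Sum bytes per value of `dimension` for records matching `filters`."""
    totals = Counter()
    for rec in records:
        if all(rec[key] == value for key, value in filters.items()):
            totals[rec[dimension]] += rec["bytes"]
    return totals.most_common()


# Invented sample data, for illustration only.
records = [
    {"interface": "nge0", "protocol": "NFS", "client": "10.0.0.5",
     "file": "/export/a.iso", "bytes": 900},
    {"interface": "nge0", "protocol": "NFS", "client": "10.0.0.7",
     "file": "/export/b.txt", "bytes": 50},
    {"interface": "nge0", "protocol": "CIFS", "client": "10.0.0.9",
     "file": "/share/c.doc", "bytes": 120},
    {"interface": "nge1", "protocol": "iSCSI", "client": "10.0.1.2",
     "file": "lun0", "bytes": 200},
]

# Step 1: which protocol is busiest on the suspicious interface?
by_protocol = breakdown(records, "protocol", interface="nge0")
# Step 2: which client generates that NFS load?
by_client = breakdown(records, "client", interface="nge0", protocol="NFS")
# Step 3: which files is that client touching?
by_file = breakdown(records, "file", interface="nge0", protocol="NFS",
                    client="10.0.0.5")
print(by_protocol, by_client, by_file, sep="\n")
```

Each step keeps the filters chosen so far and adds one more dimension, which is how the focus narrows from an interface down to a single file.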

Other Features

The Unified Storage Systems have a lot of other features which I will cover in short.

Data Compression

Data compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. The Sun Storage 7000 System software supports 4 levels of data compression, LZJB and 3 levels of GZIP. Shares can optionally compress data before writing to the storage pool. This allows for much greater storage utilization at the expense of increased CPU utilization. In the Sun Storage 7000 family, by default, no compression is done. If the compression does not yield a minimum space savings, it is not committed to disk to avoid unnecessary decompression when reading back the data.
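
The "only keep it if it saves enough" rule can be sketched in a few lines. This uses Python's zlib module (DEFLATE, the algorithm behind gzip) as a stand-in for the appliance's compressors, and the 12.5% threshold is an assumption chosen for the example, not a documented value of the product.

```python
import os
import zlib

MIN_SAVINGS = 0.125  # assumed threshold, for illustration only


def encode_block(data, level=6):
    """Return (stored_bytes, is_compressed) for one block of data."""
    compressed = zlib.compress(data, level)
    if len(compressed) <= len(data) * (1 - MIN_SAVINGS):
        return compressed, True
    # Not worth it: store the original and skip decompression on reads.
    return data, False


def decode_block(stored, is_compressed):
    return zlib.decompress(stored) if is_compressed else stored


text = b"abc" * 1000       # highly repetitive, compresses well
noise = os.urandom(4096)   # random bytes, effectively incompressible

text_stored, text_flag = encode_block(text)
noise_stored, noise_flag = encode_block(noise)
print(len(text), "->", len(text_stored), "compressed:", text_flag)
print(len(noise), "->", len(noise_stored), "compressed:", noise_flag)
```

The repetitive block is stored compressed, while the random block is written as-is, so reading it back later costs no CPU time for decompression.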


Snapshots

A snapshot is a read-only copy of a file system or volume. Snapshots can be created almost instantly, and initially consume no additional disk space within the pool. When data within the active dataset changes, the snapshot consumes disk space by continuing to reference the old data, which prevents that space from being freed. Snapshots are the basis for replication and just-in-time backup.
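
Why a snapshot is free at first and only starts to cost space later becomes obvious in a toy copy-on-write model. This is a sketch of the principle only; the class, its block bookkeeping and the file names are invented and far simpler than what ZFS really does.

```python
class ToyPool:
    """Toy copy-on-write pool: data is never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> data
        self.next_id = 0
        self.live = {}       # file name -> block id (active dataset)
        self.snapshots = {}  # snapshot name -> frozen {file name: block id}

    def write(self, name, data):
        # Allocate a new block instead of modifying the old one.
        self.blocks[self.next_id] = data
        self.live[name] = self.next_id
        self.next_id += 1
        self._free_unreferenced()

    def snapshot(self, snap_name):
        # Only the block references are copied, not the data.
        self.snapshots[snap_name] = dict(self.live)

    def _free_unreferenced(self):
        referenced = set(self.live.values())
        for snap in self.snapshots.values():
            referenced.update(snap.values())
        for block_id in list(self.blocks):
            if block_id not in referenced:
                del self.blocks[block_id]

    def used_blocks(self):
        return len(self.blocks)


pool = ToyPool()
pool.write("report", "v1")
before = pool.used_blocks()
pool.snapshot("monday")
after_snapshot = pool.used_blocks()   # snapshot itself costs nothing
pool.write("report", "v2")
after_change = pool.used_blocks()     # old block is kept for the snapshot
old_data = pool.blocks[pool.snapshots["monday"]["report"]]
print(before, after_snapshot, after_change, old_data)
```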

Remote Replication

The Sun Storage 7000 Remote Replication can be used to create a copy of a filesystem, group of filesystems or LUNs from any Storage 7000 System to another 7000 system at a remote location through an interconnecting TCP/IP network that is responsible for propagating the data between them. Replication transfers the data and metadata in a project and its component shares either at discrete, point in time snapshots or continuously. Discrete replication can be initiated manually or occur on a schedule of your own creation. With continuous replication, data is streamed asynchronously to the remote appliance as it's modified locally at the granularity of storage transactions to ensure data consistency. In both cases, data transmitted between appliances is encrypted using SSL.

iSCSI Block Level Access

The Sun Storage 7000 family of products acts as an iSCSI target for several iSCSI hardware and software initiators. When you configure a LUN on the appliance you can specify that it is an Internet Small Computer System Interface (iSCSI) target. The service supports discovery, management, and configuration using the iSNS protocol. The iSCSI service supports both unidirectional (target authenticates initiator) and bidirectional (target and initiator authenticate each other) authentication using CHAP. Additionally, the service supports CHAP authentication data management in a RADIUS database. You can even do thin provisioning with iSCSI LUNs, which means they grow on demand.
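
For readers who have never looked at CHAP: the secret itself never crosses the wire, only a hash over a random challenge. The sketch below follows the response calculation defined for CHAP with MD5 in RFC 1994 (identifier, then secret, then challenge); the function names and the secret are made up, and this is an illustration rather than the appliance's code.

```python
import hashlib
import hmac
import os


def chap_response(identifier, secret, challenge):
    """MD5 over identifier byte + shared secret + challenge (RFC 1994)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier, secret, challenge, response):
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Unidirectional case: the target challenges the initiator.
secret = b"hypothetical-shared-secret"
identifier = 1
challenge = os.urandom(16)                                   # sent by target
response = chap_response(identifier, secret, challenge)      # from initiator

print("accepted:", verify(identifier, secret, challenge, response))
print("wrong secret:", verify(identifier, b"guess", challenge, response))
```

For bidirectional authentication the same exchange simply runs a second time in the other direction, with the initiator challenging the target.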

Virus Scan

This feature allows the Storage 7000 family to be configured as a client of an antivirus scan engine. The Virus Scan service will scan for viruses at the filesystem level. When a file is accessed from any protocol, the Virus Scan service will first scan the file, and both deny access and quarantine the file if a virus is found. Once a file has been scanned with the latest virus definitions, it is not rescanned until it is next modified.

NDMP Backup and Restore

Backup and restore is one of the primary goals of enterprise storage management. Backups and restores should run in a timely, secure, and cost-effective manner across all the operating systems in the enterprise. Companies need high-performance backup and the ability to back up data to local media devices. While the data itself may be distributed throughout the enterprise, its cataloging and control must be centralized. The emergence of network-attached storage and dedicated file servers makes storage management more challenging. The Network Data Management Protocol (NDMP) was designed to address these issues. NDMP is an opportunity to provide truly enterprise-wide heterogeneous storage management solutions - permitting platforms to be driven at a departmental level and backup at the enterprise level.

The Sun Storage 7000 Systems support NDMP v3 and v4.

Phone-Home of Telemetry for all Software and Hardware Issues

Phone-home provides automated case opening when failures are detected in the system. This ensures a faster time to resolution and reduces the time needed to figure out what the problem might be.

End-to-End Data Integrity and self-healing mechanisms

The Sun Storage 7000 systems include FMA (Fault Management Architecture), which detects faulty hardware components and takes them offline in order to prevent system disruption. In addition, to avoid accidental data corruption, the ZFS file system provides memory-based end-to-end data and metadata checksumming with self-healing capabilities to fix potential issues. FMA, combined with the ZFS data integrity facilities, makes the Sun Storage 7000 the most comprehensive self-healing unified storage system.
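The idea behind checksum-based self-healing can be sketched as follows: every read is verified against a checksum that is stored separately from the data, and a bad copy is repaired from a good one. This is a toy model under stated assumptions, not ZFS code: the mirror is a plain Python list, SHA-256 is used only for illustration, and `self_healing_read` is a hypothetical name.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def self_healing_read(copies: list, expected: bytes) -> bytes:
    """Return a copy whose checksum matches and repair any copy that does not."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("all copies failed checksum verification")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good   # rewrite the damaged copy from the good one
    return good


original = b"account balance: 1000 CHF"
expected = checksum(original)               # kept apart from the data itself

mirror = [b"account balance: 9000 CHF",     # silently corrupted copy
          original]                         # intact copy

data = self_healing_read(mirror, expected)
print(data)
print("mirror repaired:", mirror[0] == original)
```

The application gets the correct data back and the damaged side of the mirror is fixed as a side effect of the read.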


What makes this system so screaming cool? It is simply the combination of all its features: starting at the hardware with the SAS protocol, the incredibly large caches and the integration of SSD technology, going on to the soft features like real-time analysis, end-to-end data integrity and FMA (Fault Management Architecture), and finally its foundation on open source technology (OpenSolaris, ZFS, and many other open source projects) that assures future innovation. Features like encryption, de-duplication and FC target mode are on their way. And you know what? You will get them all at no additional license cost! That is what I call investment protection.

If you don't consider these Unified Storage Systems for your next IT investment, you are simply ignoring the facts and may spend far too much money on a product with limited features.

Monday Aug 04, 2008

Sun / Avaloq Banking Platform

About Avaloq

The Avaloq Banking System is an innovative and integrated IT platform which embraces modern banking practices. It is an ideal solution for asset managers, plus private, retail and commercial banks, wanting to increase their business efficiency and intending to protect their competitive advantage and long-term profitability. Avaloq's modular and open architecture provides comprehensive functionality, covering a variety of banking products, and enables the optimisation and breakdown of the value chain. Its flexible design allows financial institutions to adapt swiftly to changing market conditions, including the ability to rapidly launch new products and implement new business models.

Avaloq has a very fast growing customer base and is one of the most innovative banking solutions available today!

Sun's Avaloq Infrastructure

The Sun infrastructure landscape perfectly suits the Avaloq Banking requirements. We have found that the M-Series servers are the ideal systems for a highly scalable Avaloq platform.

The M5000 System is the most frequently used Sun server for Avaloq implementations.

The major reason for that is its internal scalability and reliability. Whether it is for the project phase (implementation, integration, testing) or for production, the M5000 matches most of our customers' requirements.

With the latest announcement of the SPARC64 VII CPUs, the M-Series got a tremendous performance boost. The performance/power efficiency has been increased by 50% while the core density has been doubled! 32 SPARC64 VII cores at 2.4 GHz within one single server! With today's 4GB DIMMs the server scales up to 256GB RAM. Nearly unlimited I/O expandability delivers high-performance connectivity to storage and network. Up to two internal I/O Units can be configured for the M5000, and each I/O Unit delivers:
  • 4 x8 PCI-E Slots
  • 1 64-bit PCI-X Slot
  • 2 SAS Disk Bays
  • 2 Gigabit Ethernet Ports
If this is not enough, you can attach up to four external I/O Units, each of which has two houses. Each house can deliver 6 PCI-E slots or 6 PCI-X slots. In a fully expanded system you could, for example, have up to 32 PCI-E slots... Enough? I think so.

Now that we know which system is mostly used for Avaloq implementations, let's figure out why and what the sizing rules are!

Typical Sun Avaloq Sizings

The following figure shows a possible implementation scenario for a medium sized bank:


A typical Sun Avaloq implementation is divided into three different areas:
  1. The Production Servers and Integration Servers are preferably identical.
  2. The Project, Development and Test Server needs, in 99% of cases, more compute and memory power than the Production and Integration Environment. This is related to the number of databases being used.
The Avaloq Project Phase is the most demanding phase for a server. During this time multiple Avaloq instances are running, for example to verify the implementation or to test various features and modifications.

In most cases such Avaloq instances run in separate Solaris Zones, a built-in and free Solaris feature. In combination with JumpStart, a complete Avaloq instance can be brought online from scratch within a few minutes!

High Availability Production System

As banking applications are core systems for every bank, you don't want any interruptions at all. BUT, as systems, networks, storage and datacenters can fail even when they have the most complete RAS (Reliability, Availability and Serviceability) stack, it is wise to have a good fallback scenario.

The figure below illustrates a possible HA implementation scenario. It could certainly be expanded further.


Avaloq Sizing Challenge

Depending on what kind of bank you are, the size of your production system may vary. Why? There are two major types of banks on the market:
  • Retail Banks
    A retail bank traditionally has many more cashier transactions per day than a private bank, which causes a higher system load.

  • Private Banks
    Traditionally a private bank doesn't have as many cashier transactions as a retail bank, but it has many more STEX (Stock Exchange) transactions.
It is wise to work together with Avaloq and Sun to design the optimal sizing for your specific needs. Avaloq and Sun have done many sizings together, and this is a major reason why our customers are happy!

The performance bottleneck!

Like most database-oriented applications, Avaloq implementations are highly dependent on the underlying storage subsystem. It becomes the bottleneck if it is not capable of storing the transactions or creating complex reports fast enough.

The good news is that Sun has the right answer in case you have to deal with this kind of problem. Sun not only delivers the fastest midrange and high-end storage subsystems, we also have the right answer when it comes to Solid State Disks (SSDs).

One of the first steps to increase performance can be to place the Oracle redo log on Solid State Disks. A very dense Solid State Disk could even hold the complete Avaloq implementation. The disk subsystem will no longer be the bottleneck!

Keep in mind that in terms of performance increase, the NAND based solid state disk market is today growing faster than the CPU market. You can find more about the SSD technology here.

Sunday Jul 20, 2008

Open Storage - The (R)Evolution

Why Pay more for Less?

Do you pay incredibly high license and maintenance fees for your Network Attached Storage? Are you locked into a vendor with proprietary operating systems and protocols? Do you ask yourself why you should pay just for using NFS, CIFS or NDMP, which have been standards for years?

You might answer all of the above questions with a big and bold YES. If this is the case, then keep reading this blog entry and you will see that there is another WAY or PERSPECTIVE for going into the next decade of Open, Reliable and Fairly Priced Storage Solutions!

You will recognize that there is only one vendor that fulfills all of the following criteria:
  • Open Source Software and Operating System Stack
  • No proprietary hardware and drivers
  • 128-bit Transaction Oriented File System
  • Usage of fairly priced SAS (Serial Attached SCSI) Connectivity
  • Hybrid Storage Concept
  • Usage of Solid State Technology to increase performance
And this vendor is Sun Microsystems!

I am now finished with the marketing part. Let's see how Sun Microsystems can help you optimize your Storage and Data Services!

The Open Storage Concept

As a general term, open storage refers to storage systems built with an open architecture, in which customers can select the best hardware and software components to meet their requirements. For example, a customer who needs network file services can use an open storage filer built from a standard x86 server, disk drives, and OpenSolaris technology at a fraction of the cost of a proprietary NAS appliance.

Almost all modern disk arrays and NAS devices are closed systems. All the components of a closed system must come from that specific vendor. You are therefore locked into buying drives, controllers and proprietary software features at premium prices, and typically you cannot add your own drivers or software to improve the functionality of the product.

The Open Storage Software

OpenSolaris is the cornerstone of Sun Open Storage offerings and provides a solid foundation as an open storage platform. The origin of OpenSolaris technology, the Solaris OS, has been in continuous production since September 1991. OpenSolaris offers the most complete open source storage software stack in the industry. Below is a list of current and planned offerings:

At the storage protocol layer, OpenSolaris technology provides:
  • SCSI
  • iSCSI
  • iSNS
  • FC
  • FCoE
  • InfiniBand software
  • RDMA
  • OSD
  • SES
  • SAS
At the storage presentation layer, OpenSolaris technology offers:
  • Solaris ZFS
  • UFS
  • SVM
  • NFS
  • Parallel NFS
  • CIFS
  • MPxIO
  • Shared QFS
  • FUSE
At the storage application layer, OpenSolaris technology offers:
  • MySQL
  • Postgres
  • BerkeleyDB
  • AVS
  • SAM-FS
  • Amanda
  • Filebench

Solaris ZFS

One of the key cornerstones of Sun's open storage platform is the Solaris ZFS file system. Solaris ZFS can address 256 quadrillion zettabytes of storage and handle a maximum file size of 16 exabytes. Several storage services are included in ZFS:
  • Snapshots
  • Point-in-time copy
  • Volume management (no need for additional volume managers!)
  • Command line and GUI oriented file system management
  • Data integrity features based on copy-on-write and RAID
  • Hybrid Storage Model
Vendors of closed storage appliances typically charge customers extra software licensing fees for data management services such as administration, replication, and volume management. The Solaris OS with Solaris ZFS moves this functionality to the operating system, simplifying storage management and eliminating layers in the storage stack. In doing this, Solaris ZFS changes the economics of storage. A closed and expensive storage system can now be replaced by a storage server running Solaris ZFS, or a server running Solaris ZFS attached to JBOD.
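The capacity figures quoted for Solaris ZFS earlier in this section can be sanity-checked with a few lines of arithmetic. The sketch below is one plausible reading of those numbers, under the assumptions that the pool limit is the 2^128 bytes implied by a 128-bit file system, that the file limit is 2^64 bytes, and that "quadrillion", "zettabyte" and "exabyte" are meant as binary units (2^50, 2^70 and 2^60).

```python
ZIB = 2**70                  # zebibyte (binary "zettabyte")
EIB = 2**60                  # exbibyte (binary "exabyte")
BINARY_QUADRILLION = 2**50   # assumption: "quadrillion" read as 2**50

pool_limit_bytes = 2**128    # assumption: limit implied by 128-bit addressing
file_limit_bytes = 2**64     # assumption: limit implied by 64-bit file offsets

pool_limit_zib = pool_limit_bytes // ZIB
file_limit_eib = file_limit_bytes // EIB

print(pool_limit_zib // BINARY_QUADRILLION, "quadrillion ZiB per pool")
print(file_limit_eib, "EiB per file")
```

Under these assumptions the numbers line up exactly: 2^128 / 2^70 = 2^58 = 256 x 2^50, and 2^64 / 2^60 = 16.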

Solaris ZFS recently won InfoWorld’s 2008 Technology of the Year award for best file system. In the InfoWorld evaluation, the reviewer stated, “Soon after I started working with ZFS (Zettabyte File System), one thing became clear: The file system of the next 10 years will either be ZFS or something extremely similar.”

ZFS Hybrid Storage Model

The ZFS storage pools are extremely flexible in terms of placing data on the optimal storage devices. You can basically split a storage pool into three different sections:
  1. The High Performance Read & Write Cache Pool
    The high performance read & write cache pool combines the system's main memory and SSDs for read caching. As you can imagine, we are using SSDs (Solid State Disks), which have a big advantage compared to RAM and traditional disks. Unlike RAM, SSDs are NOT volatile, and they are much faster than traditional disks. Therefore you don't need to load the data into memory first to become fast! Traditionally, less than 10-20% of a file system is really used often or needs high performance. Imagine that exactly this part is stored on SSD technology. The result is a crazy fast file system ;-) You can read more about how ZFS technically does this in another blog entry soon.

  2. ZFS Intent Log Pool
    All file system related system calls are logged as transaction records by the ZIL (ZFS Intent Log). The transaction records contain sufficient information to replay them in the event of a system crash.

    ZFS operations are always part of a DMU (Data Management Unit) transaction. When a DMU transaction is opened, a ZIL transaction is opened as well. This ZIL transaction is associated with the DMU transaction, and in most cases it is discarded when the DMU transaction commits. These transactions accumulate in memory until an fsync or O_DSYNC write happens, in which case they are committed to stable storage. For committed DMU transactions, the ZIL transactions are discarded (from memory or stable storage).

    As you must have figured out by now, ZIL performance is critical for the performance of synchronous writes. A common application that issues synchronous writes is a database. This means that all of these writes run at the speed of the ZIL. The ZIL is already quite optimized, and ongoing efforts will optimize this code path even further. Using solid state disks for the log makes this screaming fast!

  3. High Capacity Pool
    The biggest advantage of traditional HDDs is their price per capacity and their density, which remain unbeaten for online storage to this day. By combining different technologies within a file system, you can now choose SATA technology for the high capacity pool without losing performance from the overall perspective. The ZFS pool manager automatically stripes across any number of high capacity HDDs. The ZFS I/O scheduler bundles disk I/O to optimize arm movement and sector allocation.
Again, I will post more details about ZFS and the Hybrid Storage Concept soon in another blog entry.
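For readers who wonder what a synchronous write looks like from the application side, here is a small sketch of the kind of call a database makes for its log. It uses only standard POSIX-style calls exposed by Python (`os.write` followed by `os.fsync`); the function name `durable_append` and the file name are made up for this example. On a ZFS file system, this fsync is the point at which the intent log has to reach stable storage.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append a record and block until the OS reports it is on stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record)
        os.fsync(fd)   # the synchronous commit point
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "redo.log")   # hypothetical log file
    durable_append(log_path, b"txn 1: debit account A\n")
    durable_append(log_path, b"txn 2: credit account B\n")
    with open(log_path, "rb") as f:
        log_contents = f.read()

print(log_contents.decode(), end="")
```

Each call returns only after the record has been flushed, so the latency of the log device directly limits how many such transactions per second the application can commit.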

Solaris DTrace

Solaris DTrace provides an advanced tracing framework and language that enables users to ask arbitrary diagnostic questions of the storage subsystem, such as “Which user is generating which I/O load?” and “Is the storage subsystem data block size optimized for the application that is using it?” These queries place minimal load on the system and can be used to resolve support issues and increase system efficiency with very little analytical effort.

Solaris FMA - Fault Management Architecture

Solaris Fault Management Architecture provides automatic monitoring and diagnosis of I/O subsystems and hardware faults and facilitates a simpler and more effective end-to-end experience for system administrators, reducing cost of ownership. This is achieved by isolating and disabling faulty components and then continuing the provision of service through reconfiguration of redundant paths to data, even before an administrator knows there is a problem. The Solaris OS’ reconfiguration agents are integrated with other Solaris OS features such as Solaris Zones and Solaris Resource Manager, which provide a consistent administrative experience and are transparent to applications.

Sun StorageTek Availability Suite

Sun StorageTek Availability Suite software delivers open-source remote-mirror-copy and point-in-time-copy applications as well as a collection of supporting software and utilities. The remote-mirror-copy and point-in-time-copy software enables volumes and/or their snapshots to be replicated between physically separated servers. Replicated volumes can be used for tape and disk backup, off-host data processing, disaster recovery solutions, content distribution, and other volume-based processing tasks.

Lustre File System

Lustre is Sun’s open-source shared disk file system that is generally used for large-scale cluster computing. The Lustre file system is currently used in 15 percent of the top 500 supercomputers in the world, and six of the top 10 supercomputers. Lustre currently supports tens of thousands of nodes, petabytes of data, and billions of files. Development is underway to support one million nodes and trillions of files.


Today’s digital data, Internet applications, and emerging IT markets require new storage architectures that are more open and flexible, and that offer better IT economics. Open storage leverages industry-standard components and open software to build highly scalable, reliable, and affordable enterprise storage systems.

Open storage architectures are already competing with traditional storage architectures in the IT market, especially in Web 2.0 deployments and increasingly in other, more traditional storage markets. Open storage architectures won’t completely replace closed architectures in the near term, but the storage architecture mix in IT datacenters will definitely change over time.

We estimate that open storage architectures will make up just under 12 percent of the market by 2011, fueled by the industry’s need for more scalable and economic storage.

Thursday Jul 10, 2008

Blade Computing - CPU Density vs Memory Density

While there are competitive products that have a better Core density within one Rack than our Constellation System, there is simply no blade solution that can compete with our blade memory density.

The question here really is: what is the best combination? This leads me back to the question, why blades? One of the main reasons why customers choose the blade technology is:
I am sure that you will agree with me that a combination of 8 cores together with max 16GB RAM isn't really the way to run a virtualized infrastructure, right?

So, let's look at the solutions and most innovative Sun Blades we have available at the moment!

Sun Blade T6320 UltraSPARC T2 Single Socket Server Module
  • 1x 8 Core UltraSPARC T2 CPU
  • 16x DIMM Slots = 64 GB Fully Buffered DDR2 Memory (with 4GB DIMMs)
Sun Blade X6220 AMD Opteron Dual Socket Server Module
  • 2x 4 Core AMD Opteron CPUs
  • 16x DIMM Slots = 64 GB ECC DDR2 Memory (with 4GB DIMMs)
Sun Blade X6250 Intel Xeon Dual Socket Server Module
  • 2x 4 Core Intel Xeon CPUs
  • 16x DIMM Slots = 64 GB FBDIMM Memory (with 4GB DIMMs)
Sun Blade X6450 Intel Xeon Quad Socket Server Module
  • 4x 4 Core Intel Xeon CPUs
  • 24x DIMM Slots = 96 GB DDR2 FBDIMM Memory (with 4GB DIMMs)
Please keep in mind that all of those Blades fit into the very same Sun Blade 6000 Chassis or the Sun Blade 6048 Constellation Chassis and share the same unique management interfaces.


To sum up the calculations:
One Constellation Rack in combination with the Sun Blade X6450 provides 768 Cores and 4.608 TB RAM. That is 6GB RAM per Core!

Do you need more RAM per core?
One Constellation Rack in combination with the Sun Blade X6250 or X6220 provides 384 Cores and 3.072 TB RAM. That is 8GB RAM per Core!

Do you need Sparc Technology?
One Constellation Rack in combination with the Sun Blade T6320 provides 384 Cores and 3.072 TB RAM. That is also 8GB RAM per Core! And as we are talking about CMT (Chip Multi-Threading), this works out to 1GB per thread. Also very interesting is the power consumption of this UltraSPARC T2 blade solution!
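The rack figures above can be reproduced with a few lines of Python. The per-blade core and memory numbers come from the blade list in this entry; the 48 blades per Constellation rack is an assumption derived from 768 cores divided by 16 cores per X6450, and the 8 threads per UltraSPARC T2 core is the CMT figure the 1GB-per-thread statement relies on.

```python
BLADES_PER_RACK = 48      # assumption: derived from 768 cores / 16 cores per X6450
T2_THREADS_PER_CORE = 8   # UltraSPARC T2: 8 hardware threads per core

# Per-blade figures taken from the blade list above (with 4GB DIMMs).
blades = {
    "X6450": {"cores": 16, "ram_gb": 96},
    "X6250": {"cores": 8, "ram_gb": 64},
    "X6220": {"cores": 8, "ram_gb": 64},
    "T6320": {"cores": 8, "ram_gb": 64},
}


def rack_totals(name: str):
    """Return (cores, ram_gb, ram_gb_per_core) for a rack full of one blade type."""
    spec = blades[name]
    cores = spec["cores"] * BLADES_PER_RACK
    ram_gb = spec["ram_gb"] * BLADES_PER_RACK
    return cores, ram_gb, ram_gb / cores


for name in blades:
    cores, ram_gb, per_core = rack_totals(name)
    print(f"{name}: {cores} cores, {ram_gb} GB RAM, {per_core:.0f} GB per core")

t2_cores, t2_ram_gb, _ = rack_totals("T6320")
gb_per_thread = t2_ram_gb / (t2_cores * T2_THREADS_PER_CORE)
print(f"T6320: {gb_per_thread:.0f} GB per thread")
```

Swapping in different DIMM sizes or blade counts only requires changing the table at the top.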

In summary, we can deliver an intelligent blade design with the right CPU/RAM combination, all within the single, standardized Sun Blade 6000 enclosure.

Wednesday Jun 25, 2008

Sun Vision TV

Sun Vision.TV is the German video platform of Sun Microsystems. Sun experts, customers and partners report on the latest systems, software and IT trends, and live from important events.

The 3 channels have something for everyone:
  1. "News mit Donatus Schmid"
    with the latest news and interviews.

  2. "Welcome to the Machine"
    for developers, sysadmins, IT architects etc. - guaranteed marketing-free zone.

  3. "Sun im Einsatz"
    introduces projects and showcases, presented by customers and partners.
In my view, this is quite simply a brilliant move! Keep it up, friends in Germany!

Thursday Jun 19, 2008

Sun Blade X6450 Module, 768 Cores within one Rack!

Yesterday we unleashed our powerful new Sun Blade X6450 server module at the International Supercomputing Conference (ISC) in Dresden, Germany. This new blade provides industry-leading memory capacity and high-performance Intel Xeon multicore processors, which make it ideal for server consolidation, virtualization applications and high performance computing (HPC).

At A Glance
  • Two or four dual-core or quad-core Intel Xeon Processors
  • Up to 16 Intel Xeon 7000 series (Tigerton) processor cores
  • 50% more memory than competing 4-socket blade servers: excels at virtualization and memory intensive applications
  • 24 DIMM slots per server module
  • Supported in the Sun Blade 6000, Sun Blade 6048 and Sun Constellation Systems
  • With Sun Blade 6048 up to 71% more compute power than any competing blade system
Now imagine: one Sun Blade 6000 chassis, 10U in size, holds 10 blades with four quad-core CPUs each! That is 160 cores within 10U of rack space! Incredible, isn't it?

In combination with a Sun Blade 6048 Constellation Chassis (Rack), you will get 768 Cores within one Rack!

You have probably noticed that there are no disks in the blade. This is correct. In a cluster configuration you normally boot the blades from the network. In case you would like to boot locally, the blade is equipped with an IDE CompactFlash module interface, or you can attach external SAS disks to each blade.

In this blog you can find interesting content about solutions and technologies that Sun is developing.

The blog is customer oriented and provides information for architects, chief technologists as well as engineers.

