Sunday Mar 21, 2010

Hadoop AvatarNode

Hadoop AvatarNode

HDFS clients are configured to access the AvatarNode via a Virtual IP Address (VIP)
When PrimaryAvatarNode is down,  the Standby AvatarNode takes the relay
The Standby AvatarNode ingests all committed transactions because it reopens the edits log and consumes all transactions until the end of the file
The Standby AvatarNode finishes ingestion of all transactions from the shared NFS filer and then leaves SafeMode
The VIP switches from Primary AvatarNode to Standby AvatarNode

In "blue" the AvatarNode before the failure
In "red" the AvatarNode after the failure
The servers are alternatively Primary or Standby AvatarNode.

See the publication from Dhruba Borthakur


Sunday Nov 01, 2009

MogileFS Architecture

MogileFS Architecture
MogileFS  is an open source distributed filesystem, flexible and high available on a network of commodity hardware.
MogileFS is an anagram for "OMG Files" and was created for
LiveJounal to handle the storage, replication and retrieval of the large amount of file uploads. MogileFS is a Danga's Interactive project. Six Apart has acquired Danga Interactive in 2006.

Who used MogileFS :
LiveJounal, Digg, Skyrock, Wikispaces, Friendster

Key Enablers

- A scalable, Fault tolerant, High performance distributed file system
- No Single Point of Failure
- Automatic file replication (3 replications recommanded)
- Better than RAID
- Flat NameSpace
- Share-Nothing
- No RAID required
- Local filesystem agnostic
- Tracker client transfert (mogilefsd) - Replication -- Deletion - Query - Reaper - Monitor
- Files are broken up and spread over the Storage Node (mogstored) HTTP and WebDAV server
- Database MySQL stores the MogileFS metadata (the namespace, and which files are where)
- Client Library : Ruby, Perl, Java, Python, PHP…

High Availability

- For increasing the high availability of the MogileFS it is possible to interconnect 2 database servers (active/passive) with Solaris Cluster
- 2 Trackers nodes for availability and one for the load balancing


- For the security of the MogileFS cluster you should encrypted the data for safeguarding all transactions on the web.

Proof Of Concept

- Create an architecture with three servers (tracker, database, storage node) and test the performance and the feasibility of MogileFS.
- For rapidly testing MogileFS you can create 3 Solaris Containers (tracker, database, storage node) on the same physical server.


- Interface your application with MogilesFS and implement the "Save as Cloud..." and  "
Open from Cloud...". functionalities.

Service and Support

- MogileFS support with

Architecture Overview

Sizing for HA Cluster

- Business Data Volume = Customer needs
- No RAID factor, No HBA port
- 2 CPU Quad-core / 32 GB RAM for all servers
- 2 System hard disks
- Number of replication blocks = 3
- Block size = 128 MB
- Raw Data Volume = Business Data Volume \* Nb of replication blocks
- Number of Database Servers = 2
- Number of Tracker Servers = 3 minimum
- Number of Storage Node Servers = Raw Data Volume / Server Capacity Storage

Key Links


Sunday Jul 19, 2009

Hadoop Architecture

Hadoop Architecture
Cloud computing is a convergence of High Performance Computing architectures, Web 2.0 data models, and Enterprise computing data scale.
Cloud Analytics should leverage Sun's compelling storage architecture.

Hadoop Distributed File System (HDFS)
is scalable with high availability and high performance. HDFS on servers with 3 cluster nodes minimum (1 Master Node and 2 Slaves Nodes). The blocks data are 64 MB (default) / 128 MB, every block is replicated  3 times (default). NameNode is the metadata of the file system. The files are divided and distributed on DataNodes.
MapReduce is a data processing software and is designed to store and stream extremely large datasets in batch, not intended for realtime querying and does not support ramdom access. JobTracker schedules and manages jobs, TaskTracker executes individual map() and reduce() tasks on each cluster node.
HBase is distributed storage system, column-oriented and multi-dimensional, This software is very interesting to manage very large structured data for the web semantic. HBase can manage billions of rows, millions of columns, thousands of versions and petabytes across thousands of servers. Realtime querying.
Hive is a system for managing and querying structured data built on top of Hadoop with SQL as data warehousing tool. No realtime querying

High Availability

- The NameNode is a single point of failure (SPOF), the transaction Log is stored in multiple directories and a directory is on the local file system or on a remote file system (NFS/CIFS).
- The secondary NameNode is the copies of FsImage and Transaction Log from NameNode to a temporary directory.
- For increasing the high availability of the Hadoop cluster it is possible to interconnect 2 master nodes (active/passive) servers with Solaris Cluster


- For the security of the Hadoop cluster you should encrypted the data for safeguarding all transactions on the web.

Proof Of Concept

- Create an architecture with minimum three nodes and test the performance and the feasibility of Hadoop.
- For rapidly testing Hadoop you can use the OpenSolaris Hadoop Live CD
- The OpenSolaris LiveHadoop setup install three virtual nodes Hadoop Cluster
        - Once OpenSolaris boots, two virtual servers are created using Zones
        - Zones are very lightweight, minimizing virtualization overheads and leaving more memory for your application
        - The "Global" zone hosts the NameNode and JobTracker, and two "Local" zones each host a DataNode and TaskTracker


- Interface your application with HDFS and implement the "Save as Cloud..." and  "Open from Cloud...". functionalities. Use the Hadoop Java API for your development.

Service and Support

- HDFS, MapReduce, HBase and Hive are Open Source software and supported on OpenSolaris.
- For the US countries it is possible to contact Cloudera for bringing big data to the enterprise with Hadoop.
- Who support Hadoop across the globe ?

Architecture Overview

Sizing for HA Cluster

- Business Data Volume = Customer needs
- No RAID factor, No HBA port
- 2 CPU Quad-core for all servers
- 2 System hard disks
- Number of replication blocks = 3
- Block size = 128 MB
- Temporary Space = 25% of the total hard disk
- Raw Data Volume = 1.25 \* (Business Data Volume \* Nb of replication blocks)
- Number of NameNode Servers = 2
- Number of DataNode Servers = Raw Data Volume / Server Capacity Storage
- NameNode RAM = 64 GB
- DataNode RAM = 32 GB mini

Key Links


Friday Jun 06, 2008

Cloud Computing

Cloud Computing Sun and Clouds

I think that the Cloud Computing concept exists for several years but the technologies are now available and mature to be implemented in datacenter.
Cloud computing is a real business opportunity for service providers and outsourcing companies. They will be able to manage many datacenters across the world in different countries with lower total cost of ownership. According to me, Cloud Computing is the result of 2 major technologies, the Grid Computing and the Virtualization on servers, storage, network and desktop. Imagine many datacenters distributed in the world and managed as a unique resource. It is now possible in the real life with the new technologies !
The major difficulty for the Cloud computing is the infrastructure scablabilty distributed in any geographic points. If an application has need of more resources unavailable in one datacenter, the Cloud Computing must run simultaneously the application process on a second datacenter and so on.

Sun Value Proposition

  • AMD, INTEL, CMT Processors blades in the same box
  • Multi OS : Linux, Solaris, Windows
  • High Performance Network Gigabit, 10G or Infiniband. Reduction of cabling with switch Magnum
  • Sun Blade 6048 Modular System
  • Sun Datacenter Switch 3456
  • Sun StorageTek J4xxx
  • Sun Storage 7000 Unified Storage System
  • High Performance Storage (Lustre, pNFS, Sun Fire x4540 48TB, SAM-FS Archiving)
  • Sun Studio 12 (for free)
  • Sun Grid Engine (Open Source)
  • Sun HPC Cluster Tools (OpenMPI)
  • Hadoop : Distributed applications with high density of data
  • MogileFS: File System  with horizontal storage extension on unlimited number of machines
  • Dynamic System Domains, Solaris Containers, VMWare, Microsoft Virtual Server
  • Sun xVM Infrastructure with Sun xVM Server ( LDom, Xen) and Sun xVM Ops Center
  • Solaris Cluster and Geo Cluster Edition
  • Storage virtualization : Sun StorageTek 99xxV and Sun Virtual Tape Library, Solaris ZFS
  • Sun Virtual Desktop Infrastructure Software
  • VirtualBox (Client virtualization)

Sunday Jun 01, 2008

Follow the Sun

Follow The Sun The Helios-synchronous Dynamic Architecture

The best architecture for sales management of an multinational compagny. The system performance must be on top for user activity and data Integration. User activity and data integration are in day and night alternation for every time zone across the world. The system must follow the sun and be synchronized with users activity. Nevertheless the data integration activity being done the night, a reserve of power must be allocated at every time zone to guarantee the system availability and performance. It is thus necessary to design a dynamic system according to the days and nights alternation.

This solution is based on SAP BI Software and Solaris Containers.
  • Solaris 10 and Solaris Resource Manager
  • Resources guaranteed for any Time Zone
  • One Local Zone/Time Zone per AS Instances
  • Resources consolidation
  • Global Zone for DB/CI Instances
  • User activity the Day & Data Integration the Night
  • Resources in Day & Night Alternation
  • Dynamic integration for new country

Friday Feb 15, 2008

BI Architecture Design

BI Architecture Design The Best Architecture for Business Driving

If you want to size this type of architecture you must read the BI rules and definitions here

Sizing Methodology
The major parameters for sizing business intelligence technical architecture are : Concurrent queries launched by users (low, medium, high), Processor (type, frequency), Operating System (name, version), Tools Analysis and Databases (name, version), Data (raw data volume, usesable data volume), Data flow (size, frequency, timing, complexity, period) and aggregates building.

Architecture Design example
• Users activity in different time zone (ex: France, Japan, Brasil, Australia...)
Standardization : best practices ITIL v3, servers and storage consolidation
• Virtualization : servers virtualization (Solaris Containers cloned by country) and storage virtualization Sun StorageTek 9990V
• Dynamic infrastructure : more flexibilty for dynamic user and data integration. Resource management with Solaris Resource Manager (SRM) for data integration vs users activity
performance. Data replication for high availability.
• Gouvernance : Performance and cost management


Tuesday Feb 12, 2008

BI Architecture Definition

BI Architecture Definition Understanding Business Intelligence Rules and Definitions

If you want to understand the Business Intelligence and design the best architecture for the customers needs, then, you must know of them the rules and definitions. It is the best way of being able to dialogue more easily with the specialists.

What is Business Intelligence ?
Business intelligence (BI) is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions. BI applications include the activities of decision support systems, query and reporting, online analytical processing (OLAP), statistical analysis, forecasting, and data mining. Business intelligence applications can be: Mission-critical and integral to an enterprise's operations or occasional to meet a special requirement Enterprise-wide or local to one division, department, or project. Centrally initiated or driven by user demand.

Raw Data vs Usable Data
Raw data is the data source resulting from the operational systems (CRM, RH, BILLING, PURCHASES, SCM).
Usable data is the result of raw data and technical data according database organization, like indexes, aggregates, metadata, axis, indicators and data work. Usable data does not include Raid factor.

Data Structure

The database is structured in 3 levels: Staging Area is the storage area for data validation. Data Warehouse is the storage area for data details and metadata (ex. Oracle, DB2...) and Data Marts is the storage area for business data including axis, indicators and aggregates (ex: Oracle, DB2, Sybase, Essabse...)

Users Activity

Named users may reach the Business Intelligence system. Users perform concurrent access to BI system ressources. Low users perform reporting by means of requests sweeping around 1.000 records. Medium users perform navigation and analysis around 100.000 records. High users perform ad hoc navigation and analyze large volumes of data with several joints of tables or full facts table scan around 1 million records.

Extraction, Transformation, Loading

Data Integration is more or less complex according to the transformation topics that they perform.
Simple processing represents simple calculations, simple concatenations. Medium processing represents average calculations, medium concatenations. Heavy processing represents heavy calculations, statistical, complex algorithms and heavy concatenations


Software is classified according to several technologies topics: ETL Tools for extraction, transformation and data loading (PowerCenter, DataStage, AbItinio...). Relational database (RDMS) is an entitie/relation data structure (ex: Oracle, Sybase, SAS Base, DB2...). Multidimensional database (MDMS) is a matrix data structure stored on disk (ex: Essbase, Powerplay...). Reporting/Analysis tools (ex: Business Objects, Cognos, SAS, SAP/BI...)

Time Management

The Business Intelligence system is different from the operational system because it integrates the time factor.
Time management is very important in Business Intelligence: data retention duration (ex: 3 years), operational period (ex: Monday - Friday), operation frequency (ex: daily) and associated time frame (ex: 08:00 AM - 07:00 PM)


Business stakes are changing, the IT infrastructure must be increasingly reactive to significantly reduce Time To Market. Today, we have the technology and methodology addressing these new business challenges.


« June 2016