Sunday Mar 21, 2010

Hadoop AvatarNode

HDFS clients are configured to access the AvatarNode via a Virtual IP Address (VIP).
When the Primary AvatarNode goes down, the Standby AvatarNode takes over.
The Standby AvatarNode ingests all committed transactions: it reopens the edits log and consumes every transaction up to the end of the file.
Once it has ingested all transactions from the shared NFS filer, the Standby AvatarNode leaves SafeMode.
The VIP then switches from the Primary AvatarNode to the Standby AvatarNode.
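
The failover sequence above can be sketched as a small simulation. This is only an illustration of the ordering of the steps, not the AvatarNode implementation: the `AvatarNode` class, the `fail_over` function, the node names and the dictionary standing in for the VIP are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarNode:
    # Hypothetical stand-in for a NameNode instance; not the real Hadoop class.
    name: str
    role: str                      # "primary" or "standby"
    safe_mode: bool = True
    applied: list = field(default_factory=list)  # transactions already applied

    def ingest(self, edits_log):
        """Apply every committed transaction not yet consumed, up to the end of the log."""
        self.applied.extend(edits_log[len(self.applied):])


def fail_over(standby, edits_log, vip):
    """Promote the standby: ingest, leave SafeMode, then move the VIP."""
    standby.ingest(edits_log)      # 1. consume the shared edits log to the end
    standby.safe_mode = False      # 2. leave SafeMode only after ingestion completes
    standby.role = "primary"
    vip["target"] = standby.name   # 3. clients reach the new primary through the VIP
    return vip


# Edits log as it would sit on the shared NFS filer (illustrative content).
edits_log = ["mkdir /a", "create /a/f1", "close /a/f1"]
primary = AvatarNode("node1", "primary", safe_mode=False, applied=list(edits_log))
standby = AvatarNode("node2", "standby", applied=edits_log[:1])
vip = {"target": primary.name}

fail_over(standby, edits_log, vip)
```

The point of the ordering is that the VIP moves last, so clients never reach a node that is still replaying the edits log.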

In "blue" the AvatarNode before the failure
In "red" the AvatarNode after the failure
Each server acts alternately as Primary or Standby AvatarNode.

See the publication from Dhruba Borthakur


Sunday Nov 01, 2009

MogileFS Architecture

MogileFS is an open source distributed filesystem that is flexible and highly available on a network of commodity hardware.
MogileFS is an anagram of "OMG Files" and was created for
LiveJournal to handle the storage, replication and retrieval of a large volume of file uploads. MogileFS is a Danga Interactive project. Six Apart acquired Danga Interactive in 2005.

Who uses MogileFS:
LiveJournal, Digg, Skyrock, Wikispaces, Friendster

Key Enablers

- A scalable, fault-tolerant, high-performance distributed file system
- No Single Point of Failure
- Automatic file replication (3 replicas recommended)
- Better than RAID
- Flat namespace
- Shared-nothing
- No RAID required
- Local filesystem agnostic
- Tracker (mogilefsd): client transfer, replication, deletion, query, reaper, monitor
- Files are broken up and spread across the storage nodes (mogstored), which run an HTTP and WebDAV server
- A MySQL database stores the MogileFS metadata (the namespace, and which files are where)
- Client libraries: Ruby, Perl, Java, Python, PHP…

High Availability

- To increase the availability of MogileFS, two database servers can be interconnected (active/passive) with Solaris Cluster
- Two tracker nodes for availability and a third for load balancing


- To secure the MogileFS cluster, you should encrypt the data to safeguard all transactions on the web.

Proof Of Concept

- Create an architecture with three servers (tracker, database, storage node) and test the performance and feasibility of MogileFS.
- To test MogileFS quickly, you can create 3 Solaris Containers (tracker, database, storage node) on the same physical server.


- Interface your application with MogileFS and implement the "Save as Cloud..." and "Open from Cloud..." functionalities.

Service and Support

- MogileFS support with

Architecture Overview

Sizing for HA Cluster

- Business Data Volume = Customer needs
- No RAID factor, No HBA port
- 2 CPU Quad-core / 32 GB RAM for all servers
- 2 System hard disks
- Number of replication blocks = 3
- Block size = 128 MB
- Raw Data Volume = Business Data Volume * Nb of replication blocks
- Number of Database Servers = 2
- Number of Tracker Servers = 3 minimum
- Number of Storage Node Servers = Raw Data Volume / Server Capacity Storage
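
The sizing rules above amount to a short calculation. The sketch below applies them as written; the function name `size_cluster`, the terabyte units and the example figures (100 TB of business data, 48 TB of usable capacity per storage server) are assumptions chosen for illustration, and the node count is rounded up because a fraction of a server cannot be deployed.

```python
import math


def size_cluster(business_tb, server_capacity_tb, replicas=3):
    """Apply the sizing rules: raw volume = business volume * number of replicas."""
    raw_tb = business_tb * replicas
    # Round up: a partially filled server is still a whole server.
    storage_nodes = math.ceil(raw_tb / server_capacity_tb)
    return {
        "raw_tb": raw_tb,
        "storage_nodes": storage_nodes,
        "database_servers": 2,   # fixed by the sizing rules above
        "tracker_servers": 3,    # minimum given by the sizing rules above
    }


# Hypothetical customer: 100 TB of business data, 48 TB per storage server.
plan = size_cluster(business_tb=100, server_capacity_tb=48)
print(plan)
```

With these example figures the raw volume is 300 TB, which needs 7 storage node servers.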

Key Links


Saturday Feb 14, 2009


Open Storage: The Best Performance At The Best Price

Key Business Drivers
  • Manufacturing : Cost reduction - Eco responsibility
  • Telecommunications : Outsourcing - Cost reduction - Eco responsibility
  • Banking & Finance : Increase in banking and financial transactions, Inheritance optimization and management - Cost reduction
  • Government : Accomplish more work with fewer resources - Eco responsibility
  • Retail : Need to manage profitability and control expenses - Eco responsibility
  • Media & Entertainment : Ongoing technological innovation - Cost reduction - Eco responsibility
  • Healthcare : Accelerating employer-led initiatives - Cost reduction - Eco responsibility
  • Education & Research : Enable anytime, anywhere access - Creating a new form of collaborative education - Cost reduction - Eco responsibility
  • Transportation & Travel : Social responsibility - Technology, exposure to other cultures - Collecting and sharing experiences - Online business travel adoption - Green initiatives - Cost reduction
  • Energy : Energy education - Carbon emissions reduction - Power consumption reduction - Energy cost reduction
  • Pharmaceutical : Enhanced information dissemination - Cost reduction - Eco responsibility
  • IT Outsourcing : Resource consolidation - Green sourcing - Cost reduction
IT Drivers
  • Increase data volume and processing
  • Speed up deployment of new services
  • Green IT
  • Infrastructure consolidation
  • IT cost reduction
  • Open Source components
  • Data management simplification
  • OpenStorage Strategy : Freedom of use - Wider choice of hardware - More suppliers - Larger user community
  • Sun Storage 7000 Unified Storage Systems Appliance with SATA, SAS and SSD  technologies, JBOD Array, Opteron processors
  • OpenSolaris
  • ZFS Services : Snapshots, Encryption, Replication, Embedded Compression, RAID-Z, De-Duplication, Media Management, 1600 PB, Virtual Pools, Dynamic Striping, Simplified Administration
  • Data Protocols : NFS v3 and v4, CIFS, iSCSI, HTTP, WebDAV, FTP, NDMP v4
  • Data Services : Flash Hybrid Storage Pool, RAID-Z (5), RAID-Z DP (6), Mirroring, Striping, Active-active Clustering, Remote Replication, Antivirus via ICAP Protocol, Snapshots, Clones, Compression, Thin Provisioning, End-to-End Data Integrity, Multi-Path I/O, Fault Management
  • Management : DTrace Analytics, Dashboards, Role-Based Access Control, NIS, LDAP & AD, Alerts, Phone Home, SNMP, Scripting, Upgrade, Hardware View, Advanced Networking
Key Performance Indicators
  • Power consumption
  • Return On Investment
  • Total Cost of Ownership
  • Time to deploy a new service
  • Number of Open Source components
  • Number of contributors
  • Savings achieved by choosing Open Source
  • Service quality
Added Value Services
  • OpenStorage Workshop
  • Product Deployment Services
  • Sun Learning Services
  • Sun Managed Services
  • Sun Support Services
  • Sun Global Financial Services Operation
  • Objective : Cost effective network unified storage solution. Reduce administration. Reduce reliance on platform/OS knowledge
  • Solution : Sun Storage 7410 Cluster with 2 x J4400. On-board Flash Disk for increased data read performance. Managed Ops contract was uplifted to 3 year 24x7 gold support
  • Customer Benefit : Open Source approach. ZFS - today and future capabilities (Pooled storage). Price. User interface. SSD Integration, an easy and inexpensive expansion


Friday Jun 06, 2008

Cloud Computing

Cloud Computing: Sun and Clouds

I think the Cloud Computing concept has existed for several years, but the technologies are only now available and mature enough to be implemented in the datacenter.
Cloud Computing is a real business opportunity for service providers and outsourcing companies. They will be able to manage many datacenters in different countries across the world with a lower total cost of ownership. In my view, Cloud Computing is the result of 2 major technologies: Grid Computing and Virtualization of servers, storage, network and desktop. Imagine many datacenters distributed around the world and managed as a single resource. The new technologies now make this possible in real life!
The major difficulty for Cloud Computing is scaling an infrastructure that is distributed across many geographic locations. If an application needs more resources than are available in one datacenter, Cloud Computing must run the application process simultaneously on a second datacenter, and so on.
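
The spill-over behaviour described in the last paragraph can be illustrated with a simple greedy placement. This is a sketch under stated assumptions, not a description of any Sun product: the function `place`, the abstract "resource units" and the datacenter names are hypothetical, and a real scheduler would also have to account for latency, data locality and cost.

```python
def place(demand, datacenters):
    """Greedy spill-over: fill datacenters in order until the demand is met.

    `datacenters` maps a datacenter name to its free capacity (same unit as `demand`).
    """
    placement = {}
    remaining = demand
    for name, free in datacenters.items():
        if remaining <= 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement[name] = take
            remaining -= take
    if remaining > 0:
        raise RuntimeError(f"{remaining} units could not be placed")
    return placement


# Hypothetical capacities: the first datacenter cannot host the whole application.
result = place(120, {"paris": 80, "santa_clara": 100})
print(result)  # {'paris': 80, 'santa_clara': 40}
```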

Sun Value Proposition

  • AMD, Intel and CMT processor blades in the same box
  • Multi OS : Linux, Solaris, Windows
  • High Performance Network : Gigabit, 10G or InfiniBand. Reduced cabling with the Magnum switch
  • Sun Blade 6048 Modular System
  • Sun Datacenter Switch 3456
  • Sun StorageTek J4xxx
  • Sun Storage 7000 Unified Storage System
  • High Performance Storage (Lustre, pNFS, Sun Fire x4540 48TB, SAM-FS Archiving)
  • Sun Studio 12 (for free)
  • Sun Grid Engine (Open Source)
  • Sun HPC Cluster Tools (OpenMPI)
  • Hadoop : Distributed applications with high density of data
  • MogileFS : File system with horizontal storage extension across an unlimited number of machines
  • Dynamic System Domains, Solaris Containers, VMware, Microsoft Virtual Server
  • Sun xVM Infrastructure with Sun xVM Server (LDom, Xen) and Sun xVM Ops Center
  • Solaris Cluster and Geo Cluster Edition
  • Storage virtualization : Sun StorageTek 99xxV and Sun Virtual Tape Library, Solaris ZFS
  • Sun Virtual Desktop Infrastructure Software
  • VirtualBox (Client virtualization)

Monday Jan 28, 2008

IT Value Propositions

These are the unique Sun IT propositions that bring value to the company.

They are an important part of the business value proposition, as they show our core business:
- Our assets compared with competitors
- Our services capabilities
- A significant customer reference, including quantified customer benefits
- Functional and technical indicators to drive solution performance

The IT model is based on five axes:
  Scalability/Power : Horizontal/Vertical Scalability, Power (CPU, I/O...)
  ECO : Economy (Costs), Ecology (KVA, RoHS, WEEE...)
  Security : (Data, Access...)
  Availability : (Clustering, Components redundancy...)
  Flexibility : (Virtualization, Provisioning...)

A large part of Sun's IT value proposition is based on the fact that we master all the key elements of the IT value chain.
It does not mean we cannot address heterogeneous environments, but it creates the conditions to deliver strong IT solutions to our customers.
We know how to address a broader range of needs and when we answer a business problem from one of our customers,
we are in a position to consider all the aspects of it. This is a strong differentiator compared to some of our competitors who are specialized in one area.

We have defined the Sun IT value propositions, which can be seen as templates of the “Business/IT Alignment Approach” that are instantiated when we address a particular customer.
A given IT value proposition defines the typical key performance indicators that we use. It also describes the unique assets and services that Sun owns and that make the Sun proposition unique on the market. Finally, a real-life customer experience is presented.

Sun IT Value Propositions

  1. Industrialization and Best Practices : Products/Services, IT processes industrialization and best practices
  2. Standardized technical basis : Normalization and management of technical basis evolutions, architecture principles
  3. Optimization of computer rooms : physical room optimization, consolidation, cooling and electric security
  4. Provisioning : Environment analysis, monitoring and deployments automation
  5. Infrastructure Virtualization : Utilization ratio improvement, infrastructure flexibility
  6. Desktop Virtualization : Access to applications from everywhere in the world with complete security
  7. Web 2.0 : Technologies and Web use for next Internet generation
  8. Eco Datacenter : Economical and Ecological infrastructure for Datacenter
  9. Open Source : Freedom and software components choice
  10. Disaster Recovery Plan : Infrastructure for disaster recovery
  11. Infrastructure Business Application : Technical infrastructure for ERP, Business Intelligence, Data Warehousing
  12. High Performance Computing : Parallel computing grids
  13. Business Continuity : Availability and security infrastructure according  to Service Level Agreement
  14. Identity Management : Users identification and access management
  15. Security : Information access in full security
  16. Archiving : Data Management from its creation to its destruction. Data archiving
  17. Data Protection : Backup, restore, data replication
  18. Services Oriented Architecture : Systems interoperability, Web Services
  19. x86 : Servers and software with high performance at low cost
  20. CMT : Servers and software with high performance at low cost
  21. Cloud Computing : A Software Design and a Set Of Architectures (Grid Computing and Virtualization)

Saturday Jan 19, 2008

Eco Datacenter

Eco Datacenter: Ecology is good for the planet and good for business

OpenEco is a global on-line community that provides free, easy-to-use tools to help participants assess, track, and compare energy performance, share proven best practices to reduce greenhouse gas (GHG) emissions, and encourage sustainable innovation.

March 21, 2007
- Today is International Earth Day, a day celebrated each year around the world on the vernal equinox. It's also a good time to remind ourselves that even small changes in the way we conduct business can have a big impact on our environment.
At Sun, eco responsibility is about changing the way we approach business, IT, and the environment through sustainable computing. To do that, we innovate, act, and share.

Our Technology Assets

  • Sun's own experience at its Santa Clara datacenter (USA)

Tuesday Jan 15, 2008

IT Trends

The Sun solutions aligned with the Top Strategic Technologies for 2009

Cloud Computing
Servers - Beyond Blade Servers
Green IT
Web-Oriented Architectures
Enterprise Mashups
Specialized Systems
Social Software and Social Networking
Unified Communications
Business Intelligence

IT Trends according to Gartner



Business stakes are changing, and the IT infrastructure must be increasingly responsive to significantly reduce Time To Market. Today, we have the technology and the methodology to address these new business challenges.

