By Philippe Julio on Nov 01, 2009
MogileFS is an open source distributed filesystem that is flexible and highly available, running on a network of commodity hardware.
MogileFS is an anagram of "OMG Files" and was created for LiveJournal to handle the storage, replication and retrieval of a large volume of file uploads. MogileFS is a Danga Interactive project. Six Apart acquired Danga Interactive in 2005.
Who uses MogileFS: LiveJournal, Digg, Skyrock, Wikispaces, Friendster
- A scalable, fault-tolerant, high-performance distributed file system
- No Single Point of Failure
- Automatic file replication (3 replicas recommended)
- Better than RAID: files are replicated across devices on different hosts, so losing a whole server does not lose data
- Flat namespace
- No RAID required
- Local filesystem agnostic
- Tracker (mogilefsd): handles client communication and runs the Replication, Deletion, Query, Reaper and Monitor jobs
- Files are replicated and spread over the storage nodes (mogstored), each of which is an HTTP and WebDAV server
- A MySQL database stores the MogileFS metadata (the namespace, and which files are where)
- Client libraries: Ruby, Perl, Java, Python, PHP…
- To increase the availability of MogileFS, two database servers can be interconnected (active/passive) with Solaris Cluster
- Two tracker nodes for availability and a third for load balancing
- To secure the MogileFS cluster, you should encrypt the data to safeguard all transactions over the web.
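The split between tracker and storage node can be illustrated with a minimal Python sketch of the text exchange between a client and a tracker. It assumes the tracker speaks a line-based protocol in which a request is a command name followed by URL-encoded arguments, and a reply starts with `OK` or `ERR`; the helper names (`build_request`, `parse_response`, `extract_paths`) and the example addresses are hypothetical, and the exact wire format should be checked against the MogileFS protocol documentation before relying on it.

```python
from urllib.parse import urlencode, parse_qs


def build_request(command, **args):
    """Format one tracker request line: '<command> <urlencoded args>' + CRLF."""
    return f"{command} {urlencode(args)}\r\n"


def parse_response(line):
    """Split a tracker reply line into (ok, fields)."""
    status, _, rest = line.strip().partition(" ")
    if status != "OK":
        # Assumption: error replies carry free text rather than key=value pairs.
        return False, {"error": rest}
    # parse_qs returns lists of values; keep the first value of each field.
    fields = {name: values[0] for name, values in parse_qs(rest).items()}
    return True, fields


def extract_paths(fields):
    """Collect path1..pathN from a reply that announces 'paths=N'."""
    count = int(fields.get("paths", 0))
    return [fields[f"path{i}"] for i in range(1, count + 1)]
```

For example, `build_request("get_paths", domain="myapp", key="report.pdf")` produces the line `get_paths domain=myapp&key=report.pdf` followed by CRLF. The client would send that line to a tracker, pass the reply through `parse_response` and `extract_paths`, and then fetch the file over HTTP directly from one of the returned storage node URLs.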
Proof Of Concept
- Create an architecture with three servers (tracker, database, storage node) and test the performance and the feasibility of MogileFS.
- To test MogileFS quickly, you can create 3 Solaris Containers (tracker, database, storage node) on the same physical server.
- Interface your application with MogileFS and implement the "Save as Cloud..." and "Open from Cloud..." functions.
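As a rough illustration of the last step, the sketch below shows one way an application could wrap the two functions. It is not an official client: the class `CloudStore`, its method names and the in-memory `FakeCluster` are hypothetical, and the command names (`create_open`, `create_close`, `get_paths`) reflect my understanding of the tracker commands and should be verified against the client library you actually use. The network calls are injected as plain callables so the flow can be tried without a running cluster.

```python
class CloudStore:
    """Hypothetical wrapper exposing 'Save as Cloud...' and 'Open from Cloud...'."""

    def __init__(self, tracker, http_put, http_get, domain):
        self.tracker = tracker    # callable(command, args) -> dict of reply fields
        self.http_put = http_put  # callable(url, data)
        self.http_get = http_get  # callable(url) -> bytes
        self.domain = domain

    def save_as_cloud(self, key, data):
        # 1. Ask the tracker where the new file should be written.
        dest = self.tracker("create_open", {"domain": self.domain, "key": key})
        # 2. Upload the bytes straight to the storage node.
        self.http_put(dest["path"], data)
        # 3. Confirm the upload so the tracker records the file.
        self.tracker("create_close", {
            "domain": self.domain, "key": key, "fid": dest["fid"],
            "devid": dest["devid"], "path": dest["path"], "size": len(data),
        })

    def open_from_cloud(self, key):
        reply = self.tracker("get_paths", {"domain": self.domain, "key": key})
        last_error = None
        # Try each replica in turn until one storage node answers.
        for i in range(1, int(reply.get("paths", 0)) + 1):
            try:
                return self.http_get(reply[f"path{i}"])
            except OSError as exc:
                last_error = exc
        raise FileNotFoundError(key) from last_error


class FakeCluster:
    """In-memory stand-in for a tracker plus one storage node (testing only)."""

    def __init__(self):
        self.blobs = {}   # url -> bytes
        self.index = {}   # (domain, key) -> url
        self.next_fid = 1

    def tracker(self, command, args):
        if command == "create_open":
            fid, self.next_fid = self.next_fid, self.next_fid + 1
            return {"fid": str(fid), "devid": "1",
                    "path": f"http://storage.example/dev1/{fid}.fid"}
        if command == "create_close":
            self.index[(args["domain"], args["key"])] = args["path"]
            return {}
        if command == "get_paths":
            url = self.index.get((args["domain"], args["key"]))
            return {"paths": "1", "path1": url} if url else {"paths": "0"}
        raise ValueError(f"unsupported command: {command}")

    def http_put(self, url, data):
        self.blobs[url] = data

    def http_get(self, url):
        return self.blobs[url]
```

With the fake cluster, `CloudStore(cluster.tracker, cluster.http_put, cluster.http_get, "myapp")` can save a file and read it back; in the proof of concept, the three callables would be replaced by real tracker and HTTP calls.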
Service and Support
- MogileFS support is available from Six Apart: http://www.sixapart.com
Sizing for HA Cluster
- Business Data Volume = Customer needs
- No RAID factor, No HBA port
- 2 quad-core CPUs / 32 GB RAM for all servers
- 2 System hard disks
- Number of replication blocks = 3
- Block size = 128 MB
- Raw Data Volume = Business Data Volume \* Number of replication blocks
- Number of Database Servers = 2
- Number of Tracker Servers = 3 minimum
- Number of Storage Node Servers = Raw Data Volume / Server Storage Capacity (rounded up)
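The sizing rules above can be worked through with a small Python sketch. The function name `size_cluster` and the figures in the example (10 TB of business data, 12 TB of usable storage per server) are hypothetical; only the formulas and the fixed server counts come from the list.

```python
import math


def size_cluster(business_tb, server_capacity_tb, replicas=3):
    """Apply the sizing rules: raw volume, then storage nodes rounded up."""
    raw_tb = business_tb * replicas
    # A fraction of a server still requires a whole server, hence the ceiling.
    storage_nodes = math.ceil(raw_tb / server_capacity_tb)
    return {
        "raw_tb": raw_tb,
        "storage_nodes": storage_nodes,
        "database_servers": 2,   # fixed by the sizing list
        "tracker_servers": 3,    # stated minimum
    }


print(size_cluster(business_tb=10, server_capacity_tb=12))
# {'raw_tb': 30, 'storage_nodes': 3, 'database_servers': 2, 'tracker_servers': 3}
```

Here 10 TB of business data becomes 30 TB of raw data with 3 replicas, and 30 / 12 = 2.5 rounds up to 3 storage node servers.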