During the last IT Press Tour (#57), I had the opportunity to learn about a company that was new to me: MooseFS, presented by Jakub Ratajczak (CEO and Co-founder) and Piotr Konopelko (Senior Manager). The company is privately owned and based in Warsaw, Poland, EU.
MooseFS is a fault-tolerant, highly available, highly performing, scaling-out, network distributed file system. It spreads data over several physical commodity servers, which are visible to the user as one virtual disk.
Its architecture is quite different from “traditional” NAS or SAN storage.
It reminds me (in part) of the Hadoop architecture, with clients, master servers (for the metadata) and chunk servers (for the data).
Is it a NAS or a SAN solution? It is more similar to object storage, and in fact the network protocol used between the clients and the servers is a proprietary, optimized protocol.
The storage limits are also more similar to those of an object storage (or a big NAS storage):
- The maximum file size limit in MooseFS is 2^57 bytes = 128 PiB.
- The maximum filesystem size limit is 2^64 bytes = 16 EiB = 16 384 PiB.
- The maximum number of files that can be stored on one MooseFS instance is 2^31, over 2.1 billion.
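The limits above are easier to check as powers of two. The short Python sketch below only redoes the arithmetic from the list; the variable names are mine and are not taken from MooseFS.

```python
# Binary units: 1 PiB = 2^50 bytes, 1 EiB = 2^60 bytes.
PIB = 2**50
EIB = 2**60

max_file_size = 2**57   # bytes, maximum size of a single file
max_fs_size = 2**64     # bytes, maximum size of the whole filesystem
max_files = 2**31       # maximum number of files per instance

print(max_file_size // PIB)  # 128 (PiB)
print(max_fs_size // EIB)    # 16 (EiB)
print(max_fs_size // PIB)    # 16384 (PiB)
print(max_files)             # 2147483648, i.e. over 2.1 billion
```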
All I/O operations go first from the client to the Master Server, to look up which chunk servers will be used (this also applies to read operations). Then the data flows directly between the clients and the chunk servers (the data nodes). All replication or erasure coding operations happen between the chunk servers. The Master Servers keep all the metadata in memory to increase performance.
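To make this flow more concrete, here is a toy in-memory model in Python. It is only a sketch of the idea described above, not the MooseFS protocol or API: the class names, the method names, the round-robin placement, the tiny chunk size and the number of copies are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # toy value for illustration only, not the real MooseFS chunk size


@dataclass
class ChunkServer:
    name: str
    chunks: dict = field(default_factory=dict)  # chunk_id -> bytes

    def store(self, chunk_id, data, replicas=()):
        self.chunks[chunk_id] = data
        # Replication is passed along between chunk servers, not via the client.
        if replicas:
            replicas[0].store(chunk_id, data, replicas[1:])

    def fetch(self, chunk_id):
        return self.chunks[chunk_id]


class MasterServer:
    """Holds metadata only, in memory; it never sees file contents."""

    def __init__(self, chunk_servers, copies=2):
        self.chunk_servers = chunk_servers
        self.copies = copies
        self.metadata = {}  # path -> list of (chunk_id, [chunk servers])
        self.next_id = 0

    def allocate(self, path):
        # Hypothetical round-robin placement of the copies.
        count = len(self.chunk_servers)
        start = self.next_id % count
        servers = [self.chunk_servers[(start + i) % count] for i in range(self.copies)]
        chunk_id = self.next_id
        self.next_id += 1
        self.metadata.setdefault(path, []).append((chunk_id, servers))
        return chunk_id, servers

    def lookup(self, path):
        return self.metadata[path]


class Client:
    def __init__(self, master):
        self.master = master

    def write(self, path, data):
        for offset in range(0, len(data), CHUNK_SIZE):
            # Step 1: ask the master where the chunk should go (metadata only).
            chunk_id, servers = self.master.allocate(path)
            # Step 2: send the data directly to a chunk server.
            servers[0].store(chunk_id, data[offset:offset + CHUNK_SIZE], servers[1:])

    def read(self, path):
        # Reads also start with a lookup on the master, then go to the chunk servers.
        return b"".join(servers[0].fetch(cid) for cid, servers in self.master.lookup(path))


servers = [ChunkServer(f"cs{i}") for i in range(3)]
client = Client(MasterServer(servers))
client.write("/demo.txt", b"hello moosefs")
print(client.read("/demo.txt"))
```

In this model the master only hands out chunk locations, while the file contents travel between the client and the chunk servers, which is the separation described above.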
The product started in late 2005 as an internal distributed filesystem project (Gemius, which was the 1.0 version). Version 1.5 was released in 2008 and was the first public release of MooseFS.
It’s an open source project (available on GitHub), and there are a Free edition and a Pro edition.
Both editions have several features:
- Mission Critical
- High on Performance
- Scalable
- Hardware Independent
- Benefits for a Lifetime
- Big Data Support
- Minimal Investment
- Hardware Durability
- Manufacturer Support
- x86-64 components availability
- Safe Choice and Lifetime Usability
- High Data Availability
- Linux Client
MooseFS Pro also has:
- High Metadata Availability (only manual or via script in the free edition)
- Data Redundancy with Erasure Coding (4+n or 8+n with n up to 9)
- Native Windows Client (Release Candidate)
- Release version can be ahead of the Free edition
In September 2024, the Free edition finally reached version 4.x as well, which brings interesting new features, including a limited version of Erasure Coding (fixed to 8+1 only)!
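To get a feeling for what the erasure coding layouts mean in terms of capacity, here is a small Python sketch. It assumes a standard k+n scheme in which each chunk is split into k data parts plus n parity parts and any k of them are enough to rebuild the data. That is the usual property of such codes, but I have not verified the MooseFS implementation details, and the function name `ec_profile` is mine.

```python
def ec_profile(data_parts, parity_parts):
    """Capacity figures for a k+n erasure coding layout (assumed standard scheme)."""
    total = data_parts + parity_parts
    return {
        "overhead": total / data_parts,    # raw bytes stored per usable byte
        "efficiency": data_parts / total,  # usable fraction of raw capacity
        "tolerated_losses": parity_parts,  # parts that can be lost without data loss
    }


# 8+1 is the Free edition layout; 4+2 and 8+9 are examples within the Pro range.
for k, n in [(8, 1), (4, 2), (8, 9)]:
    profile = ec_profile(k, n)
    print(f"{k}+{n}: overhead x{profile['overhead']:.3f}, "
          f"tolerates {profile['tolerated_losses']} lost part(s)")
```

For comparison, keeping two full copies of every chunk costs x2.000 of raw capacity and tolerates one lost copy, while 8+1 tolerates one lost part at x1.125.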