Rozo Systems announced v2.0 of its RozoFS scale‐out NAS software, a solution that scales to hundreds of petabytes with a single global namespace, multi-head access, and multiple access protocols: NFS, CIFS and also object.

This French company (based in Nantes and San Francisco) grew out of an idea that started in late 2005 as lab research at the University of Nantes; the company was formally founded in 2010 as a spin-off of that research group. Version 1.0 of its product was released in 2013. It currently has more than 10 employees and more than 10 production deployments of its product.

During the recent IT Press Tour #17 I had the opportunity to learn more about this company and its solutions. Pierre Evenou (CEO at Rozo Systems) and Michel Courtoy (COO) were on stage to present their unique value to us.

The core of its patented technology is a unique erasure coding algorithm with really interesting performance (more than 3x faster than the Intel ISA-L implementation). It gives a protection level equivalent to keeping 5 copies with only 1.5x redundancy, while providing striping performance.
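To make the "5 copies with 1.5x redundancy" comparison concrete, here is a minimal arithmetic sketch. It assumes a k+m erasure code with k = 8 data parts and m = 4 redundancy parts; that layout is my own illustration and was not a figure given during the presentation. The function names are hypothetical as well.

```python
from fractions import Fraction


def erasure_overhead(k, m):
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return Fraction(k + m, k)


def replication_copies_for(failures_tolerated):
    """Copies plain replication needs to survive the same number of losses."""
    return failures_tolerated + 1


# Assumed layout for illustration only: 8 data parts + 4 redundancy parts.
k, m = 8, 4
overhead = erasure_overhead(k, m)
copies = replication_copies_for(m)

print(f"{k}+{m} erasure code: {float(overhead):.1f}x raw storage, survives {m} lost parts")
print(f"replication with the same tolerance: {copies} copies, {copies}x raw storage")
```

Under these assumptions the erasure-coded layout survives the loss of any 4 parts at 1.5x raw capacity, while replication would need 5 full copies for the same tolerance.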


The product is a true software-defined storage solution that can run on any standard x86 server running Linux, using a custom global and distributed filesystem (RozoFS).

It is also a scale‐out NAS that delivers high performance on all data sizes, a crucial requirement in many applications, including media and entertainment and high performance computing. According to the company, this performance comes from its erasure coding technology, which works on files of all sizes.

Use cases are much the same as for any (big, but not huge) file server solution, including research, science, education, and media and entertainment. The company states an optimal target size of around 10 PB, to provide real-time, high-performance storage.

One interesting use case, discussed during this meeting, was Cloud DVR:

In this case, their solution provides better results for customers, with more viewing flexibility (shifting of time and location, more devices), and can overcome the storage and tuner limitations of a local DVR. It also brings advantages for providers: reduced costs (STB, central storage, support calls and truck rolls) and new revenue models (updated and targeted commercials, upsell of new services, better experience).

The key new features of RozoFS version 2.0 include:

  • 128‐bit erasure coding: Boosts the encoding/decoding performance by leveraging the 128‐bit instructions of the x86 processors. The result is a throughput of 10 GB/s on 4 KB blocks.
  • Local auto‐repair: Supports the auto‐repair (‘self healing’) of one failed disk on the other disks of the same storage node, preserving the redundancy level for each file.
  • Quota per user and per group: Implements a familiar Linux‐like quota feature. Enforcement can be configured on the fly and accounting is always on.
  • Multi‐access modes: Supports three modes simultaneously: hierarchical access, the relative mode (parent/child) and the direct mode. The direct mode bypasses the metadata server for increased performance.
  • Data integrity and self‐healing: Protects each block on disk with a CRC32‐C checksum, so that corruption introduced anywhere between disk and application is detected. In case of data corruption, RozoFS repairs the faulty block on the fly.
  • Geo‐replication: Addresses the case where fewer than four sites are available. The geo‐replication is always asynchronous with a configurable replication rate. By default it operates in Active/Standby mode, with Active/Active also supported.
  • Metadata optimization: Improves metadata operations significantly by introducing time and space notions in the metadata structure. For example, RozoFS creates files at a rate of 38,000 files/sec per export. RozoFS also comes with popular Linux tools, such as “find”, that have been adapted to RozoFS’ metadata structures. This enables indexing rates that exceed one million i‐nodes per second; higher rates can be achieved when time is given as an input criterion.
  • Split of the data and metadata flows: Avoids slow‐downs of metadata operations (SSD) under heavy data (HDD) read/write loads.
  • 256K I/O support: Provides better performance for sequential access and limits the randomness at disk level.
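The per-block checksum idea from the data integrity item above can be sketched in a few lines. This is a minimal illustration of CRC32‐C (Castagnoli) detection only, not RozoFS code: the function names and the 4 KB block are assumptions of mine, and the repair step (which would rebuild the block from redundant data) is not shown.

```python
def _make_crc32c_table():
    """Build the 256-entry lookup table for CRC32-C (Castagnoli)."""
    poly = 0x82F63B78  # reflected form of the Castagnoli polynomial
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_crc32c_table()


def crc32c(data):
    """Return the CRC32-C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_is_intact(payload, stored_crc):
    """Hypothetical read-path check: recompute and compare the checksum."""
    return crc32c(payload) == stored_crc


# Write path (illustrative): keep the checksum next to the block.
block = bytes(range(256)) * 16          # one 4 KB block of sample data
stored_crc = crc32c(block)

# Read path: an untouched block passes, a single flipped bit is caught.
corrupted = bytearray(block)
corrupted[100] ^= 0x01
print(block_is_intact(block, stored_crc))             # True
print(block_is_intact(bytes(corrupted), stored_crc))  # False
```

A pure-Python table lookup like this is slow; a real storage stack would use a hardware-accelerated implementation, but the detection logic is the same.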

As in the previous version, there are two editions:

  • Community edition, available on GitHub under the GNU GPL v2 license
  • Advanced edition, with a fee-based license and some advanced features, such as the optimized erasure coding (EC)

Disclaimer: Condor Consulting Group invited me to this event and paid for accommodation and travel, but I am not compensated for my time and I’m not obliged to blog. Furthermore, the content is not reviewed, approved or published by anyone other than me.