Amplidata provides Optimized Object Storage systems for Big Data. It enables customers to deploy turnkey large-scale storage infrastructures that meet the highest reliability and availability requirements at the lowest possible cost. The company, founded in 2008, has its European headquarters in Lochristi, Belgium (with also some customers in Italy), and its US headquarters in Redwood City, CA.
The new CEO, Mike Wall, joined the Amplidata board on November 2, 2011, and will provide strategic insight and guidance to drive the company’s expansion in the market. The company has also recently brought in an additional $6M in funding.
Their solution is based on two different elements:
- Controller node, also called AmpliStor Controller
- Storage node
The AmpliStor Controllers are high-performance, standard Intel Xeon based servers (other CPUs, such as the Atom, were also tried, but without success due to the computation requirements) that are pre-packaged with Amplidata’s BitSpread software, MetaStore, and management framework. Controllers provide high-performance access over multiple 10 GbE network interfaces and can serve data over a variety of network protocols, including HTTP/REST object interfaces and language-specific interfaces such as Microsoft .Net, Python or C. Controllers are equipped with additional 10 GbE ports to interface with the back-end storage pool. Controllers operate in a highly available cluster, providing fully shared access to the storage pool, metadata caching on high-performance SSDs, and protection of metadata. Full specifications are provided in the AmpliStor data sheet.
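To give an idea of what the HTTP/REST object interface looks like in practice, here is a minimal sketch of building PUT and GET requests for an object store. The endpoint URL, port, and namespace path are purely illustrative assumptions, not taken from the AmpliStor documentation:

```python
# Sketch of accessing an object store over HTTP/REST.
# The base URL below is a hypothetical AmpliStor-style endpoint,
# used only for illustration.
import urllib.request

BASE = "http://controller:8080/namespace"  # assumed endpoint, not documented

def put_object(name: str, data: bytes) -> urllib.request.Request:
    """Build a PUT request that would upload `data` as object `name`."""
    req = urllib.request.Request(f"{BASE}/{name}", data=data, method="PUT")
    req.add_header("Content-Type", "application/octet-stream")
    return req

def get_object(name: str) -> urllib.request.Request:
    """Build a GET request that would fetch object `name`."""
    return urllib.request.Request(f"{BASE}/{name}", method="GET")

req = put_object("backup-2011.tar", b"payload bytes")
print(req.get_method(), req.full_url)
```

The requests are only constructed here, not sent; in a real deployment they would be issued against one of the clustered controllers.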
The AS storage nodes are 1U rack units, with different capacities depending on the model (the AS20 has 20 TB and the AS30 has 30 TB). A good starting system needs at least 8 nodes, and from there the pool easily scales into the petabyte range… Of course, the main target of object storage is unstructured data, such as media & entertainment content. But object storage is also used by services like Google Drive, Amazon, … Some public cloud solutions (for example the backup and archiving service provided by McCloud) are based on Amplidata systems.
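The starting-capacity numbers above can be checked with quick back-of-the-envelope arithmetic (raw capacity only; usable capacity depends on the protection policy chosen):

```python
# Raw capacity of a minimal 8-node system, per storage-node model.
# Illustrative arithmetic only.
NODE_TB = {"AS20": 20, "AS30": 30}  # raw TB per storage node
MIN_NODES = 8

for model, tb in NODE_TB.items():
    print(f"{MIN_NODES} x {model}: {MIN_NODES * tb} TB raw")
```

Eight AS30 nodes already give 240 TB raw, and adding nodes scales the pool toward petabytes.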
Scaling is possible by adding more storage nodes, which increases capacity, or by scaling out with more controller nodes, which improves throughput and performance.
It is interesting how they handle this kind of big data without RAID, which does not scale well. They use a technology called erasure coding, a form of forward error correction that has been used in a variety of technical applications for years. The basic idea is that data is broken into multiple packets (with a bit of additional information), sent to a receiver, and then reassembled on the receiving side. The key is that the receiver can reassemble the data even if some of the packets are lost in transmission (that is, the receiver has only a subset of the original packets). This made erasure codes a perfect fit for deep-space transmissions. With this approach the rebuild process takes only a few hours, instead of the days required by RAID solutions (on big disks).
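The idea can be sketched with the simplest possible erasure code: single XOR parity. The data is split into equal-sized chunks plus one parity chunk, and any one lost chunk can be rebuilt from the survivors. This toy tolerates only one loss; real systems such as Amplidata's BitSpread use stronger codes that survive several simultaneous disk or node failures:

```python
# Toy erasure code: k data chunks + 1 XOR parity chunk.
# Any single missing chunk equals the XOR of all the others.
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(chunks: list[bytes]) -> bytes:
    """Compute the parity chunk as the XOR of all data chunks."""
    return reduce(xor, chunks)

def rebuild(survivors: list[bytes]) -> bytes:
    """Rebuild the one missing chunk from the remaining
    chunks (data and parity): their XOR equals the lost one."""
    return reduce(xor, survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]   # equal-sized data chunks
parity = encode(data)

# Lose one data chunk; rebuild it from the other chunks + parity.
recovered = rebuild([data[0], data[2], parity])
assert recovered == data[1]
```

The same principle, with more parity chunks spread across many disks and nodes, is what lets rebuilds read only surviving chunks instead of re-reading entire large disks as RAID does.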
For more information see also: