MAID (Massive Array of Idle Disks) is a new storage technology for long-term, online storage of persistent data. It takes advantage of a newer generation of SATA (Serial Advanced Technology Attachment) disk drives that are designed to be powered on and off repeatedly, which improves energy efficiency.
This allows denser packaging of drives in large-scale disk storage systems, since only a fraction of the disks (typically cited as no more than 25%) is spinning at any one time.
MAID provides power management of disk drives, thus creating a new service level for retaining and accessing “archive/persistent” data.
MAID is an alternative both to traditional power-hungry disk-based storage and to inexpensive but slow tape backup options. Like tape, MAID disk drives only power up and spin when the information on them is needed, which makes them well suited for backup/recovery, replication, and archiving applications.
One MAID implementation is the Copan array (Copan Systems Inc.). The Copan array treats the drives in the array much like a virtual tape library (VTL), in which only what is needed is actually powered. A Copan array can contain hundreds of terabytes of disks that share a power supply, controller, and cabinet. According to Copan co-founder Will Layton, MAID technology is able to bring the largest amount of data online, and does so in the most efficient way possible.
Other vendors with MAID-based storage products include Nexsan Technologies, Dell-EMC, Fujitsu, HDS, NEC…
MAID falls among the non-RAID solutions. Compared to RAID technology, MAID offers increased storage density and decreased cost, electrical power, and cooling requirements. However, these advantages come at the cost of much higher latency, significantly lower throughput, and decreased redundancy. Redundancy and parallelism should be managed at the application level, and the focus is on optimizing the cost of the solution (especially the operating costs).
MAID is designed for “Write Once, Read Occasionally” (WORO) use cases, in which increased storage density and decreased cost are traded for increased latency and decreased redundancy.
Below are some of the most commonly cited reasons why MAID may fail:
- Applications must be MAID-aware, as noted above; otherwise MAID will provide worse resiliency and poor performance.
- Frequently powering the disks down and up leads to disk reliability issues, which can in turn result in failed LUNs.
- Chances of disk failure are much higher when drives are spun up and down frequently… not all disks are designed to withstand many power cycles!
- Power savings from idle drives may actually be eaten up by the additional power required to spin the drives back up.
- $/GB never met the expectations set by the industry experts.
- Backup and archival environments using tape technology are still seen as long-term investments in the industry, and any change is likely to be unwelcome.
- There is a lack of software tools to identify archival data suitable for MAID, which means there is no intelligence to determine where data should reside, i.e., on spinning or idle disks.
- Lack of standards makes each MAID implementation different from the others.
- Moreover, there is still too much FUD (Fear, Uncertainty and Doubt) around MAID storage adoption.
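The power-savings concern in the list above can be made concrete with a break-even estimate: how long must a drive stay powered down before spinning it down actually saves energy? The following is a minimal sketch under a simplified model. The function name and all wattage and timing figures are illustrative assumptions, not measurements from any real drive.

```python
def break_even_idle_seconds(idle_watts, standby_watts, spinup_watts, spinup_seconds):
    """Shortest gap between accesses for which spinning down saves energy.

    Simplified model (an assumption of this sketch):
      - staying idle for T seconds costs idle_watts * T
      - spinning down costs standby_watts * (T - spinup_seconds)
        plus spinup_watts * spinup_seconds to wake the drive again
    Setting the two costs equal and solving for T gives the break-even gap.
    """
    if idle_watts <= standby_watts:
        raise ValueError("standby must draw less power than idle")
    return spinup_seconds * (spinup_watts - standby_watts) / (idle_watts - standby_watts)


# Hypothetical drive: 6 W idle, 1 W standby, 21 W for 10 s while spinning up.
print(break_even_idle_seconds(6.0, 1.0, 21.0, 10.0))  # prints 40.0
```

Under these assumed numbers, any gap shorter than 40 seconds makes spinning down a net loss, before even counting the wear caused by the extra power cycle.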
As noted, implementations can differ, but in every case you need a cache layer that is always online and holds the metadata and other frequently accessed data.
Usually the cache disks are SSDs, but not necessarily. The data disks, by contrast, are always the cheapest layer, typically HDDs that can be powered off when not needed.
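To illustrate how the always-online layer and the power-managed data disks interact, here is a toy model. It is only a sketch under stated assumptions: the class `MaidSketch`, its round-robin placement, its unbounded cache, and its least-recently-used spin-down policy are all hypothetical and do not describe any vendor's implementation.

```python
from collections import OrderedDict


class MaidSketch:
    """Toy MAID model: always-on metadata and cache in front of idle data disks."""

    def __init__(self, num_disks, max_spinning):
        if num_disks < 1 or max_spinning < 1:
            raise ValueError("need at least one disk and one spinning slot")
        self.disks = [{} for _ in range(num_disks)]  # data disks: block_id -> data
        self.location = {}             # always-online metadata: block_id -> disk index
        self.cache = {}                # always-online cache (unbounded in this sketch)
        self.spinning = OrderedDict()  # powered-on disks, least recently used first
        self.max_spinning = max_spinning
        self.spinups = 0               # number of power-on events so far

    def _spin_up(self, disk):
        if disk in self.spinning:
            self.spinning.move_to_end(disk)  # already on: just mark as recently used
            return
        if len(self.spinning) >= self.max_spinning:
            self.spinning.popitem(last=False)  # power down the least recently used disk
        self.spinning[disk] = True
        self.spinups += 1

    def write(self, block_id, data):
        # New blocks are placed round-robin (an arbitrary choice for this sketch).
        disk = self.location.setdefault(block_id, len(self.location) % len(self.disks))
        self._spin_up(disk)
        self.disks[disk][block_id] = data
        self.cache.pop(block_id, None)  # drop any stale cached copy

    def read(self, block_id):
        if block_id in self.cache:
            return self.cache[block_id]  # served without touching a data disk
        disk = self.location[block_id]   # metadata lookup needs no spin-up
        self._spin_up(disk)
        data = self.disks[disk][block_id]
        self.cache[block_id] = data
        return data


array = MaidSketch(num_disks=4, max_spinning=1)
array.write("a", b"x")
array.write("b", b"y")
array.read("a")  # cache miss: the disk holding "a" must be powered on again
array.read("a")  # cache hit: no additional spin-up
```

The point of the design is visible in the last two calls: the metadata lookup and repeated reads are served by the always-online layer, so a data disk is powered on only for a genuine cache miss.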