During the first day of the third edition of Virtualization Field Day (#VFD3), one of the companies that we met as delegates was Pure Storage, in their cool office in Mountain View. Pure Storage is a storage company and, of course, also a player in the virtualization world.
They build storage on both the software and the hardware side, and the hardware is built entirely by Pure Storage, except for the Lego version of their storage (which I suppose was the same one used in this contest). Their solutions fit the all-flash enterprise storage approach, where only flash technology is used on the storage side.
Or to use Pure Storage naming, their solutions are Forever Flash. What does it mean?
With the Forever Flash program you can upgrade your controllers every 3 years to align with the updates that have been engineered by Pure Storage (provided you renew the maintenance plan), and to top it off the rest of the storage in the chassis then has its support cycle re-aligned with the upgraded hardware. In practice, you always have up-to-date hardware on the controller side (consider that the storage industry revolves around a forced 3-4 year refresh cycle, whereas here the refresh is an included service).
And of course the controller swap (and also the firmware update of the controllers) can always be done online in a non-disruptive way: upgrade controllers between generations, add capacity, update software…all without downtime, performance loss, or planned outages. You won’t even have to let your application team know. Non-Disruptive everything!
The agenda of the meeting was really rich in content and information:
- Welcome to Pure Storage: an introduction to Pure Storage, their technology, company & culture. Matt Kixmoeller, VP of Marketing, shares the vision of Pure Storage and how the initial efforts to democratize Flash storage have enabled organizations to embark on a path towards a flash-fueled enterprise.
- Advancing Virtualization with All Flash Storage: a provocative discussion with Vaughn Stewart, Chief Evangelist, around the challenges of storage with virtualization highlighted by an unlikely root cause. Discussion continues with a review of innovation in the storage industry and dives into the 5 forms of data reduction technology available in the Pure Storage FlashArray.
- VFD3 Surprise: a 100% fun moment
- Everything You Always Wanted to Know About Flash: a technical conversation with Neil Vachhrajani, Software Engineer, one of the brightest minds at Pure Storage discussing NAND flash technology. Topics include SLC, MLC, eMLC, Flash Resiliency, Program and Erase Cycles, Garbage Collection, Over Provisioning, Write Amplification, Data Reduction and more. This is a fantastic session to better understand SSD.
- VMware on Pure Storage Demos with Joel McKelvey, Product Marketing Manager
Their storage mainly uses a scale-up first approach, with a dual-controller system and several expansion shelves. The reason for maximizing vertical scale before horizontal scale is that vertical scale is more cost-effective; horizontal scale becomes important when you need to expand performance beyond what vertical scale can deliver.
There are two different controller series (FA-300 and FA-400) that can be connected to different FlashArray Expansion Shelves (each 2U, using 6x 6 Gb/s SAS connections). Cluster/storage connectivity is provided by InfiniBand links in order to build an active/active, highly resilient architecture. Front-end connectivity (to the hosts) can be FC or iSCSI, depending on the PCI-e cards that you choose.
The software layer is provided by Purity, the “storage OS” (currently at version 3.0), which was built from scratch around the unique benefits and idiosyncrasies of flash. Purity’s core is FlashCare™, which virtualizes the underlying SSDs into a unified pool. On top of FlashCare run the Purity services that provide resiliency, deduplication and compression, and consistent performance of the FlashArray.
The software layer provides several services and features:
- Compression: about half of Pure’s data reduction value comes from compression, especially in database-centric use cases.
- Inline deduplication: Pure’s deduplication operates at a variable chunk size and detects duplicates down to 512 bytes in size. The smaller the chunk size, the more effective the deduplication.
- Thin provisioning: still at 512-byte chunks.
- Snapshots: Pure Storage offers ZeroSnap snapshot technology, zero-overhead snapshots which enable every snapshot to function like a full clone in terms of performance and usability, but consume zero space overhead and require zero planning.
- Space-Efficient RAID Protection: Pure uses a proprietary version of RAID-6 called RAID-3D. It protects against dual drive failures, acts at the data segment level vs. the drive level, is global across the whole array vs. being stuck in each brick, and has very low overhead.
- Predictable Performance: a huge portion of Pure’s technology goes into how to deliver predictable sub-millisecond performance from consumer-grade flash, while also optimizing flash usage and lifetime. There was also some mathematics behind this, and Little’s Law was mentioned (there is also interesting research on 3D in the storage field that may have been used and could explain the strange name of their RAID technology).
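To make the inline deduplication item in the list above more concrete, here is a minimal Python sketch of the general idea. It is only an illustration under stated assumptions, not Purity's actual algorithm: it uses fixed 512-byte chunks (Pure describes a variable chunk size that goes down to 512 bytes), SHA-256 fingerprints, and an in-memory dictionary as the chunk store. The names `dedup_store`, `rebuild` and `reduction_ratio` are hypothetical.

```python
import hashlib
from typing import Dict, List

# Assumption: fixed-size chunks at the 512-byte granularity mentioned above.
CHUNK_SIZE = 512


def dedup_store(data: bytes, store: Dict[bytes, bytes]) -> List[bytes]:
    """Split data into 512-byte chunks, keep each unique chunk once in
    `store`, and return the ordered fingerprints needed to rebuild it."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).digest()
        # Only the first occurrence of a chunk consumes space.
        store.setdefault(fingerprint, chunk)
        recipe.append(fingerprint)
    return recipe


def rebuild(recipe: List[bytes], store: Dict[bytes, bytes]) -> bytes:
    """Reassemble the original data from its fingerprint recipe."""
    return b"".join(store[fp] for fp in recipe)


def reduction_ratio(logical_bytes: int, store: Dict[bytes, bytes]) -> float:
    """Logical bytes written divided by physical bytes actually stored."""
    physical = sum(len(chunk) for chunk in store.values())
    return logical_bytes / physical if physical else 0.0


# Usage: 10 logical chunks, but only 2 distinct ones.
store: Dict[bytes, bytes] = {}
data = b"A" * CHUNK_SIZE * 8 + b"B" * CHUNK_SIZE * 2
recipe = dedup_store(data, store)
print(len(recipe), len(store), reduction_ratio(len(data), store))
```

A real array would also have to deal with metadata overhead, hash collisions and persistence, all of which this sketch ignores.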
As you can see in the previous diagram, the underlying storage layer is based on lower-cost MLC Solid State Drives to keep the price point low. We spent some time talking about SLC, eMLC, MLC and TLC flash technologies; the choice of MLC drives over SLC drives may give you a lower mean time to failure. However, this really depends on how the disks are used, and Pure Storage claims that they’ve only lost 5 drives in the time they’ve been operating, across all of their customers. Pure Storage attributes this to how their controllers handle writes: inline deduplication and compression are performed before anything is written to disk. The FlashArray has a 5-year warranty on all hardware components in the array, including the SSDs.
Can it really be cost saving? This is always my doubt when I see an all-flash solution, where the simple CapEx (and usually also the OpEx) parameters may not be enough. You have to think about the TCO and the ROI of the solution, and maybe check not only the per-VM cost, but also the per-IOPS cost.
The dedicated page on the Pure Storage site tries to explain the criteria that must be used, and this passage summarizes their approach:
At Pure Storage, we have a very simple target: we believe that all-flash storage can be significantly less expensive than traditional spinning-disk storage on a $/GB usable basis, typically 50% the cost of performance disk storage. And if you are willing to also account for space and power savings, as well as for the server consolidation benefits of faster storage, you are likely to find you can save another 50% by moving from spinning disk to all-flash. Even more compellingly, the above analysis assumes only a 5X data reduction, a target Pure Storage has exceeded on nearly all of the customer workloads we have tested to date.
Note that the Forever Flash approach must also be considered in the TCO of a storage solution, and this could become more interesting when you analyze the cost beyond the third year.
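The $/GB reasoning in the quote can be checked with simple arithmetic. The Python sketch below is only an illustration: the function name `effective_cost_per_gb` and both prices are hypothetical values chosen to make the numbers easy to follow, not figures from Pure Storage or any other vendor. The only input taken from the quote is the 5X data reduction.

```python
def effective_cost_per_gb(usable_cost_per_gb: float, data_reduction: float) -> float:
    """Cost per GB of data actually stored, once data reduction is applied."""
    if usable_cost_per_gb < 0:
        raise ValueError("usable_cost_per_gb must not be negative")
    if data_reduction <= 0:
        raise ValueError("data_reduction must be positive")
    return usable_cost_per_gb / data_reduction


# Hypothetical prices, for illustration only (NOT vendor figures).
FLASH_USABLE_COST = 10.0  # $/GB usable, before data reduction
DISK_USABLE_COST = 4.0    # $/GB usable, performance disk, no data reduction

flash = effective_cost_per_gb(FLASH_USABLE_COST, data_reduction=5.0)
disk = effective_cost_per_gb(DISK_USABLE_COST, data_reduction=1.0)
print(f"flash: ${flash:.2f}/GB, disk: ${disk:.2f}/GB, ratio: {flash / disk:.0%}")
```

With these made-up inputs, flash ends up at half the effective cost of disk, which is the shape of the claim in the quote; with your own prices and your own measured reduction ratio, the result can of course be very different.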
For more information see also:
- Pure Storage @#VFD3 (Tech Field Day 3 – Virtualization)
- Tech Field Day VFD3 – Pure Storage and the all-flash revolution
- Should You Consider Pure Storage as your Next Array?
- #VFD3 Day One – Pure Storage has 99 problems but a disk ain’t one
- The Customer Experience Is The Key To Success In Today’s Array-Based Storage Environment
Disclaimer: I’ve been invited to this event by Gestalt IT and they paid for accommodation and travel, but I’m not compensated for my time and I’m not obliged to blog. Furthermore, the content is not reviewed, approved or published by any other person than me.