Converged infrastructure. I love the name because it’s not something I saw in a marketing pamphlet; I heard it from another engineer and admin who used the term to describe another product that is very similar (yet quite different). It stuck with me because the name _actually relates to the products_. Converged infrastructure means a system, a datacenter platform, in a box: a platform that can be all you need (minus the networking, for now) to keep a company afloat.
The two big names this year are Scale Computing and Nutanix. While they may cross paths a bit in the middle, it seems to me they are targeting two different customer bases. Scale is bringing this technology to an audience that has been left out by VMware’s enterprise-friendly features (and pricing structure), allowing smaller companies to enter the virtualization and storage age, even VDI, without breaking the bank. Nutanix, on the other hand, is taking VMware as it is and enhancing the manageability, removing overhead, and putting it all into a little box with a lot of horsepower. This year at VMworld I had the chance to meet with Nutanix and get the skinny on what they’ve got. In the Tech Field Day briefing prior to the round-table recording, Dheeraj Pandey, Founder and CEO of Nutanix, filled us in on the company and the technology behind their product.
The system is a compute and storage platform collapsed into one box that clusters. It’s a series of storage-attached servers running hypervisors that scales out as needs grow and change. Single points of failure are eliminated, and HA and storage management are decoupled and reduced to simple policy creation. Mainly marketed and deployed as a VDI platform (but not limited to that), the product is described by Pandey as "low friction, high velocity VMs" that can take care of all needs for the infrastructure. Anything the customer might need, from VDI brokers to user space to desktop VMs and more, can be run easily within their boxes. This gives them multiple insertion points into a company: via the storage, desktop, or server teams. That could end up being the driving factor in their apparent run at mid-market domination.
Currently Nutanix only supports VMware, but Hyper-V and KVM are being worked on in the labs, with possible releases sooner rather than later. With Hyper-V making a move toward enterprise features and KVM gaining some popularity (and the low cost of running both), it makes perfect sense to be looking at them. Additionally, the data path is independent of the hypervisor, making it a fairly simple task to place any hypervisor on top (saying that from this side of the keyboard is easy). The integration comes via the control plane, and work on porting the platform to give customers a bigger choice is underway.
The starting point when ordering from Nutanix is 3 nodes, which gives redundancy and the ability to grow before needing to buy a new chassis. Each chassis holds up to 4 nodes (motherboard, processors, and RAM) plus the system storage. A full chassis is capable of up to 2TB of RAM, has 8 processor sockets, and has a 40Gb connection to the network (a single 10Gb interface per node). There are 24 drive bays available to be filled, but a fully populated system will lose 4 of those drives to the nodes, which use them as boot and system utility (logging, calculations, etc.) drives. The storage consists of a mix of SAS spindle drives, SSD, and Fusion-IO to allow for data tiering, metadata storage, and redundancy. There is no RAID controller (thus no hardware dependency) and all redundancy is managed through their proprietary software.
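To make the per-chassis numbers above easier to check, here is a minimal Python sketch of the arithmetic. The constants are the figures quoted in this post; the function name `chassis_summary`, the even split of sockets and RAM across nodes, the one-boot-drive-per-node rule for a partially filled chassis, and the 1TB = 1024GB conversion are my own assumptions for illustration, not anything Nutanix published.

```python
# Figures quoted in the post for one fully populated chassis.
NODES_PER_CHASSIS = 4
SOCKETS_PER_CHASSIS = 8
MAX_RAM_TB_PER_CHASSIS = 2
DRIVE_BAYS_PER_CHASSIS = 24
NIC_GBIT_PER_NODE = 10
# Assumption: the 4 "lost" drives in a full chassis mean one per node.
BOOT_DRIVES_PER_NODE = 1


def chassis_summary(nodes: int) -> dict:
    """Rough totals for a chassis holding `nodes` nodes (1 to 4)."""
    if not 1 <= nodes <= NODES_PER_CHASSIS:
        raise ValueError("a chassis holds between 1 and 4 nodes")
    return {
        # One 10Gb interface per node.
        "uplink_gbit": nodes * NIC_GBIT_PER_NODE,
        # Bays left for data after each node takes its boot/utility drive.
        "data_drive_bays": DRIVE_BAYS_PER_CHASSIS - nodes * BOOT_DRIVES_PER_NODE,
        # Assumption: sockets and RAM are split evenly across the nodes.
        "sockets": nodes * SOCKETS_PER_CHASSIS // NODES_PER_CHASSIS,
        "max_ram_gb": nodes * MAX_RAM_TB_PER_CHASSIS * 1024 // NODES_PER_CHASSIS,
    }


full = chassis_summary(4)     # a fully populated chassis
starter = chassis_summary(3)  # the 3-node starting configuration
print(full)
print(starter)
```

Under those assumptions, a full chassis comes out at 40Gb of uplink and 20 data bays, and the 3-node starter leaves one node slot free for growth.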
The goal of Nutanix is to provide 5TB of storage per node. When a new box is added and brought up on the network (L2 adjacency), the systems use Bonjour to auto-discover new neighbors and add them to existing clusters. The company policy of "throw another box at it" has worked to this point, giving more storage and compute as the customer requirements grow.
*Lab testing has successfully run 52 nodes in a single cluster (13 chassis); the only breaking point was the cost of spinning up that much hardware, and the system can theoretically scale beyond 1,000 nodes.
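The "throw another box at it" math is simple enough to sketch. The snippet below uses the 4-nodes-per-chassis and 5TB-per-node figures from this post; the function names are hypothetical, and the capacity is only the stated per-node goal, not a measured or usable number.

```python
import math

NODES_PER_CHASSIS = 4    # from the post
TARGET_TB_PER_NODE = 5   # Nutanix's stated goal, not a measured figure


def chassis_needed(nodes: int) -> int:
    """Chassis required to house `nodes` nodes (a partial chassis still counts)."""
    return math.ceil(nodes / NODES_PER_CHASSIS)


def target_capacity_tb(nodes: int) -> int:
    """Storage the cluster would offer if every node hit the 5TB goal."""
    return nodes * TARGET_TB_PER_NODE


# Starter config, the lab-tested cluster, and the theoretical 1,000-node mark.
for nodes in (3, 52, 1000):
    print(nodes, chassis_needed(nodes), target_capacity_tb(nodes))
```

The 52-node row lands on 13 chassis, matching the lab test described above.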
Within the storage system, it is possible to physically isolate and tier your data using the 3 different types of storage. Carving out storage pools from the disks, which can then be provisioned into containers for your VMDKs, allows granular control over which servers get access to which disks. Each host has a resident storage controller VM in a parallel datacenter architecture (it cannot be vMotioned because it is directly attached to the storage), and it serves out the disks as NFS or iSCSI to ESX. Deduplication and data compression are also handled by the storage controller VM.
Since the disks are directly attached to the storage controller VM, no data has to traverse the network in order to be served out; all traffic begins and ends in the vSwitch. There are heartbeat mechanisms that will move a guest VM to a peer storage controller (and node) should there be a failure or should it be unable to reach the storage controller for any other reason. Due to the architecture of their software, data mobility requirements do not exist, and Nutanix handles the movement of data and the waterfalling among the tiers of storage transparently to VMware and the administrators.
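Nutanix’s tiering logic is proprietary and I didn’t see the internals, but the general idea of "waterfalling" can be illustrated with a toy model. In the sketch below, recently touched blocks sit in the fastest tier and the least recently used block spills down to the next tier whenever a tier overflows. The class `TieredStore`, the tier ordering (Fusion-IO, then SSD, then SAS), the tiny capacities, and the LRU policy are all my assumptions for illustration, not a description of their software.

```python
from collections import OrderedDict


class TieredStore:
    """Toy LRU waterfall across storage tiers, fastest tier first."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity_in_blocks), ordered fastest to slowest.
        self.names = [name for name, _ in tiers]
        self.caps = [cap for _, cap in tiers]
        # Each tier keeps its blocks ordered from coldest to hottest.
        self.tiers = [OrderedDict() for _ in tiers]

    def tier_of(self, block):
        """Return the name of the tier holding `block`, or None."""
        for name, tier in zip(self.names, self.tiers):
            if block in tier:
                return name
        return None

    def access(self, block):
        """Touch a block: it becomes the hottest entry in the fastest tier."""
        for tier in self.tiers:
            tier.pop(block, None)
        self.tiers[0][block] = True
        self._waterfall()

    def _waterfall(self):
        # Push the coldest block of any over-full tier down one level.
        for i in range(len(self.tiers) - 1):
            while len(self.tiers[i]) > self.caps[i]:
                cold, _ = self.tiers[i].popitem(last=False)
                self.tiers[i + 1][cold] = True
        if len(self.tiers[-1]) > self.caps[-1]:
            raise RuntimeError("slowest tier is full")


store = TieredStore([("fusion-io", 2), ("ssd", 2), ("sas", 4)])
for block in "abcde":
    store.access(block)
print(store.tier_of("a"), store.tier_of("b"), store.tier_of("e"))

store.access("a")  # touching a cold block pulls it back to the fast tier
print(store.tier_of("a"), store.tier_of("d"), store.tier_of("b"))
```

The point of the model is only that neither the hypervisor nor the admin is involved in the moves; the store reshuffles itself on every access.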
In the event of a system failure, the guest VMs are vMotioned to a different node and the storage controller resident on the new node takes ownership of the VMDK files. During this transition, only the "hot data" will move, to speed up the process. Because VMware only sees one large storage pool, the backend functionality is handled almost entirely by Nutanix.
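As a rough mental model of that failover, here is a small Python sketch. It marks a node as failed, re-homes each of its guest VMs on the least loaded surviving node (whose resident controller would then own the VMDKs), and lists only the hot blocks as needing to move right away. Everything here, including the `Vm` and `Node` classes, the `fail_over` function, the least-loaded placement rule, and the hot/cold block sets, is a hypothetical illustration and not Nutanix or VMware code.

```python
from dataclasses import dataclass, field


@dataclass
class Vm:
    name: str
    hot_blocks: frozenset   # recently used data, moved during failover
    cold_blocks: frozenset  # left where it is in this sketch


@dataclass
class Node:
    name: str
    healthy: bool = True
    # VMs on this node; its resident storage controller owns their VMDKs.
    vms: dict = field(default_factory=dict)


def fail_over(nodes: dict, failed_name: str) -> list:
    """Re-home VMs from a failed node; return (vm, new node, hot blocks) tuples."""
    failed = nodes[failed_name]
    failed.healthy = False
    survivors = [n for n in nodes.values() if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy node left in the cluster")
    plan = []
    for vm_name in sorted(failed.vms):
        vm = failed.vms[vm_name]
        # Assumption: pick the survivor with the fewest VMs, ties broken by name.
        target = min(survivors, key=lambda n: (len(n.vms), n.name))
        target.vms[vm_name] = vm  # the new node's controller takes ownership
        plan.append((vm_name, target.name, sorted(vm.hot_blocks)))
    failed.vms.clear()
    return plan


nodes = {name: Node(name) for name in ("node-a", "node-b", "node-c")}
nodes["node-a"].vms["desktop-1"] = Vm("desktop-1", frozenset({1, 2}), frozenset({3, 4, 5}))
nodes["node-a"].vms["desktop-2"] = Vm("desktop-2", frozenset({9}), frozenset({7, 8}))
nodes["node-b"].vms["broker"] = Vm("broker", frozenset({11}), frozenset())

plan = fail_over(nodes, "node-a")
print(plan)
```

In this run `desktop-1` lands on the empty `node-c` and `desktop-2` on `node-b`, and only blocks 1, 2, and 9 are flagged for immediate movement.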
As stated above, we will see other hypervisors in the future, maybe even a software-only solution from Nutanix, but don’t count on it any time soon. They do have a fundamental belief that no storage controller should ever live in a bare metal box (which is exactly why theirs doesn’t). The issue is that a complicated, proprietary system as ingrained into the hardware as theirs is hard to certify and support on generic hardware.
This company has the ideas and talent to make a lot happen. We will be seeing a lot from them in the coming years, not to say that they aren’t strong now. The solution delivered by Nutanix is unique in that they haven’t cobbled together an offering from an unholy union between various vendors. Theirs is a highly engineered feat of near perfection.
This company decided early on to invest heavily in the people that make the product, incorporating in September of 2009 but not hiring any VPs until late in 2011. There’s no doubt they will continue growing at an astronomical rate and soon become a name synonymous with consolidation and convergence within the datacenter. I wondered at first why their booth at VMworld was standing room only, but it didn’t take long to put it all together.