vSphere

Is Nutanix the perfect fit for SMBs?

There’s a world below clouds and enterprise environments with thousands of VMs and hundreds or thousands of hosts. A world that consists of a maximum of three hosts. I’m working with quite a few customers that are using VMware vSphere Essentials Plus. Those environments typically consist of two or three hosts and something between 10 and 100 VMs. Just to mention it: I don’t have any VMware vSphere Essentials customers. I can’t see any benefit in buying that license. Most of these environments are designed for a lifetime of three to four years. After that time, I come back and replace them with new gear. I can’t remember any customer that upgraded his VMware vSphere Essentials Plus. Even if the demands on the IT infrastructure increase, the license stays the same. The hosts and storage get bigger, but the requirements stay the same: HA, vMotion, sometimes vSphere Replication, often (vSphere API for) Data Protection. Maybe this is a German thing and customers outside of Germany are growing faster and invest more in their IT.

Tiering? Caching? Why it's important to differentiate between them.

Some days ago I talked to a colleague from our sales team and we discussed different solutions for a customer. I will spare you the details, but we came across PernixData FVP, HP 3PAR Adaptive Optimization, HP 3PAR Adaptive Flash Cache and DataCore SANsymphony-V. And then the question of all questions came up: “What is the difference?”.

Simplify, then add Lightness

Let’s talk about tiering. To make it simple: tiering moves a block from one tier to another, depending on how often the block is accessed in a specific period of time. A tier is a class of storage with specific characteristics, for example ultra-fast flash, enterprise-grade SAS drives or even nearline drives. A characteristic can be the drive type, the RAID level used, or a combination of both. A 3-tier storage design can consist of only one drive type, organized in different RAID levels: tier 1 can be RAID 1 and tier 3 can be RAID 6, but all tiers use enterprise-grade 15k SAS drives. But you can also mix drive types and RAID levels, for example tier 1 with flash, tier 2 with 15k SAS in a RAID 5 and tier 3 with SAS-NL in a RAID 6. Each time a block is accessed, the block “heats up”. If it’s hot enough, it is moved one tier up. If it’s accessed less often, the block “cools down” and at a specific point it is moved one tier down. If a tier is full, colder blocks have to be moved down so that hotter blocks can move up. This is a bit simplified, but products like DataCore SANsymphony-V with Auto-Tiering or HP 3PAR Adaptive Optimization work this way.
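To illustrate the heat-based movement described above, here is a minimal sketch in Python. The tier names, thresholds and the per-interval rebalance are made up for illustration; real implementations like DataCore Auto-Tiering or 3PAR Adaptive Optimization use far more sophisticated statistics and schedules.

# Minimal sketch of heat-based auto-tiering. Tier names, thresholds and the
# rebalance interval are hypothetical and only illustrate the idea above.

TIERS = ["flash", "15k SAS RAID 5", "SAS-NL RAID 6"]   # tier 0 is the fastest
PROMOTE_AT = 10     # accesses per interval that make a block "hot"
DEMOTE_AT = 2       # accesses per interval below which a block "cools down"

class Block:
    def __init__(self, block_id, tier=2):
        self.block_id = block_id
        self.tier = tier        # start on the lowest tier
        self.heat = 0           # access counter for the current interval

    def access(self):
        self.heat += 1

def rebalance(blocks):
    """Run once per interval: promote hot blocks, demote cold ones, reset heat."""
    for b in blocks:
        if b.heat >= PROMOTE_AT and b.tier > 0:
            b.tier -= 1                      # move one tier up
        elif b.heat <= DEMOTE_AT and b.tier < len(TIERS) - 1:
            b.tier += 1                      # move one tier down
        b.heat = 0

# Example: block 1 is hit often, block 2 rarely
blocks = [Block(1), Block(2)]
for _ in range(15):
    blocks[0].access()
blocks[1].access()
rebalance(blocks)
for b in blocks:
    print(f"block {b.block_id} -> {TIERS[b.tier]}")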

What to consider when implementing HP 3PAR with iSCSI in VMware environments

Some days ago a colleague and I implemented a small 3-node VMware vSphere Essentials Plus cluster with an HP 3PAR StoreServ 7200c. Costs are always a sore point in SMB environments, so it should come as no surprise that we used iSCSI in this design. I had some doubts about using iSCSI with an HP 3PAR StoreServ, mostly because of the performance and the complexity. IMHO iSCSI is more complex to implement than Fibre Channel (FC). But in this case I had to deal with it.

My first impressions about PernixData FVP 2.5

On February 25, 2015, PernixData released the latest version of PernixData FVP. Even though it’s only a .5 release, FVP 2.5 adds some really cool features and improvements. New features are:

  • Distributed Fault Tolerant Memory-Z (DFTM-Z)
  • Intelligent I/O profiling
  • Role-based access control (RBAC), and
  • Network acceleration for NFS datastores

Distributed Fault Tolerant Memory-Z (DFTM-Z)

FVP 2.0 introduced support for server side memory as an acceleration resource. With this it was possible to use server side memory to accelerate VM I/O operations. Server side memory is faster than flash, but also more expensive. With FVP 2.5, support for adaptive memory compression was added. DFTM-Z provides a more efficient use of the expensive resource “server side memory”. Some of you may think “Oh no, compression! This will only cost performance!”. I don’t think that this is fair. ;) The PernixData engineers are focused on performance and I don’t think they lost this focus during the development of DFTM-Z. DFTM-Z is enabled on hosts that use at least 20 GB of memory for FVP. As the amount of memory used for FVP increases, the memory region used for compression also increases. So not the whole memory region used for acceleration is compressed, only a part of it. With 20 GB contributed to the FVP cluster, the compressed memory region is 4 GB. With more than 160 GB, the region is increased to 32 GB.
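To put those numbers into perspective, here is a tiny sketch. Only the two data points (20 GB → 4 GB, ≥ 160 GB → 32 GB) come from the text above; the linear 20 % scaling in between is my own assumption for illustration, not something PernixData documents.

# Rough sketch of the DFTM-Z compressed region sizing described above.
# The proportional growth between the two published data points is assumed.

def compressed_region_gb(contributed_gb):
    if contributed_gb < 20:
        return 0                            # DFTM-Z needs at least 20 GB per host
    return min(contributed_gb * 0.2, 32)    # assumed 20% growth, capped at 32 GB

for gb in (16, 20, 80, 160, 256):
    print(f"{gb:>3} GB contributed -> {compressed_region_gb(gb):.0f} GB compressed region")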

vCenter Server Appliance: Troubleshooting full database partition

A customer of mine twice within six months had a full database partition on a VMware vCenter Server Appliance. After the first outage, the customer increased the size of the partition which is mounted to /storage/db. Some months later, just a few days ago, the vCSA became unresponsive again, again because of a filled-up database partition. The customer increased the size of the database partition once more (~ 200 GB!!) and today I had time to take a look at this nasty vCSA.
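Before blindly growing the partition a third time, it is worth checking what actually consumes the space. A generic, read-only sketch (the /storage/db path is simply the mount point mentioned above; the actual cleanup depends on what you find):

# Quick diagnostic sketch: list the largest files below the vCSA database mount point.
# It only shows where the space goes; it is not the cleanup procedure itself.
import os

MOUNT = "/storage/db"   # database partition mentioned in the post
sizes = []

for root, _, files in os.walk(MOUNT):
    for name in files:
        path = os.path.join(root, name)
        try:
            sizes.append((os.path.getsize(path), path))
        except OSError:
            pass            # files may vanish or be unreadable while we scan

for size, path in sorted(sizes, reverse=True)[:20]:
    print(f"{size / 1024**2:10.1f} MB  {path}")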

How to shrink thin-provisioned disks

Disk space is scarce. I only have about 1 TB of SSD storage in my lab and I don’t like to waste too much of it. My hosts use NFS to connect to my Synology NAS, and even though I use the VAAI-NAS plugin, I use thin-provisioned disks only. Thin-provisioned disks tend to grow over time. If you copy a 1 GB file into a VM and delete this file immediately, you will find that the VMDK has grown by 1 GB. This is caused by the guest filesystem: it marks the blocks of deleted files as free, but it only deletes metadata, not the data itself. Later, those blocks are overwritten with new data, since they are marked as free. VMware ESXi doesn’t know that the guest has marked blocks as free, so ESXi can’t shrink the thin-provisioned VMDK.
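The usual first step is therefore to zero out the free space inside the guest, so the unused blocks can be told apart from real data. On Windows guests sdelete -z does this; on a Linux guest a few lines are enough. A minimal sketch, and be aware that it deliberately fills the filesystem for a moment:

# Zero out the free space inside the guest so the zeroed blocks can later be
# reclaimed from the ESXi side. The file name is arbitrary.
import os

ZERO_FILE = "/tmp/zerofill.bin"     # any path on the filesystem you want to clean
CHUNK = 1024 * 1024                 # write zeros in 1 MB chunks
zeros = b"\0" * CHUNK

try:
    with open(ZERO_FILE, "wb") as f:
        while True:
            f.write(zeros)          # fill free space with zeros
except OSError:
    pass                            # disk full: all free blocks are now zeroed
finally:
    if os.path.exists(ZERO_FILE):
        os.remove(ZERO_FILE)        # free the space again; the blocks stay zeroed
print("Free space zeroed; the VMDK can now be shrunk from the ESXi side.")

Once the free space is zeroed, the zeroed blocks can be reclaimed, for example by cloning or migrating the VMDK as a thin disk, or, on VMFS, by punching the zeroed blocks out with vmkfstools.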

VMware ESXi 5.5.0 U2 patches break Citrix NetScaler network connectivity

This is not a brand-new issue and it’s well discussed in the VMTN communities. After applying the ESXi 5.5.0 U2 patches from October 15, 2014, you may notice the following symptoms:

  • Some Citrix NetScaler VMs with e1000 vNICs lose network connectivity
  • You can’t access the VM console after applying the patches

VMware has released a couple of patches in October:

  • ESXi550-201410101-SG (esx-base)
  • ESXi550-201410401-BG (esx-base)
  • ESXi550-201410402-BG (misc-drivers)
  • ESXi550-201410403-BG (sata-ahci)
  • ESXi550-201410404-BG (xhci-xhci)
  • ESXi550-201410405-BG (tools-light)
  • ESXi550-201410406-BG (net-vmxnet3)

More specifically, it’s the patch ESXi550-201410401-BG that is causing the problem. It is reported that the patch ESXi510-201410401-BG also causes problems. VMware has published a KB article under the ID KB2092809. Citrix has also published a KB article under the ID CTX200278. The VMware KB2092809 includes a workaround. You have to add the line

Trouble due to changed vDS default security policy

A customer contacted me because he had trouble moving a VM between two clusters. The hosts in the source cluster used vNetwork Standard Switches (vSS), the hosts in the destination cluster a vNetwork Distributed Switch (vDS). Because of this, a host in the destination cluster had an additional vSS with the same port groups that were used in the source cluster. This configuration allowed the customer to do vMotion without shared storage between the two clusters. The setup worked fine, until the customer moved a specific VM to the new cluster and switched the port group of the VM from the vSS to the vDS: the VM lost its connection to the network. A switch back to the vSS restored network connectivity for the VM. While troubleshooting this issue I noticed that the port was blocked due to an L2 security violation.
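In a case like this it helps to compare the security policy of the affected distributed port group with the defaults of the standard switch the VM came from. A minimal, read-only pyVmomi sketch that prints the policy of every distributed port group; the vCenter host, user and password are placeholders:

# Read-only pyVmomi sketch: print the security policy of every distributed port
# group, so it can be compared with the vSS defaults. Connection details are
# placeholders and certificate checking is disabled for simplicity.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="********",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    for pg in view.view:
        cfg = pg.config.defaultPortConfig
        # securityPolicy is only present on VMware vDS port settings
        if isinstance(cfg, vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy):
            sec = cfg.securityPolicy
            print(f"{pg.name}: promiscuous={sec.allowPromiscuous.value} "
                  f"macChanges={sec.macChanges.value} "
                  f"forgedTransmits={sec.forgedTransmits.value}")
    view.DestroyView()
finally:
    Disconnect(si)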

VMware jumps on the fast-moving hyper-converged train

The whole story began with a tweet and a picture:

This picture, in combination with rumors about Project Mystic, motivated Christian Mohn to publish an interesting blog post. Today, two and a half months later, “Marvin”, or Project Mystic, got its final name: EVO:RAIL.

What is EVO:RAIL?

Firstly, we have to learn a new acronym: Hyper-Converged Infrastructure Appliance (HCIA). EVO:RAIL will be exactly this: an HCIA. IMHO EVO:RAIL is VMware’s attempt to jump on the fast-moving hyper-converged train. EVO:RAIL combines different VMware products (vSphere Enterprise Plus, vCenter Server, Virtual SAN and vCenter Log Insight) with EVO:RAIL deployment, configuration and management into a hyper-converged infrastructure appliance. Appliance? Yes, an appliance. A single stock keeping unit (SKU) including hardware, software and support. To be honest: VMware will not try to sell hardware. The hardware will be provided by partners (currently Dell, EMC, Fujitsu, Inspur, NetOne and SuperMicro).

Patch available: VMware vSphere 5.5 U1 NFS APD bug

In April 2014, a bug was discovered in vSphere 5.5 U1 that can lead to APD (All Paths Down) events with NFS datastores. iSCSI, FC and FCoE aren’t affected by this bug, but potentially every NFS installation running vSphere 5.5 U1 was at risk. This bug is described in KB2076392. Luckily, none of my customers ran into this bug, but this is mostly due to the fact that most of them use FC/FCoE or iSCSI. Until today, the only solution was to avoid the upgrade to U1 and stay on vSphere 5.5 GA (with some patches to fix the Heartbleed bug).