A Case Study: DR procedures need to be regularly tested

by Carlos Escapa. This past week ReliableDR was installed at two customers with very similar configurations but worlds apart geographically: one in California and one in South Africa. Both customers run active/active data centers with VNX storage replicated with RecoverPoint, and both were looking to automate DR testing.

It so happened that both customers installed ReliableDR on the same day. The one in South Africa was the first to report a problem: snapshots could not be mounted on ESXi. Some ten hours later, the one in California reported exactly the same problem.

The storage managers in both cases had the same reaction: to diagnose the problem, they tried to take snapshots manually and mount them. Those attempts failed too. A quick call to our support hotline revealed that their target (DR) arrays do not support VMware's VAAI hardware-assisted locking. In layman's terms, they could not mount a snapshot in DR. This was a considerable exposure in their DR plan and, of course, made it impossible to conduct a DR test.

The good news is that we advised them of a workaround that involves disabling hardware-accelerated locking in vSphere. But the lesson is that no matter how sophisticated and robust the infrastructure appears to be, if DR procedures are not tested, it is nearly impossible to know whether they will work when they are truly needed. A failover under duress is the worst possible moment to diagnose an infrastructure misconfiguration.
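For readers curious what that workaround can look like in practice, here is a minimal sketch. It assumes the setting involved is the ESXi advanced option `/VMFS3/HardwareAcceleratedLocking`, which VMware documents for turning hardware-assisted locking on or off; the post itself does not name the exact setting, and the host names are hypothetical. The script only builds and prints the `esxcli` command for each host so it can be reviewed before anyone runs it.

```python
import shlex

# Assumption: this is the advanced option behind the workaround described above.
ATS_OPTION = "/VMFS3/HardwareAcceleratedLocking"


def ats_command(enabled):
    """Build the esxcli argument list that sets hardware-assisted locking on (1) or off (0)."""
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "--int-value", "1" if enabled else "0",
        "--option", ATS_OPTION,
    ]


def render_plan(hosts, enabled=False):
    """Return one reviewable line per host; nothing is executed."""
    cmd = shlex.join(ats_command(enabled))
    return [f"{host}: {cmd}" for host in hosts]


if __name__ == "__main__":
    # Hypothetical DR-site hosts, used for illustration only.
    for line in render_plan(["esxi-dr-01.example.com", "esxi-dr-02.example.com"]):
        print(line)
```

Printing the plan instead of executing it is deliberate: a change like this affects every datastore on the host, so it is worth reviewing with the storage vendor and rehearsing in a DR test first.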
