List: toasters
Subject: RE: SMVI / VMWare Experiences...
From: Ken Williams <kwillia () smud ! org>
Date: 2009-09-14 16:51:21
Message-ID: 8A98218D44C9EF4D851C95427928DDAB04850F21 () snpexch01 ! corporate ! smud ! org
Sounds like whatever user-defined script you have is failing sometimes?
Or perhaps it's a VMWare tools issue.
We've been able to track our issue down to the Guest OS level (win2k3
specifically). Looks like it's an issue with VSS or LUN alignment.
I would recommend ensuring your LUNs are aligned (use the VMWare host
util kit, mbrscan / mbralign). There is detailed documentation on the
NOW.netapp.com site.
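As a rough illustration of what "aligned" means here, the check below is only the arithmetic, not a substitute for mbrscan / mbralign. It assumes 512-byte sectors and a 4 KiB block size on the filer; both values, and the function name, are assumptions made for this sketch. A partition counts as aligned when its starting byte offset is a whole multiple of the block size.

```python
SECTOR_BYTES = 512   # assumed logical sector size
BLOCK_BYTES = 4096   # assumed 4 KiB block size on the storage side


def is_aligned(start_sector, sector_bytes=SECTOR_BYTES, block_bytes=BLOCK_BYTES):
    """Return True if the partition's starting byte offset is a multiple of block_bytes."""
    return (start_sector * sector_bytes) % block_bytes == 0


if __name__ == "__main__":
    # 63 is the classic default start sector on older MBR layouts.
    for start in (63, 64, 2048):
        offset = start * SECTOR_BYTES
        state = "aligned" if is_aligned(start) else "misaligned"
        print(f"start sector {start} (byte offset {offset}): {state}")
```

A guest whose first partition starts at sector 63 would show up as misaligned under these assumptions, which is the usual case worth checking on older Windows 2003 builds.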
-----Original Message-----
From: Nick Silkey [mailto:nick@silkey.org]
Sent: Friday, September 11, 2009 7:15 PM
To: Ken Williams
Cc: toasters@mathworks.com
Subject: Re: SMVI / VMWare Experiences...
Ken --
We too are experiencing issues with SMVI 1.2 bombing out when attempting
to perform a VMware quiesce snap on _some_ RHEL5.3 32-bit VMs. A couple
of notables:
- These problematic VMs have a 100% success rate at taking VMware
quiesce snaps within vCenter, independent of SMVI.
- The problem is 100% reproducible during night, day, etc.
- We will deploy several VMs at a crack, all the same build. When the
next SMVI schedule hits, some fail while others succeed. Bizarre.
- Over time (we've been experiencing this issue for several weeks now),
the 'problem' VMs change. Example: VMs abc and xyz will fail for days;
without intervention, VM abc will stop failing while VM xyz continues to
fail ... even if they're part of the same deploy base template/kickstart.
- We are nowhere near our snap limit on the volumes.
- These problematic VMs only bomb when attempting a quiesce.
Non-quiesce SMVI snaps work like a champ.
Been working with NetApp and VMware for some time now. We're at ESX
3.5u4+ to a 3160-R5 @ 7.2.6.1P3 via NFS + vCenter 4.0 + synch
SnapMirror to another 3160-R5 @ 7.2.6.1P3. The only thing revealing is
SMVI + vCenter logs of "cannot create a quiesced snapshot because the
(user-supplied) custom prefreeze script in the virtual machine exited
with a nonzero return code".
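Since the log message points at the exit code of a pre-freeze script, one way to narrow it down is to run that script by hand inside an affected guest and record what it returns. The helper below is a minimal sketch for doing that; the function name is hypothetical, and the script path is whatever your guest actually uses (on Linux guests VMware Tools conventionally looks for /usr/sbin/pre-freeze-script, but treat that path as an assumption and confirm it for your Tools version).

```python
import subprocess
import sys


def run_and_report(cmd, timeout=60):
    """Run cmd and return (exit_code, combined stdout/stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: run_and_report.py /path/to/script [args...]")
    code, output = run_and_report(sys.argv[1:])
    print(f"exit code: {code}")
    print(output, end="")
    # Propagate the script's own exit code so callers can test it.
    sys.exit(code)
```

Running it repeatedly on a VM that is currently failing and on one that is currently succeeding may show whether the nonzero exit is intermittent or tied to a particular guest state.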
--
Nick
On Wed, Aug 26, 2009 at 5:32 PM, Ken Williams <kwillia@smud.org> wrote:
> I'm looking for some experiences people out there may have with SMVI
> with NetApp. We're currently experiencing major issues with SMVI
> snapshots failing. I've had open tickets with NetApp/VMWare/Microsoft
> for 3 months and still have yet to have a solution.
>
> My environment looks like this:
>
> 6 x HP DL380 G5 (32gb Ram) in an ESX Cluster
> Dual Emulex 10000 Cards in each host.
> Cisco MDS SAN
> Netapp FAS3070 Cluster ~9tb aggregate for VMWare.
> VMFS Datastores ~10-15 VMs per datastore. ~50gb per VM.
> ASIS Turned on
> Volume and LUN space reservation turned off
> OnTap 7.2.5.1
> Windows 2003 Guest OS.
>
> I can't see us reaching any limitation on the Filers or the SAN. Yet we
> have random VMs failing snapshots every night. Are other people seeing
> these issues? (I've gone through the gamut of troubleshooting, version
> management of ESX/VMWareTools/etc). Snapshots time out and fail at the
> VMWare/Guest level, not at the Netapp snapshot level.
>
> We want to have SMVI function with VSS enabled.
>
> Has anyone who had failing snapshots been able to resolve a similar issue?
> Or does anyone have SMVI working properly that we could use as a
> reference to compare configuration?
>
> __________________________________________________________
> Ken Williams
> Storage Administrator, Business Technology Operations
> Sacramento Municipal Utility District
> E-Mail: kwillia@smud.org
> Phone: (916) 732-6744
> Cell: (916) 240-4213