esx reservation error timeout Brainardsville New York


This will clear any SCSI-2 reservations on the device.

2012-04-30T02:29:28.773Z cpu18:1055548)<3>lpfc820 0000:04:00.1: 1:(0):0713 SCSI layer issued Device Reset (2, 22) reset status x2002 flush status x2002

2012-04-30T02:29:28.773Z cpu18:1055548)Resv: 618: Executed out-of-band

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:38:42.028Z cpu17:4113)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L18, reservation state on device naa.6005076801890072500000000000066f is unknown.

Command may fail 5.

Very APD-like in the way that the hosts were affected.

Act:EVAL

2012-03-28T17:12:56.481Z cpu11:4107)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.60060480000190100554533033364445" state in doubt; requested fast path state update…

2012-03-28T17:12:56.481Z cpu11:4107)ScsiDeviceIO: 2322: Cmd(0x4124003d4740) 0x16, CmdSN 0x23102 from world 0 to dev "naa.60060480000190100554533033364445" failed H:0x8 D:0x0 P:0x0 Possible

esx-esxhost4.local-2012-04-30-14.37/var/run/log/vmkernel.1:2012-04-29T14:40:21.634Z cpu17:4113)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T1:L15, reservation state on device naa.600507680189007250000000000007d7 is unknown.

Since this is UTC time, this translates to 9:38am to 9:40am, which is the time period in which the last SVC reboot occurred and then brought its paths back online (9:38am) to

Not only is this a command failure, but we specifically state that the reservation state of a LUN is "unknown".
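The UTC-to-local translation mentioned above (14:38Z to 9:38am) can be checked with a short script. This is only a sketch: the fixed UTC-5 offset is an assumption inferred from that translation, and the function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is UTC-5, inferred from the 14:38Z -> 9:38am
# translation in the text. Adjust for the real time zone of the site.
LOCAL_OFFSET = timezone(timedelta(hours=-5))


def vmkernel_utc_to_local(stamp: str) -> datetime:
    """Parse a vmkernel timestamp (e.g. 2012-04-29T14:38:42.028Z) and shift it."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return utc.replace(tzinfo=timezone.utc).astimezone(LOCAL_OFFSET)


if __name__ == "__main__":
    # Timestamps copied from the RESERVE failure messages in the logs.
    for stamp in ("2012-04-29T14:38:42.028Z", "2012-04-29T14:40:21.634Z"):
        local = vmkernel_utc_to_local(stamp)
        print(stamp, "->", local.strftime("%H:%M:%S"))
```

Converting the host timestamps this way makes it easier to line them up against array-side logs that are recorded in local time.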

This will typically only happen if there are severe problems trying to communicate with the device, whether the problem is fabric related or the array itself.

SVC is not showing any errors either.

After some time, esx_host1 was finally able to connect to the array and release the stuck SCSI reservation.

There are no subsequent messages showing that the ESX host has failed back to the original paths; however, the reason for this is that there were no preferred paths set by

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:38:42.029Z cpu17:4113)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L15, reservation state on device naa.600507680189007250000000000007d7 is unknown.

ESX host X's HBA goes into an internal fatal error state and no longer responds to or issues commands, particularly a SCSI release.

The VMware ESX logs show numerous reservation errors on the LUNs, and a few instances of VMs getting corrupted have occurred. Any tuning parameter in the QLogic HBAs that might help? Anyone out there

esx-esxhost4.local-2012-04-30-14.37/var/run/log/vmkernel.1:2012-04-29T14:40:21.634Z cpu17:4113)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T1:L1, reservation state on device naa.600507680189007250000000000006d7 is unknown.

For more information on NMP host statuses, see KB 1029039.

For example, 4Gb HBA firmware versions 2.10*, 2.5*, 2.7*, and 2.80* and 2Gb HBA firmware versions 1.8*, 1.90*, and 1.91* are outdated.

I found another VMware KB 1021187 that matched the symptoms of both hosts.

This behavior eventually leads to the following:

2012-04-29T14:38:17.866Z cpu17:134520)ScsiDeviceIO: 2305: Cmd(0x412440df52c0) 0x16, CmdSN 0x207323 to dev "naa.6005076801890072500000000000065e" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

2012-04-29T14:38:17.866Z cpu17:134520)NMP: nmp_PathDetermineFailure:2084: SCSI

No prior reservation exists on the device.

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:39:22.028Z cpu18:7351)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L14, reservation state on device naa.6005076801890072500000000000083b is unknown.

The above messages show us a host status of 1 (H:0x1), which translates to NO_CONNECT and is a valid failover condition.

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:38:02.029Z cpu21:4117)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L39, reservation state on device naa.6005076801890072500000000000080d is unknown.

Those messages are described in VMware KB 1029456, and the suggestion is to update the firmware:

There is no Sense Key or Additional Sense Code/ASC Qualifier information for this status as
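As a rough aid for reading the H:/D:/P: fields in these messages, the sketch below pulls the status values out of a vmkernel line and names the host status. Only the H:0x1 = NO_CONNECT mapping is stated in the text; the other names in the table are reproduced from memory of KB 1029039 and should be treated as an assumption to verify against the KB. The function and variable names are hypothetical.

```python
import re

# Only 0x1 -> NO_CONNECT comes from the text above. The remaining names are
# an assumption recalled from KB 1029039; confirm them against the KB.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x4: "BAD_TARGET",
    0x5: "ABORT",
    0x6: "PARITY",
    0x7: "ERROR",
    0x8: "RESET",
}

# The plugin field (P:) is optional because some throttled log lines omit it.
STATUS_RE = re.compile(
    r"H:0x([0-9a-fA-F]+) D:0x([0-9a-fA-F]+)(?: P:0x([0-9a-fA-F]+))?"
)


def decode_status(line):
    """Return host/device/plugin status from a log line, or None if absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    host = int(match.group(1), 16)
    plugin = match.group(3)
    return {
        "host": host,
        "host_name": HOST_STATUS.get(host, "UNKNOWN"),
        "device": int(match.group(2), 16),
        "plugin": int(plugin, 16) if plugin is not None else None,
    }


if __name__ == "__main__":
    sample = ('Cmd 0x2a (0x4124011a3a00) to dev '
              '"naa.60050768018900725000000000000810" on path '
              '"vmhba0:C0:T2:L42" Failed: H:0x1 D:0x0')
    print(decode_status(sample))
```

A lookup table keeps the decoding separate from the parsing, so the names can be corrected in one place once they are checked against the KB.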

esx-esxhost4.local-2012-04-30-14.37/var/run/log/vmkernel.1:2012-04-29T14:39:01.633Z cpu15:14989)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T1:L15, reservation state on device naa.600507680189007250000000000007d7 is unknown.

esx-esxhost2.local-2012-04-30-14.52/var/run/log/vmkernel.1:2012-04-29T14:38:17.866Z cpu17:134520)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba0:C0:T2:L17, reservation state on device naa.6005076801890072500000000000065e is unknown.

The issue was finally resolved by issuing a LUN reset to the affected LUNs:

2012-04-30T02:26:32.197Z cpu23:2689589)WARNING: NMP: nmpDeviceTaskMgmt:2210:Attempt to issue lun reset on device naa.6005076801890072500000000000083b.

This can be illustrated with the following entries from the IBM SVC logs:

Error Log Entry 960
Node Identifier : SVCN4
Object Type : node
Object ID :

Having the LUN reset on the storage array side helped bring hosts back for the most part.

This entry was posted in From the Trenches on May 10, 2012 by Rick Blythe.

This week Nathan Small (Twitter handle: vSphereStorage) takes us through the determination of root cause for a SCSI Reservation Conflict issue:

History of issue: Customer performed a firmware upgrade to their

1. Fabric issues where frames are dropped
2. Array or array controller is overloaded

esx-esxhost3.local-2012-04-30-09.49/var/run/log/vmkernel.log:2012-04-29T09:40:01.001Z cpu8:4104)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba0:C0:T0:L14, reservation state on device naa.6005076801890072500000000000083b is unknown.

There are other messages occurring in the logs that also indicate an array performance issue:

2012-03-26T23:00:30.950Z cpu8:4131)FS3Misc: 1440: Long VMFS3 rsv time on 'VMW02' (held for 1457

Stacy April 22, 2012 at 10:07 am

Experienced the SCSI "Reservation error: Timeout" issue yesterday on a cluster of ESX 4.1 hosts.

Looking at the switch logs and/or the array logs, you may find some additional clues. On another host, I saw the following:

Apr 27 10:15:02 esx_host2 vmkernel: 110:04:47:54.368 cpu0:1039)FS3: 4828: Reclaimed

Need to determine why these events are being seen and how we can improve performance.

The first step when troubleshooting this type of issue is to take a frame of reference that

Root cause analysis requested.

Sep 4, 2009 5:56 PM in response to: Dan
Re: Too many scsi reservation conflicts

You should reduce your SCSI queue and you will have fewer SCSI reservation errors the following

Nathan Small April 16, 2012 at 10:36 am

Hi Anthony, You may see these messages in your environment from time to time and all it really means is that it took

This issue can occur if the affected hosts are using Emulex 2Gb, 4Gb and 8Gb HBAs with old or outdated firmware.

Looking over esx_host1, I also saw a lot of these messages during the time of the issue:

Apr 27 10:25:33 esx_host1 vmkernel: 84:18:52:08.503 cpu0:1043)WARNING: SCSI: 2909: CheckUnitReady on vmhba1:0:149 returned Storage initiator error

The firmware update is applied, the controller is rebooted, and then brought back online. 16 minutes later we observe LUN resets for all paths/LUN on the controller that was brought down for

The issue was resolved by sending a LUN reset (vmkfstools -L lunreset /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxx) to the affected LUNs.

In fact, these Emulex device loss messages even give you the WWPN of the IBM SVC targets:

WWPN 50:05:07:68:01:30:59:fd
WWPN 50:05:07:68:01:10:59:fd

For our example, we will refer to LUN 42 when comparing

This will happen when a SCSI RESERVE command is sent, does not complete to the target due to timeout, and then the abort sent for the SCSI reserve doesn't complete either due

When we attempt to send an abort for that command and the abort does not complete either, we legitimately do not know what the reservation state of the LUN is at

Environment details:
IBM Bladecenter with LS20 blades, QLogic QLA2xxx 2Gb adapters
Brocade fiber switch module
Connected to IBM SAN Volume Controller (SVC) ver 4.2.1.5
Running SVC Global Mirror to another site
10 Nodes running ESX 3.5.0

To start, let's take a closer look at the LUNs reporting the issue in both environments:

ESX 4.1:
Error message:
Mar 28 17:12:58 vmkernel: 34:11:37:08.013 cpu13:4262)NMP: nmp_PathDetermineFailure: SCSI cmd RESERVE failed on path vmhba1:C0:T2:L1,

In this environment the PSP (Path Selection Policy) used is FIXED so we see failover events:

2012-04-29T13:18:19.745Z cpu10:6334)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124011a3a00) to dev "naa.60050768018900725000000000000810" on path "vmhba0:C0:T2:L42" Failed: H:0x1 D:0x0

esx-esxhost2.local-2012-04-30-14.52/var/run/log/vmkernel.1:2012-04-29T14:39:37.866Z cpu17:2154857)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba0:C0:T2:L1, reservation state on device naa.600507680189007250000000000006d7 is unknown.

The next step would be to collect performance information from the array side from this point forward so that when another event occurs the storage team or storage array vendor can

esx-esxhost4.local-2012-04-30-14.37/var/run/log/vmkernel.1:2012-04-29T14:39:41.633Z cpu5:4101)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T1:L14, reservation state on device naa.6005076801890072500000000000083b is unknown.

esx-esxhost2.local-2012-04-30-14.52/var/run/log/vmkernel.1:2012-04-29T14:38:18.866Z cpu23:2134223)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba0:C0:T2:L39, reservation state on device naa.6005076801890072500000000000080d is unknown.

We have the 'Long VMFS3' error on multiple hosts attached to multiple different arrays.

Status=SCSI reservation conflict

2012-04-29T14:39:53.082Z cpu17:4278)VMW_SATP_SVC: satp_svc_UpdatePath:213: Failed to update path "vmhba1:C0:T1:L2" state.

While the SCSI-2 Reserve was a symptom of the issue, this scenario as well as many other scenarios regarding SCSI Reservation Conflicts can be avoided by utilizing the hardware-assisted locking

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:39:22.028Z cpu10:550864)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L17, reservation state on device naa.6005076801890072500000000000065e is unknown.

Issue abort for the SCSI command 0x16 4.

This shows us that more than LUN 1, 2, and 14 reported the reservation state unknown errors, though LUNs 1, 2, and 14 were the only LUNs that had a SCSI-2

Issue SCSI command 0x16 (SCSI RESERVE) 2.

esx-esxhost2.local-2012-04-30-14.52/var/run/log/vmkernel.1:2012-04-29T14:38:58.866Z cpu23:1675016)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba0:C0:T2:L15, reservation state on device naa.600507680189007250000000000007d7 is unknown.
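To see at a glance which LUNs reported an unknown reservation state, the RESERVE failure messages can be grouped by LUN number. The following is a minimal sketch, assuming the message format shown in the log excerpts here; the function name and regular expression are illustrative, not part of any VMware tool.

```python
import re
from collections import defaultdict

# Matches the nmp_PathDetermineFailure message format seen in the excerpts.
RESERVE_RE = re.compile(
    r"SCSI cmd RESERVE failed on path (vmhba\d+):C(\d+):T(\d+):L(\d+), "
    r"reservation state on device (naa\.[0-9a-f]+) is unknown"
)


def luns_with_unknown_reservation(lines):
    """Map each LUN number to the set of device IDs that reported the error."""
    result = defaultdict(set)
    for line in lines:
        match = RESERVE_RE.search(line)
        if match:
            result[int(match.group(4))].add(match.group(5))
    return dict(result)


if __name__ == "__main__":
    # Message bodies copied from the vmkernel excerpts in this post.
    sample = [
        "NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path "
        "vmhba1:C0:T0:L18, reservation state on device "
        "naa.6005076801890072500000000000066f is unknown.",
        "NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path "
        "vmhba1:C0:T1:L15, reservation state on device "
        "naa.600507680189007250000000000007d7 is unknown.",
        "NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path "
        "vmhba0:C0:T2:L15, reservation state on device "
        "naa.600507680189007250000000000007d7 is unknown.",
    ]
    for lun, devices in sorted(luns_with_unknown_reservation(sample).items()):
        print(lun, sorted(devices))
```

Keying on the LUN number and collecting device IDs in a set means the same LUN seen from several hosts or paths is counted once per device.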

Consider reseating the HBAs on the server and trying different PCI slots. Some PCI-X slots operate at different bus speeds (for example, 100 MHz vs. 133 MHz). It is possible that the HBA is not able

esx-esxhost1.local-2012-04-30-14.41/var/run/log/vmkernel.1:2012-04-29T14:38:02.029Z cpu21:4117)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba1:C0:T0:L16, reservation state on device naa.6005076801890072500000000000065d is unknown.