Symptoms
- vMotion fails at 10%
- You see these errors in vCenter Server:
- A general system error occurred: Migration failed while copying data, Broken Pipe
- Migration failed while copying data. Connection reset by peer
- A general system error occurred: Failed to start migration pre-copy. Error 0xbad010d. The ESX hosts failed to connect over the VMotion network.
- You see this error in the /var/log/messages (ESXi host) or /var/log/vmkernel (ESX host) log file:
The ESX hosts failed to connect over the VMotion network
Module Migrate power on failed
- vmkping testing on the VMkernel network used for vMotion is successful
Note: For more information, see VMkernel network connectivity with the vmkping command (1003728).
- Disabling and re-enabling vMotion on the VMkernel port used for vMotion in vCenter does not resolve this issue
- In the /var/log/vmkernel log file of the source ESX host, you see this warning:
WARNING: MigrateNet: 210: 4225417790: 2-0x3fa0c8d0:Received only 0 of 68 bytes: Migration protocol error
- In the /var/log/vmkernel log file of the destination ESX host, you see these messages:
- WARNING: Migrate: 1153: 4225417790: Failed: I/O error (0xbad000a) @0x8d7c03
- ESX hosts failed to connect over the VMotion network (0xbad010d) @0x0
- Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.133 cpu6:1939)Migrate: vm 1940: 7338: Setting migration info ts = 1298397818667246, src ip = <192.168.103.48> dest ip = <192.168.103.7> Dest wid = 5272 using SHARED swap
Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.134 cpu6:1939)World: vm 3330: 900: Starting world migSendHelper-1940 with flags 1
Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.134 cpu6:1939)World: vm 4355: 900: Starting world migRecvHelper-1940 with flags 1
Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.136 cpu4:3330)WARNING: MigrateNet: 309: 1298397818667246: 5-0x801f640:Sent only 4020 of 4096 bytes of message data: Broken pipe
Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.136 cpu4:3330)WARNING: Migrate: 6776: 1298397818667246: Couldn't send data for 8: Broken pipe
Feb 22 14:14:04 esx1 vmkernel: 402:01:22:35.136 cpu4:3330)WARNING: Migrate: 1243: 1298397818667246: Failed: Broken pipe (0xbad0052) @0x9efd5f
- The /var/log/vmware/hostd.log (ESX) or /var/log/messages (ESXi) log file contains entries similar to:
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'vm:/vmfs/volumes/4c220e6f-01b124b3-f25d-e41f132dae86/twidmann_test/twidmann_test.vmx'] VMotionLastStatusCb: Failed with error 536871181: Failed to start migration pre-copy. Error 0xba
Apr 8 11:24:44 d010d. The ESX hosts failed to connect over the VMotion network.
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'vm:/vmfs/volumes/4c220e6f-01b124b3-f25d-e41f132dae86/twidmann_test/twidmann_test.vmx'] VMotionResolveCheck: Operation in progress
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'vm:/vmfs/volumes/4c220e6f-01b124b3-f25d-e41f132dae86/twidmann_test/twidmann_test.vmx'] VMotionStatusCb: Completed
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'vm:/vmfs/volumes/4c220e6f-01b124b3-f25d-e41f132dae86/twidmann_test/twidmann_test.vmx'] VMotionResolveCheck: Firing ResolveCb
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 info 'VMotionSrc (1302261872027749)'] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 info 'VMotionSrc (1302261872027749)'] ResolveCb: Failed with fault: (vmodl.fault.SystemError) {
Apr 8 11:24:44 Hostd: dynamicType = <unset>,
Apr 8 11:24:44 Hostd: faultCause = (vmodl.MethodFault) null,
Apr 8 11:24:44 Hostd: reason = "Failed to start migration pre-copy. Error 0xbad010d. The ESX hosts failed to connect over the VMotion network.
Apr 8 11:24:44 Hostd: ",
Apr 8 11:24:44 Hostd: msg = "",
Apr 8 11:24:44 Hostd: }
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'VMotionSrc (1302261872027749)'] Migration changed state from MIGRATING to DONE
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 37903B90 verbose 'VMotionSrc (1302261872027749)'] Finish called
Apr 8 11:24:44 Hostd: [2011-04-08 11:24:44.929 366A9B90 info 'vm:/vmfs/volumes/4c220e6f-01b124b3-f25d-e41f132dae86/twidmann_test/twidmann_test.vmx'] Disconnect check in progress.
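The error codes in the excerpts above can be searched for mechanically. The following is a minimal sketch, not a VMware tool: the function name, the SIGNATURES mapping and its descriptions are assumptions made for illustration, and only the three hex codes come from the log excerpts themselves. Because it scans line by line, it will miss a code that syslog has split across two lines, as in the hostd excerpt where "0xba" and "d010d" land on separate lines.

```python
# Error codes copied from the log excerpts; the descriptions are paraphrased
# summaries added for illustration, not official VMware text.
SIGNATURES = {
    "0xbad010d": "hosts failed to connect over the vMotion network",
    "0xbad0052": "broken pipe while sending migration data",
    "0xbad000a": "I/O error reported during migration",
}


def find_migration_errors(log_text):
    """Return (line_number, code, description) for each log line with a known code."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        for code, description in SIGNATURES.items():
            if code in lowered:
                hits.append((number, code, description))
    return hits


# Three lines taken from the vmkernel excerpts in this article.
SAMPLE = """WARNING: Migrate: 1153: 4225417790: Failed: I/O error (0xbad000a) @0x8d7c03
ESX hosts failed to connect over the VMotion network (0xbad010d) @0x0
WARNING: Migrate: 1243: 1298397818667246: Failed: Broken pipe (0xbad0052) @0x9efd5f"""

for number, code, description in find_migration_errors(SAMPLE):
    print(f"line {number}: {code} ({description})")
```

To use it against a real host, copy the vmkernel or messages log off the host and pass its contents to find_migration_errors.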
Resolution
This is a known issue affecting ESXi/ESX 4.0.
This issue is resolved in ESXi/ESX 4.0 Update 2 and is being investigated by VMware for the other affected versions.
To download ESXi/ESX 4.0 Update 2, see Download VMware vSphere.
If you cannot upgrade, or you experience these symptoms in a newer version, you can work around this issue by resetting the Migrate.Enabled setting on both the source and destination hosts.
Note: This issue may re-occur even after applying the workaround.
To reset the Migrate.Enabled setting:
1. Connect vSphere Client or VMware Infrastructure Client to your vCenter Server.
2. Click the ESX host.
3. Click the Configuration tab.
4. Click Advanced Settings under Software.
5. Select Migrate and change Migrate.Enabled to 0.
6. Click OK and then Close.
7. Click Advanced Settings again.
8. Select Migrate and change Migrate.Enabled to 1.
9. Click OK and then Close.
Note: If you see the invalid parameter error after resetting Migrate.Enabled to 1, see Performing a vMotion or adding a network card to a virtual machine fails with the error: Necessary module isn't loaded. (2013128)
If these steps do not resolve the issue, try increasing the timeout for migration network operations after Step 4, and then continue with the remaining steps. Repeat these steps on the destination host as well.
To increase the timeout for migration network operations:
- Click the Configuration tab.
- Click Advanced Settings under Software > Migrate.
- Change Migrate.NetTimeout to 60 seconds. The default is 20 seconds.
- Click OK and then Close.
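The client steps above may also have a command-line equivalent. The sketch below only builds and prints the command strings; it does not run anything. It assumes that the esxcfg-advcfg utility is available on the host console and that the option paths /Migrate/Enabled and /Migrate/NetTimeout correspond to the Migrate.Enabled and Migrate.NetTimeout settings. Neither assumption comes from the original article, so verify both on your host before relying on them. The function name is hypothetical.

```python
def migrate_reset_commands(net_timeout=None):
    """Build the workaround commands in the order described above.

    net_timeout: optional timeout in seconds. When given, the timeout is
    raised first, then Migrate.Enabled is set to 0 and back to 1.
    """
    commands = []
    if net_timeout is not None:
        # Assumed option path for Migrate.NetTimeout (article default: 20 seconds).
        commands.append(f"esxcfg-advcfg -s {int(net_timeout)} /Migrate/NetTimeout")
    # Assumed option path for Migrate.Enabled: disable, then re-enable.
    commands.append("esxcfg-advcfg -s 0 /Migrate/Enabled")
    commands.append("esxcfg-advcfg -s 1 /Migrate/Enabled")
    return commands


for command in migrate_reset_commands(net_timeout=60):
    print(command)
```

As with the client procedure, the resulting commands would need to be applied on both the source and the destination host.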
Source:
http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1013150