Tuesday, 26 November 2013

The Management Network vSwitch is deleted on the ESXi host (1010992)

Purpose

This article provides steps for troubleshooting a situation where the Management Network vSwitch is deleted on the ESXi host.

Resolution

You must restore the network settings to the default settings.

Note: This procedure requires you to re-register virtual machines and to recreate VMkernel ports and vSwitches.
 
To restore your network settings:
  1. Use the Direct Console User Interface (DCUI) to connect to the ESXi host.
  2. Select Reset System Configuration.
  3. Reboot the ESXi host.
  4. Enter the networking information. For more information, see Configuring the ESXi Management Network from the direct console (1006710).
  5. Do a test ping using the DCUI. If successful, you can access the ESXi host using the VI Client.
  6. Re-register virtual machines and recreate your vSwitches. For more information, see the ESXi documentation.

Friday, 22 November 2013

Best practices for joining vCenter Servers in Linked Mode (2005481)

Purpose

This article provides best practices when working with vCenter Server Linked Mode, as well as steps to troubleshoot vCenter Server Linked Mode issues.

Resolution

Best practices

 
When working with vCenter Server Linked Mode, follow these best practices:
  • If the vCenter Server is joined to a domain, ensure that it can communicate with the Domain Controller. If Domain Controller communication problems exist, remove and add the vCenter Server to the Windows domain.
  • Ensure that all vCenter Server system times are synchronized with a time difference of no greater than 5 minutes.
  • Ensure that all vCenter Server instances are the same version and build. For more information, see Cannot access instances of vCenter Server in Linked Mode configuration after upgrading to vCenter Server 4.1 (1026346).
  • Ensure that the VirtualCenter Server Service uses an account with rights to log on as a service and as a batch job.
  • The vCenter Server Linked Mode Configuration tool must be run by a domain user that is also a local administrator on both machines where vCenter Server is installed.
  • Different Windows domains for vCenter Servers are permitted only if there is a two-way trust between the two domains. Ensure this is true from both Windows domains.
  • If User Account Control (UAC) is enabled, be sure to use Run as administrator when starting the vCenter Server Linked Mode Configuration tool.
  • Ensure that the vCenter Server Windows machine name matches the Domain/DNS name.

    Note: Instancename, VimWebServicesUrl, and VimApiUrl keys must match. For more information, see ESX and vCenter Server Installation Guide.
  • Ensure that the Windows firewall service is running but the firewall is turned off.
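
The five-minute time limit in the list above is easy to check mechanically. The following is a minimal Python sketch, not part of the original article: it assumes you have already collected one clock reading per vCenter Server at the same moment, and the host names and timestamps are hypothetical sample data.

```python
from datetime import datetime, timedelta

# Limit taken from the best practice above: no more than 5 minutes apart.
MAX_SKEW = timedelta(minutes=5)


def max_clock_skew(server_times):
    """Return the largest difference between any two reported clocks."""
    times = list(server_times.values())
    return max(times) - min(times)


def clocks_in_sync(server_times, limit=MAX_SKEW):
    """True when every pair of servers is within the allowed skew."""
    return max_clock_skew(server_times) <= limit


# Hypothetical readings taken at the same moment on each server.
readings = {
    "vc01.example.com": datetime(2013, 11, 22, 10, 0, 0),
    "vc02.example.com": datetime(2013, 11, 22, 10, 3, 30),
    "vc03.example.com": datetime(2013, 11, 22, 10, 6, 0),
}

print(max_clock_skew(readings))  # 0:06:00
print(clocks_in_sync(readings))  # False
```

Comparing the earliest and latest reading is enough, because the largest pairwise difference is always between those two.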

Verifying the initial replication

The Jointool/vCenter Server installer does a large set of checks to validate initial replication between instances. Issues with joining two instances are usually due to errors in initial replication. However, after a successful join (especially with more than two total instances in the vCenter Server linked mode group), some instances may not see all instances in the group.
 
To see if ADAM replication is the issue, perform these steps on all concerned vCenter Server machines:
  1. Click Start > Administrative Tools > ADSI Edit.
  2. Right-click ADSI Edit in the left pane and click Connect to.
  3. Under Connection Point in the Distinguished Name box, enter dc=virtualcenter,dc=vmware,dc=int
  4. Under Computer in the domain or server box, enter localhost:389, then click OK. This opens up a new connection to our application partition in ADAM.
  5. Expand Default naming context and drill down clicking the OU=Instances container on the left pane. You see entries (GUIDs) under OU=Instances for the vCenter Servers in your setup.

    This list should be identical on every replica (and the primary). It does not guarantee that replication will continue to succeed, but it does indicate that initial replication during installation was successful.
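
If you note down the entries under OU=Instances from each server, the comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the server names and GUID strings below are hypothetical placeholders that you would copy by hand from ADSI Edit, and the function name is not part of any VMware tool.

```python
def find_replication_gaps(instances_by_server):
    """Map each server to the instance GUIDs it is missing.

    instances_by_server: dict of server name -> list of GUID strings
    seen under OU=Instances on that server.
    """
    # The union of everything seen anywhere is the expected full list.
    all_guids = set()
    for guids in instances_by_server.values():
        all_guids.update(guids)

    gaps = {}
    for server, guids in instances_by_server.items():
        missing = all_guids - set(guids)
        if missing:
            gaps[server] = sorted(missing)
    return gaps


# Hypothetical values copied from ADSI Edit on three servers.
observed = {
    "vc01": ["GUID-A", "GUID-B", "GUID-C"],
    "vc02": ["GUID-A", "GUID-B", "GUID-C"],
    "vc03": ["GUID-A", "GUID-B"],
}

print(find_replication_gaps(observed))  # {'vc03': ['GUID-C']}
```

An empty result means every server lists the same instances, which matches the expectation described above.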

Verifying the Health service status

To verify the Health service status for the LDAP Replication Monitor, install the service-monitoring vSphere Client plugin as part of all vCenter Server installs:
  1. In vSphere Client, click Home.
  2. Click vCenter Server Service Status in the Administration section.

    Note: If you do not see vCenter Service Status, you have to enable the plugin by clicking Plug-ins > Manage plug-ins.

Troubleshooting replication issues

To troubleshoot replication issues:
  1. Click Start > Administrative Tools > Event Viewer.
    • Review the Event Viewer Log entries for events related to the ADAM instance (VMwareVCMSDS or something similar). Record any warning or error messages you find.
    • Warning messages involving replication are often explicit. For example:

      8453 Replication access was denied.
      1772 The list of RPC servers available for the binding of auto handles has been exhausted.

      Note: This error is often a symptom of firewalls blocking ports (the RPC endpoint mapper runs on port 135 and needs ports above 1024 to be open on the machine).
  2. Run the Knowledge Consistency Checker (KCC) from the command line to confirm that replication is the problem. Run KCC on the replica machine:

    • C:\Windows\ADAM\repadmin.exe /kcc localhost:389 (to confirm local consistency)
    • C:\Windows\ADAM\repadmin.exe /kcc remoteVCFQDN:remotePort (to confirm remote primary consistency)

      If either of these returns an error, report it to VMware when you open a Support Request.
  3. Forcing replication can help diagnose issues. To force replication between ADAM instances:

    C:\WINDOWS\ADAM>repadmin /replicate remote-vc:remote-vc-adam-port local-vc-fqdn:local-adam-port dc=virtualcenter,dc=vmware,dc=int

    This is an example of successful replication:

    C:\WINDOWS\ADAM>repadmin /replicate vm08.PDPVC.com:389 vm04.PDPVC.com:389 dc=virtualcenter,dc=vmware,dc=int
    Sync from vm04.PDPVC.com:389 to vm08.PDPVC.com:389 completed successfully.

    This is an example of failed replication:

    DsBindWithCred to vm04.pdpvc.com failed with status 1753 (0x6d9): There are no more endpoints available from the endpoint mapper
  4. To verify inbound and outbound replication from one machine, run the command:

    repadmin /syncall localhost:vc-ldap-port 
  5. Run directory service tests with dcdiag. This runs a comprehensive list of tests to help diagnose what may have failed with the replication (such as name resolution and/or referrals):

    (c:\windows\adam or c:\windows\system32) dcdiag /s:localhost:vc-ldap-port
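
When you collect output from several machines, it can help to sort it into successes and failures automatically. The Python sketch below is an assumption-laden illustration rather than a VMware or Microsoft tool: it only recognizes the two output shapes quoted in the steps above, and the status descriptions are the ones given in this article.

```python
import re

# Status codes and descriptions as quoted in the steps above.
KNOWN_STATUS = {
    8453: "Replication access was denied.",
    1772: "The list of RPC servers available for the binding of auto handles has been exhausted.",
    1753: "There are no more endpoints available from the endpoint mapper",
}

# Matches text such as "failed with status 1753 (0x6d9)".
STATUS_RE = re.compile(r"failed with status (\d+) \((0x[0-9a-fA-F]+)\)")


def classify_repadmin_output(text):
    """Return ("ok", None), ("failed", status_code) or ("unknown", None)."""
    if "completed successfully" in text:
        return ("ok", None)
    match = STATUS_RE.search(text)
    if match:
        return ("failed", int(match.group(1)))
    return ("unknown", None)


good = "Sync from vm04.PDPVC.com:389 to vm08.PDPVC.com:389 completed successfully."
bad = ("DsBindWithCred to vm04.pdpvc.com failed with status 1753 (0x6d9):"
       "There are no more endpoints available from the endpoint mapper")

print(classify_repadmin_output(good))  # ('ok', None)
state, code = classify_repadmin_output(bad)
print(state, code, KNOWN_STATUS.get(code))
```

Anything the function reports as "unknown" should still be read by hand, since repadmin can print messages that this sketch does not cover.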

Tuesday, 19 November 2013

vMA User Account Privileges

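Account Privileges for vCLI Usage lists the privileges that the different user accounts have for vCLI usage against different targets.
[Table: Account Privileges for vCLI Usage. Row and column labels are not recoverable; cell values in source order: Y, Y, N, Y, N, Y, Y, N, N, Y, N, Y]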

Enable the vi-user Account

Important: The vi-user account has limited privileges on the target ESXi hosts and cannot run any commands that require sudo execution. You cannot use vi-user to run commands for Active Directory targets (ESXi or vCenter Server). To run commands for the Active Directory targets, use the vi-admin user or log in as an Active Directory user to vMA.
To enable the account, run the Linux passwd command for vi-user with sudo. If this is the first time you use sudo on vMA, a message about root user privileges appears, and you are prompted for the vi-admin password.
When a user is logged in to vMA as vi-user, vMA uses that account on target ESXi hosts, and the user can run only commands on target ESXi hosts that do not require administrative privileges.
Thanks to VMware Documentation

Sunday, 17 November 2013

Metrics and Thresholds

Display | Metric | Threshold | Explanation
CPU | %RDY | 10 | Overprovisioning of vCPUs, excessive usage of vSMP, or a limit (check %MLMTD) has been set. See Jason’s explanation for vSMP VMs.
CPU | %CSTP | 3 | Excessive usage of vSMP. Decrease the number of vCPUs for this particular VM. This should lead to increased scheduling opportunities.
CPU | %SYS | 20 | The percentage of time spent by system services on behalf of the world. Most likely caused by a high-IO VM. Check other metrics and the VM for a possible root cause.
CPU | %MLMTD | 0 | The percentage of time the vCPU was ready to run but deliberately wasn’t scheduled because that would violate the “CPU limit” settings. If larger than 0, the world is being throttled due to the limit on CPU.
CPU | %SWPWT | 5 | VM waiting on swapped pages to be read from disk. Possible cause: memory overcommitment.
MEM | MCTLSZ | 1 | If larger than 0, the host is forcing VMs to inflate the balloon driver to reclaim memory because the host is overcommitted.
MEM | SWCUR | 1 | If larger than 0, the host has swapped memory pages in the past. Possible cause: overcommitment.
MEM | SWR/s | 1 | If larger than 0, the host is actively reading from swap (vswp). Possible cause: excessive memory overcommitment.
MEM | SWW/s | 1 | If larger than 0, the host is actively writing to swap (vswp). Possible cause: excessive memory overcommitment.
MEM | CACHEUSD | 0 | If larger than 0, the host has compressed memory. Possible cause: memory overcommitment.
MEM | ZIP/s | 0 | If larger than 0, the host is actively compressing memory. Possible cause: memory overcommitment.
MEM | UNZIP/s | 0 | If larger than 0, the host is accessing compressed memory. Possible cause: the host was previously overcommitted on memory.
MEM | N%L | 80 | If less than 80, the VM experiences poor NUMA locality. If a VM has a memory size greater than the amount of memory local to each processor, the ESX scheduler does not attempt to use NUMA optimizations for that VM and “remotely” uses memory via the “interconnect”. Check “GST_ND(X)” to find out which NUMA nodes are used.
NETWORK | %DRPTX | 1 | Dropped packets transmitted, hardware overworked. Possible cause: very high network utilization.
NETWORK | %DRPRX | 1 | Dropped packets received, hardware overworked. Possible cause: very high network utilization.
DISK | GAVG | 25 | Look at “DAVG” and “KAVG”, as the sum of both is GAVG.
DISK | DAVG | 25 | Disk latency most likely caused by the array.
DISK | KAVG | 2 | Disk latency caused by the VMkernel. High KAVG usually means queuing. Check “QUED”.
DISK | QUED | 1 | Queue maxed out. Possibly the queue depth is set too low. Check with the array vendor for the optimal queue depth value.
DISK | ABRTS/s | 1 | Aborts issued by the guest (VM) because storage is not responding. For Windows VMs this happens after 60 seconds by default. Can be caused, for instance, when paths have failed or the array is not accepting any IO for whatever reason.
DISK | RESETS/s | 1 | The number of commands reset per second.
DISK | CONS/s | 20 | SCSI Reservation Conflicts per second. If many SCSI Reservation Conflicts occur, performance could be degraded due to the lock on the VMFS.
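
The thresholds above can be applied to a set of esxtop readings with a small lookup. The Python sketch below is an illustration only. It assumes a value is worth flagging when it is above zero and at or over the listed threshold (N%L is the exception and is flagged when it drops below 80); the table does not spell out the exact comparison, and the sample readings are hypothetical.

```python
# Thresholds copied from the table above (metric name -> threshold).
THRESHOLDS = {
    "%RDY": 10, "%CSTP": 3, "%SYS": 20, "%MLMTD": 0, "%SWPWT": 5,
    "MCTLSZ": 1, "SWCUR": 1, "SWR/s": 1, "SWW/s": 1,
    "CACHEUSD": 0, "ZIP/s": 0, "UNZIP/s": 0,
    "%DRPTX": 1, "%DRPRX": 1,
    "GAVG": 25, "DAVG": 25, "KAVG": 2, "QUED": 1,
    "ABRTS/s": 1, "RESETS/s": 1, "CONS/s": 20,
}

# Metrics where a LOW value is the problem.
MIN_THRESHOLDS = {"N%L": 80}


def flag_metrics(sample):
    """Return the metric names in `sample` that cross their threshold."""
    flagged = []
    for name, value in sample.items():
        if name in MIN_THRESHOLDS:
            if value < MIN_THRESHOLDS[name]:
                flagged.append(name)
        elif name in THRESHOLDS:
            # Assumption: flag when above zero and at or over the threshold.
            if value > 0 and value >= THRESHOLDS[name]:
                flagged.append(name)
    return flagged


# Hypothetical readings for one VM.
sample = {"%RDY": 12.5, "%CSTP": 0.4, "N%L": 72, "DAVG": 8.0, "KAVG": 0.1, "ZIP/s": 0}

print(flag_metrics(sample))  # ['%RDY', 'N%L']
```

Metrics that are not in the table are ignored rather than reported, so unknown counters do not produce false alarms.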

Check the name of the default server you are connected to in PowerCLI

The PowerCLI command to check this is:
$global:DefaultVIServers | %{$_.Name}


Saturday, 16 November 2013

Advanced Memory Attributes

Attribute | Description | Default
Mem.SamplePeriod | Specifies the periodic time interval, measured in seconds of the virtual machine’s execution time, over which memory activity is monitored to estimate working set sizes. | 60
Mem.BalancePeriod | Specifies the periodic time interval, in seconds, for automatic memory reallocations. Significant changes in the amount of free memory also trigger reallocations. | 15
Mem.IdleTax | Specifies the idle memory tax rate, as a percentage. This tax effectively charges virtual machines more for idle memory than for memory they are actively using. A tax rate of 0 percent defines an allocation policy that ignores working sets and allocates memory strictly based on shares. A high tax rate results in an allocation policy that allows idle memory to be reallocated away from virtual machines that are unproductively hoarding it. | 75
Mem.ShareScanGHz | Specifies the maximum amount of memory pages to scan (per second) for page sharing opportunities for each GHz of available host CPU resource. For example, defaults to 4 MB/sec per 1 GHz. | 4
Mem.ShareScanTime | Specifies the time, in minutes, within which an entire virtual machine is scanned for page sharing opportunities. Defaults to 60 minutes. | 60
Mem.CtlMaxPercent | Limits the maximum amount of memory reclaimed from any virtual machine using the memory balloon driver (vmmemctl), based on a percentage of its configured memory size. Specify 0 to disable reclamation for all virtual machines. | 65
Mem.AllocGuestLargePage | Enables backing of guest large pages with host large pages. Reduces TLB misses and improves performance in server workloads that use guest large pages. 0 = disable. | 1
Mem.AllocUsePSharePool and Mem.AllocUseGuestPool | Reduces memory fragmentation by improving the probability of backing guest large pages with host large pages. If host memory is fragmented, the availability of host large pages is reduced. 0 = disable. | 15
Mem.MemZipEnable | Enables memory compression for the host. 0 = disable. | 1
Mem.MemZipMaxPct | Specifies the maximum size of the compression cache in terms of the maximum percentage of each virtual machine's memory that can be stored as compressed memory. | 10
LPage.LPageDefragEnable | Enables large page defragmentation. 0 = disable. | 1
LPage.LPageDefragRateVM | Maximum number of large page defragmentation attempts per second per virtual machine. Accepted values range from 1 to 1024. | 32
LPage.LPageDefragRateTotal | Maximum number of large page defragmentation attempts per second. Accepted values range from 1 to 10240. | 256
LPage.LPageAlwaysTryForNPT | Try to allocate large pages for nested page tables (called 'RVI' by AMD or 'EPT' by Intel). If you enable this option, all guest memory is backed with large pages in machines that use nested page tables (for example, AMD Barcelona). If NPT is not available, only some portion of guest memory is backed with large pages. 0 = disable. | 1
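
To get a feel for what the percentage-based defaults mean for a given virtual machine, the arithmetic can be sketched in Python. This is a rough illustration only: the function names are hypothetical, and it assumes that Mem.MemZipMaxPct is applied to the configured memory size in the same way the table describes for Mem.CtlMaxPercent.

```python
def balloon_limit_mb(configured_mb, ctl_max_percent=65):
    """Most memory the balloon driver may reclaim (Mem.CtlMaxPercent)."""
    return configured_mb * ctl_max_percent / 100


def compression_cache_limit_mb(configured_mb, mem_zip_max_pct=10):
    """Largest compression cache for one VM (Mem.MemZipMaxPct).

    Assumption: the percentage is taken of the configured memory size.
    """
    return configured_mb * mem_zip_max_pct / 100


def share_scan_rate_mb_per_sec(host_ghz, share_scan_ghz=4):
    """Upper bound on page-sharing scan rate (Mem.ShareScanGHz)."""
    return host_ghz * share_scan_ghz


vm_mb = 8192  # hypothetical VM with 8 GB of configured memory

print(balloon_limit_mb(vm_mb))            # 5324.8
print(compression_cache_limit_mb(vm_mb))  # 819.2
print(share_scan_rate_mb_per_sec(20))     # 80
```

With the defaults, an 8 GB virtual machine could have up to about 5.2 GB reclaimed by ballooning, and a host with 20 GHz of CPU resource would scan at most 80 MB per second for sharing opportunities.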

vNUMA is disabled if VCPU hotplug is enabled (2040375)

Details

If virtual NUMA is configured with VCPU hotplug settings, the virtual machine will be started without virtual NUMA and instead it will use Uniform Memory Access with interleaved memory access. The virtual machine log displays the message:
 
vmware.log> vmx| W110: NUMA and VCPU hot add are incompatible. Forcing UMA

Solution

None. If you do not plan to use VCPU hotplug, do not enable it. Add the maximum VCPUs that might be needed by the workload.
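
To check whether a virtual machine was affected, you can search its vmware.log for the warning quoted above. The following Python sketch is a minimal illustration, not a VMware utility: the function name is hypothetical, and it matches only the message text, since the prefix in front of it may differ between log lines.

```python
# Message text taken from the log line quoted above.
UMA_WARNING = "NUMA and VCPU hot add are incompatible. Forcing UMA"


def vnuma_disabled_by_hotplug(log_lines):
    """True if any log line contains the 'Forcing UMA' warning."""
    return any(UMA_WARNING in line for line in log_lines)


# The line from this article, used as sample input.
sample_log = [
    "vmx| W110: NUMA and VCPU hot add are incompatible. Forcing UMA",
]

print(vnuma_disabled_by_hotplug(sample_log))  # True
print(vnuma_disabled_by_hotplug([]))          # False
```

For a real log, pass an open file object instead of the list, for example `with open(path) as f: vnuma_disabled_by_hotplug(f)`, where `path` is the location of the virtual machine's vmware.log.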

NPIV Capabilities and Limitations

  • NPIV supports vMotion. When you use vMotion to migrate a virtual machine, it retains the assigned WWN.
  • If you migrate an NPIV-enabled virtual machine to a host that does not support NPIV, VMkernel reverts to using a physical HBA to route the I/O.
  • If your FC SAN environment supports concurrent I/O on the disks from an active-active array, the concurrent I/O to two different NPIV ports is also supported.

When you use ESXi with NPIV, the following limitations apply:
  • Because the NPIV technology is an extension to the FC protocol, it requires an FC switch and does not work on direct-attached FC disks.
  • When you clone a virtual machine or template with a WWN assigned to it, the clones do not retain the WWN.
  • NPIV does not support Storage vMotion.
  • Disabling and then re-enabling the NPIV capability on an FC switch while virtual machines are running can cause an FC link to fail and I/O to stop.