
Monday, 13 April 2015

Hardware Assisted CPU & MMU Virtualization


Most processors from both Intel® and AMD include hardware features to assist virtualization and improve performance. These features (hardware-assisted CPU virtualization, MMU virtualization, and I/O MMU virtualization) are described below.
NOTE:- For more information about virtualization techniques, refer to doc1 and doc2.

Source: VMware Documentation

Hardware-Assisted CPU Virtualization (VT-x and AMD-V™)
Hardware-assisted CPU virtualization, called VT-x (in Intel processors) or AMD-V (in AMD processors), automatically traps sensitive events and instructions, eliminating the software overhead of monitoring all supervisory-level code for sensitive instructions. In this way, VT-x and AMD-V give the virtual machine monitor (VMM) the option of using either hardware-assisted virtualization (HV) or binary translation (BT). While HV outperforms BT for the vast majority of workloads, there are a few workloads where the reverse is true.

NOTE:- For a 64-bit guest operating system to run on an Intel processor, the processor must have hardware-assisted CPU virtualization.
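
As a rough illustration, the following Python sketch reports which of these hardware-assist features a processor advertises. It assumes a Linux system that exposes CPU feature flags in /proc/cpuinfo (this is not an ESXi tool), and the function name and sample text are hypothetical. The flag names vmx (VT-x), svm (AMD-V), ept and npt are the ones Linux uses.

```python
def virtualization_flags(cpuinfo_text):
    """Report hardware-assist features found in Linux /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Newer kernels list some VMX features on a separate "vmx flags" line.
        if key.strip() in ("flags", "vmx flags"):
            flags.update(value.split())
    return {
        "VT-x": "vmx" in flags,     # Intel hardware-assisted CPU virtualization
        "AMD-V": "svm" in flags,    # AMD hardware-assisted CPU virtualization
        "EPT": "ept" in flags,      # Intel hardware-assisted MMU virtualization
        "RVI/NPT": "npt" in flags,  # AMD hardware-assisted MMU virtualization
    }


# Hypothetical sample input; on a real Linux host, read /proc/cpuinfo instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme de pse lm vmx ept\n"
print(virtualization_flags(SAMPLE))
```

On a real host the text would come from `open("/proc/cpuinfo").read()`. A flag can be absent even on a capable processor if the feature is disabled in the BIOS.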

Hardware-Assisted MMU Virtualization (Intel EPT and AMD RVI) 
Hardware-assisted MMU virtualization, called rapid virtualization indexing (RVI) or nested page tables (NPT) in AMD processors and extended page tables (EPT) in Intel processors, addresses the overheads due to memory management unit (MMU) virtualization by providing hardware support to virtualize the MMU.

Without hardware-assisted MMU virtualization, the guest operating system maintains guest virtual memory to guest physical memory address mappings in guest page tables, while ESXi maintains “shadow page tables” that directly map guest virtual memory to host physical memory addresses. These shadow page tables are maintained for use by the processor and are kept consistent with the guest page tables. This allows ordinary memory references to execute without additional overhead, since the hardware translation lookaside buffer (TLB) will cache direct guest virtual memory to host physical memory address translations read from the shadow page tables. However, extra work is required to maintain the shadow page tables.

Hardware-assisted MMU virtualization allows an additional level of page tables that map guest physical memory to host physical memory addresses, eliminating the need for ESXi to maintain shadow page tables. This reduces memory consumption and speeds up workloads that cause guest operating systems to frequently modify page tables.

While hardware-assisted MMU virtualization improves the performance of the vast majority of workloads, it does increase the time required to service a TLB miss, thus potentially reducing the performance of workloads that stress the TLB. However, this increased TLB miss cost can usually be overcome by configuring the guest operating system and applications to use large pages.
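
The difference between the two approaches can be sketched with a toy model in Python. This is a simplified illustration, not how ESXi or the hardware is implemented: page tables are modelled as single-level dictionaries, and all names and page numbers are hypothetical. The last function uses the commonly cited worst-case count of memory references for a two-dimensional page walk, (n + 1)(m + 1) - 1 for n guest levels and m nested levels, which ignores TLB and page-walk caches.

```python
PAGE = 4096  # 4KB pages


def translate(table, addr):
    """Map an address through a toy single-level table (page -> page)."""
    page, offset = divmod(addr, PAGE)
    return table[page] * PAGE + offset


def build_shadow(guest_pt, nested_pt):
    """Software approach: precompute guest virtual -> host physical."""
    return {gv: nested_pt[gp] for gv, gp in guest_pt.items()}


def translate_nested(guest_pt, nested_pt, gva):
    """Hardware-assisted approach: two lookups, no shadow table to maintain."""
    gpa = translate(guest_pt, gva)   # guest virtual -> guest physical
    return translate(nested_pt, gpa)  # guest physical -> host physical


def nested_walk_refs(guest_levels, nested_levels):
    """Worst-case memory references for one 2D page walk on a TLB miss."""
    return (guest_levels + 1) * (nested_levels + 1) - 1


guest_pt = {0: 5, 1: 7}     # maintained by the guest operating system
nested_pt = {5: 42, 7: 13}  # maintained by the hypervisor
shadow_pt = build_shadow(guest_pt, nested_pt)

gva = 1 * PAGE + 0x10
print(translate(shadow_pt, gva) == translate_nested(guest_pt, nested_pt, gva))

# The guest remaps a page: the shadow table is stale until it is rebuilt,
# while the nested lookup picks up the change with no hypervisor work.
guest_pt[1] = 5
stale = translate(shadow_pt, gva)
fresh = translate_nested(guest_pt, nested_pt, gva)
shadow_pt = build_shadow(guest_pt, nested_pt)

print(nested_walk_refs(4, 4), "references nested vs", 4, "native")
```

The sketch shows both sides of the trade-off described above: guest page-table updates need no extra hypervisor work with nested tables, but each TLB miss walks two sets of tables instead of one.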

Large Memory Pages for Hypervisor and Guest Operating System
In addition to the usual 4KB memory pages, ESXi also provides 2MB memory pages (commonly referred to as “large pages”). ESXi assigns these 2MB machine memory pages to guest operating systems whenever possible; on systems with hardware-assisted MMU virtualization, ESXi does this even if the guest operating system doesn’t request them (though the full benefit of large pages comes only when the guest operating system and applications use them as well).

The use of large pages can significantly reduce TLB misses, improving the performance of most workloads, especially those with large active memory working sets. In addition, large pages can slightly reduce the per-virtual-machine memory space overhead. If an operating system or application can benefit from large pages on a native system, that operating system or application can potentially achieve a similar performance improvement on a virtual machine backed with 2MB machine memory pages. Consult the documentation for your operating system and application to determine how to configure each of them to use large memory pages.

Use of large pages can also change page sharing behavior. While ESXi ordinarily uses page sharing regardless of memory demands, it does not share large pages. Therefore, with large pages, page sharing might not occur until memory overcommitment is high enough to require the large pages to be broken into small pages. For further information see VMware KB articles 1021095 and 102189.
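
To see why large pages reduce TLB misses, it helps to compute the "TLB reach", the amount of memory the TLB can map at once. The Python sketch below is a back-of-the-envelope calculation; the entry count of 64 is a hypothetical value chosen for illustration, not a figure for any particular processor.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_size):
    """Bytes of memory covered when every TLB entry maps one page."""
    return entries * page_size


def pages_needed(working_set, page_size):
    """Number of pages required to back a working set (ceiling division)."""
    return -(-working_set // page_size)


ENTRIES = 64             # hypothetical TLB size, for illustration only
WORKING_SET = 1024 * MB  # a 1GB active working set

print(tlb_reach(ENTRIES, 4 * KB) // KB, "KB reach with 4KB pages")
print(tlb_reach(ENTRIES, 2 * MB) // MB, "MB reach with 2MB pages")
print(pages_needed(WORKING_SET, 4 * KB), "small pages for 1GB")
print(pages_needed(WORKING_SET, 2 * MB), "large pages for 1GB")
```

With the same number of entries, 2MB pages cover 512 times more memory than 4KB pages, so a large working set fits in far fewer translations.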

Configuring ESXi for Hardware-Assisted Virtualization
ESXi supports a variety of hardware-assisted virtualization features. Based on the available processor features and the guest operating system, ESXi chooses from among a variety of virtual machine monitor (VMM) modes. In the vast majority of cases this default behavior provides the best performance; overriding it will often, if anything, reduce performance. If desired, however, the default behavior can be changed, as described below.
NOTE:- When hardware-assisted MMU virtualization is enabled for a virtual machine, we strongly recommend that you also, when possible, configure that virtual machine’s guest operating system and applications to make use of large memory pages. When running on a system with hardware-assisted MMU virtualization enabled, ESXi will attempt to use large pages to back the guest’s memory pages even if the guest operating system and applications do not make use of large memory pages.

ESX Monitor Modes

VMware has supported Intel and AMD's virtualization assist since 2006.  Long before then we were using an all-software approach that we call binary translation (BT).  With the benefit of years of development and optimization, BT outperformed the early versions of hardware assist.  But as hardware assist evolved the use of these new features became more attractive.

Because our support for hardware assist is rich and BT is heavily optimized, the monitor can benefit from using either technology in different situations.  The following tables detail the defaults in ESX 4.0, which can be changed through VM settings if desired.

Monitor Defaults with Intel Processors

| VM Configuration | Core-i7 (Nehalem) | 45nm Core2 with VT-x | 65nm Core2 with VT-x and FlexPriority | 65nm Core2 with VT-x and No FlexPriority | P4 with VT-x | EM64T without VT-x | No EM64T |
|---|---|---|---|---|---|---|---|
| FT enabled | VT-x + SPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | Not runnable | Not runnable | Not runnable |
| 64-bit guests | VT-x + EPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | Not runnable | Not runnable |
| VMI enabled | BT + SPT | BT + SPT | BT + SPT | BT + SPT | BT + SPT | BT + SPT | BT + SPT |
| OpenServer, UnixWare, OS/2 | VT-x + EPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | BT + SPT | BT + SPT |
| 32-bit Linux and 32-bit FreeBSD | VT-x + EPT | VT-x + SPT | BT + SPT (*) | BT + SPT (*) | BT + SPT (*) | BT + SPT | BT + SPT |
| 32-bit Windows XP, Windows Vista, Windows Server 2003, Windows Server 2008 | VT-x + EPT | VT-x + SPT | VT-x + SPT | BT + SPT (*) | BT + SPT (*) | BT + SPT | BT + SPT |
| Windows 2000, Windows NT, DOS, Windows 95, Windows 98, Netware, 32-bit Solaris | BT + SPT (*) | BT + SPT (*) | BT + SPT (*) | BT + SPT (*) | BT + SPT (*) | BT + SPT | BT + SPT |
| All other 32-bit guests | VT-x + EPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | VT-x + SPT | BT + SPT | BT + SPT |

(*) When we use BT on an Intel system with VT-x capability, we dynamically switch to VT-x if the guest enters long mode.

Monitor Defaults with AMD Processors

| Configuration | Barcelona, Phenom, and Newer | AMD64 pre-Barcelona | No AMD64 |
|---|---|---|---|
| FT enabled | AMD-V + SPT | Not runnable | Not runnable |
| 64-bit guests | AMD-V + RVI | BT + SPT | Not runnable |
| VMI enabled | BT + SPT | BT + SPT | BT + SPT |
| OpenServer, UnixWare, OS/2 | AMD-V + RVI | BT + SPT | BT + SPT |
| 32-bit Linux and 32-bit FreeBSD | AMD-V + RVI | BT + SPT | BT + SPT |
| 32-bit Windows XP, Windows Vista, Windows Server 2003, Windows Server 2008 | AMD-V + RVI | BT + SPT | BT + SPT |
| Windows 2000, Windows NT, DOS, Windows 95, Windows 98, Netware, 32-bit Solaris | BT + SPT | BT + SPT | BT + SPT |
| All other 32-bit guests | AMD-V + RVI | BT + SPT | BT + SPT |
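
For scripting or quick reference, the AMD defaults listed above can be expressed as a small lookup table. The Python sketch below simply transcribes the ESX 4.0 defaults from this post; the identifiers (CPU_CLASSES, AMD_DEFAULTS, amd_default_monitor) are hypothetical names chosen for illustration and are not part of any VMware tool or API.

```python
# Column order matches the AMD defaults listed above.
CPU_CLASSES = ("Barcelona, Phenom, and Newer", "AMD64 pre-Barcelona", "No AMD64")

HW = "AMD-V + RVI"
SW = "BT + SPT"
NO = "Not runnable"

# One row per VM configuration, one value per CPU class (ESX 4.0 defaults).
AMD_DEFAULTS = {
    "FT enabled": ("AMD-V + SPT", NO, NO),
    "64-bit guests": (HW, SW, NO),
    "VMI enabled": (SW, SW, SW),
    "OpenServer, UnixWare, OS/2": (HW, SW, SW),
    "32-bit Linux and 32-bit FreeBSD": (HW, SW, SW),
    "32-bit Windows XP, Windows Vista, Windows Server 2003, "
    "Windows Server 2008": (HW, SW, SW),
    "Windows 2000, Windows NT, DOS, Windows 95, Windows 98, Netware, "
    "32-bit Solaris": (SW, SW, SW),
    "All other 32-bit guests": (HW, SW, SW),
}


def amd_default_monitor(configuration, cpu_class):
    """Look up the default monitor mode for a configuration and CPU class."""
    row = AMD_DEFAULTS[configuration]
    return row[CPU_CLASSES.index(cpu_class)]


print(amd_default_monitor("64-bit guests", "Barcelona, Phenom, and Newer"))
print(amd_default_monitor("FT enabled", "AMD64 pre-Barcelona"))
```

The lookup raises KeyError or ValueError for a configuration or CPU class that is not listed, which is a deliberate choice here so that typos are not silently mapped to a default.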

Legend

  • VT-x: Intel's virtualization hardware assist.
  • EPT: Extended Page Tables.  Intel's on-board, virtualization-aware memory management unit (MMU).
  • EM64T: Intel's 64-bit extensions to the x86 architecture.
  • SPT: Shadow page tables.  ESX's software memory management unit (i.e., not EPT or RVI).
  • BT: Binary translation.  ESX's software virtualization capability (i.e., not VT-x or AMD-V).
  • AMD-V: AMD's virtualization hardware assist.
  • RVI: Rapid Virtualization Indexing.  AMD's on-board, virtualization-aware memory management unit (MMU).

 

Source:-
VMware Technical Whitepaper
