Configuration Considerations (ESX)
Hardware
CPU
Feature | Set to | Intel name | AMD name |
---|---|---|---|
Node Interleaving | Disabled (allows NUMA operation) | | |
Execute Protection | Enabled | eXecute Disable (XD) | No-Execute Page-Protection |
Virtualisation assist | Enabled | Intel VT | AMD-V |
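Once set, the result can be sanity-checked from the service console. A minimal sketch, assuming an ESX 4.x host with the standard esxcfg-info tool and a Linux-style service console (the exact output format varies by build):

```
# Check whether the hardware virtualisation assist is enabled
# (a value of 3 generally indicates supported and enabled in the BIOS)
esxcfg-info | grep "HV Support"

# Check the execute-protection (NX/XD) flag is visible to the service console
grep -o nx /proc/cpuinfo | uniq
```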
CPU Power vs Performance
If in doubt, set the server BIOS to maximum performance - this ensures that ESX can get the most out of the hardware; allowing the BIOS to balance power against performance, or to use low-power modes, may impact VM performance. ESX hosts are expected to work hard (that's how they save you money), and so they should be set up to be able to perform. In theory, allowing the motherboard to throttle back the CPUs when under low load shouldn't cause a problem.
When using ESX 4.1 or higher, set the BIOS to give the OS (ie ESX) control of CPU performance (if the setting is available); this allows CPU performance to be controlled dynamically by ESX as it manages VM load (and is configurable through the VI Client).
See VM KB 1018206 - Poor virtual machine application performance may be caused by processor power management settings for further info.
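Where the BIOS does hand control to the OS, the resulting policy can be inspected or changed through the host's advanced settings. A minimal sketch, assuming the ESX 4.1 Power.CpuPolicy advanced option and the service-console esxcfg-advcfg tool:

```
# Show the current CPU power management policy (eg static, dynamic, low)
esxcfg-advcfg -g /Power/CpuPolicy

# Let ESX manage CPU power dynamically based on VM load
esxcfg-advcfg -s dynamic /Power/CpuPolicy
```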
HP ASR
Should be disabled.
VMware don't recommend using the HP ASR feature (designed to restart a server in the event of an OS hang); they've come across occasions where an ESX host under load suddenly restarts due to ASR time-outs. See VM KB 1010842 - HP Automatic Server Recovery in a VMware ESX Environment for further info.
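ASR can be disabled in the RBSU, or from the service console if the HP management agents are installed. A minimal sketch, assuming HP's hpasmcli utility is present:

```
# Show the current ASR state
hpasmcli -s "show asr"

# Disable ASR so a busy host isn't rebooted by a watchdog time-out
hpasmcli -s "disable asr"
```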
Networking
Beacon Probing
Should only be used when there are three or more physical NICs assigned to the vSwitch, uplinked to the network switch.
This is so the ESX can properly determine the state of the network during a fault condition. If there are only two uplinks and the beacon is lost between the two NICs, the ESX can't know which uplink is faulty, just that there is a fault.
See VM KB 1005577 - What is beacon probing? for further info.
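Before enabling it, it's worth confirming the vSwitch really does have three or more uplinks. A minimal sketch using the standard service-console tools:

```
# List vSwitches with their uplink NICs - check the vSwitch has 3+ uplinks
esxcfg-vswitch -l

# List physical NICs with their link state and speed
esxcfg-nics -l
```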
Storage
ESX Installation Sizing
See VM KB 1026500 - Recommended disk or LUN sizes for VMware ESX/ESXi installations
SCSI Resets
When accessing centralised storage via SCSI, VMware recommends the following configuration (only the disabling of SCSI Device Resets is a change from the default). These settings are intended to limit the scope of SCSI Resets, and so reduce contention and overlapping of SCSI commands from different hosts accessing the same storage system.
Setting | Set to |
---|---|
Disk.UseLunReset | 1 |
Disk.UseDeviceReset | 0 |
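These can be changed through the VI Client's Advanced Settings, or from the service console. A minimal sketch using esxcfg-advcfg:

```
# Use LUN-level resets rather than whole-device resets
esxcfg-advcfg -s 1 /Disk/UseLunReset
esxcfg-advcfg -s 0 /Disk/UseDeviceReset
```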
Path Selection Policy (PSP)
- Active-Active (AA) - Storage array allows access to LUNs through all paths simultaneously.
- Active-Passive (AP) - Storage array allows access to LUNs through one storage processor at a time.
- Asymmetric (ALUA) - Storage array prioritises the paths available to access a LUN (see http://www.yellow-bricks.com/2009/09/29/whats-that-alua-exactly/)
Policy | For Arrays | Description |
---|---|---|
Most Recently Used (VMW_PSP_MRU) | All (default for AP arrays) | ESX uses whatever path is available, initially defaulting to the last used or the first detected at start-up |
Fixed (VMW_PSP_FIXED) | Active-Active (not for AP) | ESX uses the preferred path, unless it's not available. Can cause path thrashing with AP arrays |
Fixed AP (VMW_PSP_FIXED_AP) | All (though really for ALUA) | As for Fixed, but ESX picks the preferred path itself, using a path-thrashing avoidance algorithm |
Round Robin (VMW_PSP_RR) | All | ESX uses all available paths (will be limited by AP arrays) |
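The active PSP can be checked and changed per device. A minimal sketch, assuming the ESX 4.x esxcli namespace, with naa.xxx as a placeholder device ID:

```
# List devices with their current path selection policy
esxcli nmp device list

# Switch a device to Round Robin (naa.xxx is a placeholder)
esxcli nmp device setpolicy --device naa.xxx --psp VMW_PSP_RR
```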
Round Robin IOPS Load Balancing
The number of IOs that ESX will send down a path before switching to an alternate path to balance the load (so in this instance IOPS means IO operations, rather than IOs per second as it normally does).
Whether or not to change this is a contentious issue; the out-of-the-box default is 1000. HP state that when using their EVA storage systems you should set IOPS to 1, and some other vendors appear to use IOPS=1 in their own testing (eg EMC). My personal feeling is never to change any setting from the default unless you have good reason to. The more you change, the more you move away from an expected configuration, the more chance you have of exposing unexpected flaws and bugs, and the less chance a VMware support guy or gal will have of helping you resolve a problem quickly.
I'm not convinced that the results show a significant improvement in performance from changing the value, and where there is one, remember that these are isolated tests; increasing the rate of path switching increases ESX CPU usage, so it will be to the detriment of other performance metrics. So I would not change it as a default, but if you are experiencing performance problems it's worth considering (see the sketch below).
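If you do decide to experiment, the IOPS value is set per device. A minimal sketch, assuming the ESX 4.x esxcli syntax, with naa.xxx as a placeholder device ID:

```
# Show the current Round Robin settings for a device
esxcli nmp roundrobin getconfig --device naa.xxx

# Switch paths after every IO (the HP EVA recommendation)
esxcli nmp roundrobin setconfig --device naa.xxx --type iops --iops 1
```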
Some further reading...