VCP4: Difference between revisions
m (→Perform Basic Troubleshooting for ESX/ESXi Hosts: Added networking) |
(Added VCP cat and Meta) |
||
(54 intermediate revisions by the same user not shown) | |||
Line 2: | Line 2: | ||
* [http://communities.vmware.com/community/vmtn/certedu/certification/vcp VMware VCP Forum] | * [http://communities.vmware.com/community/vmtn/certedu/certification/vcp VMware VCP Forum] | ||
* [http://mylearn.vmware.com/lcms/mL_faq/2726/VMware%20Certified%20Professional%20on%20vSphere%204%20Blueprint%208.13.09.pdf VCP4 Blueprint] | * [http://mylearn.vmware.com/lcms/mL_faq/2726/VMware%20Certified%20Professional%20on%20vSphere%204%20Blueprint%208.13.09.pdf VCP4 Blueprint] | ||
* [http://www.vmware.com/support/pubs/vs_pages/vsp_pubs_esx40_vc40.html | * VMware vSphere Documentation: [http://www.vmware.com/support/pubs/vs_pages/vsp_pubs_esx40_vc40.html PDF] [http://pubs.vmware.com/vsp40 HTML] (HTML version is good for searching) | ||
* [http://thinkvirtually.co.uk/#/overview/4535842936 Scott Vessey] | |||
* [http://www.simonlong.co.uk/blog/vcp-vsphere-upgrade-study-notes/ Simon Long blog] | * [http://www.simonlong.co.uk/blog/vcp-vsphere-upgrade-study-notes/ Simon Long blog] | ||
Line 26: | Line 27: | ||
** Troubleshoot | ** Troubleshoot | ||
* If no DHCP found during install, default 169.254.0.0 / 16 address assigned | * If no DHCP found during install, default 169.254.0.0 / 16 address assigned | ||
'''Required Partitions''' | |||
{|cellpadding="4" cellspacing="0" border="1" | |||
|- style="background-color:#bbddff;" | |||
! Mount !! Type !! Size !! Description | |||
|- | |||
| <code>/boot</code> || ext3 || 1100MB || Boot disk req 1.25GB free space, includes <code>/boot</code> and <code>vmkcore</code> partitions | |||
|- | |||
| || swap || 600MB || ESX swap, 600MB req, 1.6GB max (use default at install) | |||
|- | |||
| <code> / </code> || ext3 || 5 GB || ESX OS and services, 3rd party apps | |||
|- | |||
| || VMFS || 1200MB || Service Console (esxconsole.vmdk), not ESXi | |||
|- | |||
| || <code> vmkcore </code> || 1.25GB || Core debugging dumps | |||
|} | |||
'''Optional Partitions''' | |||
{|cellpadding="4" cellspacing="0" border="1" | |||
|- style="background-color:#bbddff;" | |||
! Mount !! Type !! Size !! Description | |||
|- | |||
| <code>/home</code> || ext3 || 512MB || ESX user accounts | |||
|- | |||
| <code>/tmp</code> || ext3 || 1024MB || Temp files! | |||
|- | |||
| <code>/usr </code> || ext3 || || User programs and data (3rd party apps) | |||
|- | |||
|<code>/var/log</code>|| ext3 || 2000MB || Log files | |||
|- | |||
|} | |||
'''vSphere Editions''' | '''vSphere Editions''' | ||
Line 38: | Line 72: | ||
| Standard || Essentials + HA | | Standard || Essentials + HA | ||
|- | |- | ||
| Advanced || Standard + 12 cores/CPU, Hot Add, FT, vShield | | Advanced || Standard + 12 cores/CPU, Hot Add, FT, VMotion, vShield, Data Recovery | ||
|- | |- | ||
| Enterprise || Advanced + 6 cores/CPU, Storage vMotion, Data Recovery, DRS | | Enterprise || Advanced + 6 cores/CPU, Storage vMotion, Data Recovery, DRS | ||
|- | |- | ||
| Enterprise Plus || 12 cores/CPU, 8way vSMP, | | Enterprise Plus || 12 cores/CPU, 8way vSMP, 1TB/ESX, vNetwork Distributed Switch, Host Profiles, 3rd Party Multipathing | ||
|- | |- | ||
| vCentre Foundation || Fully featured, but limited to managing 3 ESX's | | vCentre Foundation || Fully featured, but limited to managing 3 ESX's | ||
Line 48: | Line 82: | ||
== Upgrade VMware ESX/ESXi == | == Upgrade VMware ESX/ESXi == | ||
'''Prerequisites''' | |||
* <code> /boot </code> partition must be at least 100 MB | |||
'''Pre-Upgrade Backups''' | '''Pre-Upgrade Backups''' | ||
* Backup ESX Host Config | * Backup ESX Host Config | ||
Line 94: | Line 131: | ||
== Secure VMware ESX/ESXi == | == Secure VMware ESX/ESXi == | ||
* ESX firewall - primary source of protection for Service Console | |||
* Weak ciphers are disabled, all communications are secured by SSL certificates | |||
* Tomcat Web service has been modified to limit functionality (to avoid general Tomcat vulnerabilities) | |||
* Insecure services (eg FTP, Telnet) are not installed, and ports blocked by the firewall | |||
* TCP 443 - Service Console, vmware-authd | |||
* TCP 902 - VMkernel, vmkauthd | |||
== Install VMware ESX/ESXi on SAN Storage == | |||
'''Boot from SAN''' | |||
* HBA must be located in lowest PCI bus and slot number | |||
* HBA BIOS must designate the FC card as a boot controller | |||
* The FC card must initiate a primitive connection to the boot LUN | |||
* Each ESX must have its own boot LUN | |||
** SAN storage paths can be masked using <code> esxcli corestorage claimrule </code> (PSA claim) rules to select which available LUN's are claimed | |||
* iSCSI must use a hardware initiator (impossible to boot using software iSCSI) | |||
'''FC boot from SAN set-up''' | |||
* Configure/create boot LUN | |||
* Enable boot from HBA in system's BIOS and in HBA's BIOS | |||
* Select the LUN to boot from in HBA BIOS | |||
'''iSCSI boot from SAN set-up''' | |||
* Configure storage ACL so that only correct ESX has access to correct boot LUN (must be LUN 0 or LUN 255) | |||
* Enable boot from HBA in system's BIOS and in HBA's BIOS | |||
* Configure target to boot from in HBA's BIOS | |||
== Identify vSphere Architecture and Solutions == | == Identify vSphere Architecture and Solutions == | ||
Line 103: | Line 162: | ||
* Server | * Server | ||
* ESXi (standalone, free) | * ESXi (standalone, free) | ||
'''vSphere Features etc''' | |||
* '''VMsafe''' - API to enable 3rd party security products to control and protect | |||
** Memory and CPU - Introspection of VM memory pages and CPU states | |||
** Networking - Filtering of packets inside hypervisor (vSwitches) | |||
** Process Execution - In guest (VM), in process API's effectively allowing monitoring and control of process execution (agent-less AV) | |||
** Storage - VM disks can be mounted etc (agent-less AV) | |||
* '''vShield''' - Appliance utilising VMsafe to provide security and compliance | |||
'''Datacentre Solutions''' | '''Datacentre Solutions''' | ||
Line 136: | Line 203: | ||
* VLAN - Traditional single VLAN assignment to a port group | * VLAN - Traditional single VLAN assignment to a port group | ||
* VLAN Trunking - Multiple VLAN's can be assigned to a dv Port Group | * VLAN Trunking - Multiple VLAN's can be assigned to a dv Port Group | ||
* Private VLAN - Allows Private VLANs ( | * Private VLAN - Allows Private VLANs | ||
** VLANs over a VLAN, the VLAN equivalent of subnetting. Hosts on differing subVLANs may be in same IP range, but need to go via router to communicate. | |||
** Primary (promiscuous) VLAN uplinks to rest of network | |||
** See http://blog.internetworkexpert.com/2008/07/14/private-vlans-revisited/ | |||
'''Traffic Shaping''' | |||
* Can be applied to both inbound and outbound traffic | |||
* Can be set per dvPort (dvPort Group must allow overrides) | |||
'''Service Console ports''' | '''Service Console ports''' | ||
Line 148: | Line 222: | ||
= Configure ESX/ESXi Storage = | = Configure ESX/ESXi Storage = | ||
== Configure FC SAN Storage == | == Configure FC SAN Storage == | ||
'''Storage Device Naming''' | |||
* '''Name''' - A ''friendly'' name based on storage type and manufacturer. User changeable, kept consistent across ESX's | |||
* '''Identifier''' - Globally unique, human unintelligible. Persistent through reboot and consistent across ESX's | |||
* '''Runtime Name''' - The first path to a device, created by host and not persistent. Of format '''<code>vmhba#:C#:T#:L#</code>''' | |||
** vmhba - Storage Adapter number | |||
** C - Storage Channel number (software iSCSI uses this to represent multiple paths to same target) | |||
** T - Target | |||
** L - LUN (provided by storage system; if only 1 LUN it's always L0) | |||
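The runtime name format above can be split into its four numeric parts. A minimal Python sketch; the `RuntimeName` type and `parse_runtime_name` function are illustrative names, not part of any VMware tool:

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: int  # vmhba - storage adapter number
    channel: int  # C - storage channel number
    target: int   # T - target
    lun: int      # L - LUN


# Matches names of the form vmhba#:C#:T#:L#
_RUNTIME_NAME = re.compile(r"^vmhba(\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a runtime name into adapter, channel, target and LUN numbers."""
    match = _RUNTIME_NAME.match(name)
    if match is None:
        raise ValueError(f"not a runtime name: {name!r}")
    return RuntimeName(*(int(part) for part in match.groups()))
```

For example, `parse_runtime_name("vmhba1:C0:T2:L0")` gives adapter 1, channel 0, target 2, LUN 0. Because the runtime name is not persistent, a parsed value should not be stored as a long-term device reference.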
'''PSA - Pluggable Storage Architecture''' | '''PSA - Pluggable Storage Architecture''' | ||
* Manages storage multipathing | * Manages storage multipathing | ||
Line 153: | Line 237: | ||
* Native Multipathing Plugin (NMP) provided by default, can have sub-plugins (can be either VMware or 3rd party) | * Native Multipathing Plugin (NMP) provided by default, can have sub-plugins (can be either VMware or 3rd party) | ||
** Storage Array Type Plugin (SATP) - unique to a particular array (effectively an array driver, like a standard PC hardware driver) | ** Storage Array Type Plugin (SATP) - unique to a particular array (effectively an array driver, like a standard PC hardware driver) | ||
** Path Selection Plugin (PSP) | ** Path Selection Plugin (PSP) - default assigned by NMP based on the SATP | ||
* Multipathing Plugin (MPP) - 3rd party, can run alongside or in addition to Native Multipathing Plugin | * Multipathing Plugin (MPP) - 3rd party, can run alongside or in addition to Native Multipathing Plugin, | ||
'''PSA operations''' | '''PSA operations''' | ||
Line 165: | Line 250: | ||
* Handles physical path discovery and removal | * Handles physical path discovery and removal | ||
* Provides logical device and physical path I/O stats | * Provides logical device and physical path I/O stats | ||
'''MPP / NMP operations''' | '''MPP / NMP operations''' | ||
Line 174: | Line 260: | ||
** Depending on storage device, perform specific actions necessary to handle path failures and I/O cmd retries | ** Depending on storage device, perform specific actions necessary to handle path failures and I/O cmd retries | ||
* Support management tasks, EG abort or reset of logical devices | * Support management tasks, EG abort or reset of logical devices | ||
'''PSP types''' | |||
Default (VMware) PSP Types (3rd party PSP's can be installed)... | |||
* '''Most Recently Used''' - Good for either Active/Active or Active/Passive | |||
* '''Fixed''' - Can cause path thrashing when used with Active/Passive | |||
* '''Round Robin''' - Load balanced | |||
'''PSA Claim Rules''' | '''PSA Claim Rules''' | ||
Used to define which paths should be used by a particular plugin module | * Used to define which paths should be used by a particular plugin module | ||
'''LUN Masking''' | '''LUN Masking''' | ||
Used to prevent an ESX from seeing LUN's or using individual paths to a LUN | * Used to prevent an ESX from seeing LUN's or using individual paths to a LUN | ||
Add and load a claim rule to apply | * Add and load a claim rule to apply | ||
== Configure iSCSI SAN Storage == | == Configure iSCSI SAN Storage == | ||
'''''Most of the FC SAN Storage info above is also applicable here''''' | |||
'''CHAP Authentication''' | |||
* '''One-way CHAP''' - Unidirectional, iSCSI target authenticates the initiator (ESX) only | |||
* '''Mutual CHAP''' - Bidirectional, ESX also authenticates the iSCSI target (''Software iSCSI only'') | |||
'''Multipathing (software iSCSI)''' | |||
# Set-up a vSwitch with two VMkernel ports and two uplinks | |||
# For each VMkernel port, edit ''NIC Teaming'' | ''Override vSwitch failover order'' to bind one uplink each | |||
# Connect the iSCSI initiator to each VMkernel port | |||
#* <code> esxcli swiscsi nic add -n <vmk_port_name> -d <vmhba_no> </code> | |||
== Configure NFS Datastores == | == Configure NFS Datastores == | ||
* ESX's manage exclusive access to files via <code>.lc-XXX</code> lock files | * ESX supports NFS v3 on TCP ''only'' | ||
* ESX's manage exclusive access to files via <code> .lc-XXX </code> lock files | |||
* To use jumbo frames, enable on the vSwitch and the VMkernel port(s) | |||
** Frames up to 9kB are supported | |||
== Configure and Manage VMFS Datastores == | == Configure and Manage VMFS Datastores == | ||
Line 212: | Line 324: | ||
{|cellpadding="4" cellspacing="0" border="1" | {|cellpadding="4" cellspacing="0" border="1" | ||
|- style="background-color:#bbddff;" | |- style="background-color:#bbddff;" | ||
! Plug-In !! Description | ! Plug-In !! Description | ||
|- | |||
| Storage Monitoring || [Default] | |||
|- | |||
| Service Status || [Default] Displays health of services on the VC | |||
|- | |||
| Hardware Status || [Default] Displays ESX hardware health (CIM monitoring) | |||
|- | |- | ||
| Update Manager || | | Update Manager || | ||
Line 235: | Line 353: | ||
* (Win) Sysprep must be installed on VC | * (Win) Sysprep must be installed on VC | ||
* (Linux) Guest OS must have Perl installed | * (Linux) Guest OS must have Perl installed | ||
'''vCenter Maps''' | |||
* Provide an overview of relationships for | |||
** Host Resources | |||
** VM Resources | |||
** Datastore Resources | |||
== Configure Access Control == | == Configure Access Control == | ||
Line 266: | Line 390: | ||
* VM Hardware v4 runs on ESX3 or ESX4, v7 runs on ESX4 only | * VM Hardware v4 runs on ESX3 or ESX4, v7 runs on ESX4 only | ||
* VM's running MS Windows should have SCSI TimeoutValue changed to 60 secs to allow Windows to tolerate delayed SAN I/O from path failovers | * VM's running MS Windows should have SCSI TimeoutValue changed to 60 secs to allow Windows to tolerate delayed SAN I/O from path failovers | ||
'''Disk Types''' | '''Disk Types''' | ||
* Thick - traditional (can convert to Thin via Storage vMotion) | * Thick - traditional (can convert to Thin via Storage vMotion) | ||
* Thin - minimal space usage (conversion to Thick | * Thin - minimal space usage (conversion to Thick requires VM downtime) | ||
Can't specify for NFS stores (controlled by the NFS server itself) | |||
'''Memory''' | '''Memory''' | ||
* Minimum of 4MB, increments of 4MB | * Minimum of 4MB, increments of 4MB | ||
* Maximum for best performance - threshold over which a VM's performance will be degraded if memory size exceeded (varies dependent on load on ESX) | * Maximum for best performance - threshold over which a VM's performance will be degraded if memory size exceeded (varies dependent on load on ESX) | ||
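The sizing rule above (4MB minimum, 4MB increments) can be checked with simple arithmetic. A small Python sketch; the function name is made up for illustration and no upper limit is checked, since the notes do not give one:

```python
def is_valid_vm_memory_mb(size_mb: int) -> bool:
    """True if the size is at least 4MB and a multiple of 4MB."""
    return size_mb >= 4 and size_mb % 4 == 0
```

So 2048MB passes, while 6MB is rejected because it is not a multiple of 4.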
'''SCSI Controller Types''' | '''SCSI Controller Types''' | ||
Line 283: | Line 411: | ||
** Only VM h/ware v7 with Win2k3, Win2k8 or Red Hat Ent v5 | ** Only VM h/ware v7 with Win2k3, Win2k8 or Red Hat Ent v5 | ||
** Not supported with | ** Not supported with | ||
*** Record/replay | *** Record/replay | ||
*** Fault Tolerance | *** Fault Tolerance | ||
*** MSCS Clustering (so also SQL clusters) | *** MSCS Clustering (so also SQL clusters) | ||
*** ''[Boot disks - not an issue since ESX4.0 Update 1]'' | |||
'''N-port ID virtualization (NPIV)''' | '''N-port ID virtualization (NPIV)''' | ||
Line 293: | Line 422: | ||
* ESX's HBA's must support NPIV | * ESX's HBA's must support NPIV | ||
* NPIV enabled VM's are assigned 4 NPIV WWN's | * NPIV enabled VM's are assigned 4 NPIV WWN's | ||
* Storage vMotion is not supported | |||
'''vNICs''' | '''vNICs''' | ||
Line 299: | Line 430: | ||
* '''VMXNET2''' - Aka enhanced VMXNET, supports jumbo frames and TSO, limited OS support | * '''VMXNET2''' - Aka enhanced VMXNET, supports jumbo frames and TSO, limited OS support | ||
* '''VMXNET3''' - Performance driver, only supported on VM hardware v7, and limited OS's | * '''VMXNET3''' - Performance driver, only supported on VM hardware v7, and limited OS's | ||
'''VMDirectpath''' | |||
Allows direct access to PCI devices (aka passthrough devices), using it inhibits | |||
* VMotion | |||
* Hot add | |||
* Suspend and resume, Record and replay | |||
* Fault Tolerance | |||
* HA | |||
An orange icon when trying to add a passthrough device indicates that the device has changed and the ESX must be bounced before it can be used. | |||
'''VMI Paravirtualisation''' | |||
Enables improved performance for supported VM (Linux only currently), by allowing VM to communicate with hypervisor | |||
* Uses 1 of VM's 6 vPCI slots | |||
* Must be supported by ESX (VM can be cold migrated to unsupported ESX, with perf hit) | |||
'''vCenter Converter''' | |||
Features/functionality... | |||
* P2V | |||
* Convert/import other format VM's (eg VMware Workstation, MS Virtual Server) | |||
* Convert 3rd party backup or disk images | |||
* Restore VCB backup images | |||
* Export VM's to other VMware VM formats | |||
* Make VM's bootable | |||
* Customise existing VM's | |||
Requires the following ports | |||
* Windows: TCP 139, 443, 445, 902 | |||
* Linux: TCP 22, 443, 902, 903 | |||
'''Guided Consolidation''' | |||
* Active Domains - Systems being analysed need to be a member of an active domain | |||
* Add to Analysis to analyse new systems, max 100 concurrent, can take 1hr for new analysis to start | |||
* Confidence - Degree to which VC collected perf data, and how good a candidate | |||
** High confidence is shown after 24 hrs, if workload varies over greater interval, further analysis is required | |||
* New VM's disk size = Amount used on physical x 1.25 | |||
* Convert manually to be able to specify new VM's settings | |||
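The disk sizing rule above is plain arithmetic. A Python sketch, assuming sizes are expressed in GB (the unit and the function name are assumptions for illustration):

```python
def consolidated_disk_size_gb(used_gb: float) -> float:
    """New VM disk size = amount used on the physical system x 1.25."""
    return used_gb * 1.25
```

A physical system using 40GB would therefore get a 50GB virtual disk. Converting manually is the way to choose a different size.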
== Manage Virtual Machines == | == Manage Virtual Machines == | ||
Line 305: | Line 479: | ||
* VM hardware version is v7 | * VM hardware version is v7 | ||
* vCPU's can only be added if "CPU Hot Plug" is enabled in the VM's options | * vCPU's can only be added if "CPU Hot Plug" is enabled in the VM's options | ||
'''Virtualized Memory Management Unit (MMU)''' | |||
* Maintains mapping between VM's guest OS ''physical'' memory to underlying hosts ''machine'' memory | |||
* Intercepts VM instructions that would manipulate memory, so that CPU's MMU is not updated directly. | |||
== Deploy vApps == | == Deploy vApps == | ||
vApp - An enhanced resource pool to run a contained group of VM's, can be created under the following conditions | |||
* A host is selected in the inventory that is running ESX3 or later | |||
* A DRS-enabled cluster is selected in the inventory | |||
* Name up to 80 chars | |||
'''Deploying an OVF template''' | '''Deploying an OVF template''' | ||
* Non-OVF format appliances can be converted using the VMware vCentre Converter module | * Non-OVF format appliances can be converted using the VMware vCentre Converter module | ||
Line 313: | Line 497: | ||
** Transient - VCentre manages a pool of available IP's | ** Transient - VCentre manages a pool of available IP's | ||
** DHCP | ** DHCP | ||
= Manage Compliance = | = Manage Compliance = | ||
Line 326: | Line 508: | ||
** <code> <patchStore> </code> - Location of downloaded patches (default - <code>C:\Documents and Settings\All Users\Application Data\VMware\VMware Update Manager\Data\</code> | ** <code> <patchStore> </code> - Location of downloaded patches (default - <code>C:\Documents and Settings\All Users\Application Data\VMware\VMware Update Manager\Data\</code> | ||
** <code> <PatchDepotUrl> </code> - URL used by ESX's to access patches (default - Update Manager server) | ** <code> <PatchDepotUrl> </code> - URL used by ESX's to access patches (default - Update Manager server) | ||
* '''Severity Levels''' | |||
** Not Applicable | |||
** Low | |||
** Moderate | |||
** Important | |||
** Critical | |||
** Host General | |||
** Host Security | |||
== Establish and Apply ESX Host Profiles == | == Establish and Apply ESX Host Profiles == | ||
* ESX 4 supported only | |||
* Used to ensure consistent configuration across ESX's | * Used to ensure consistent configuration across ESX's | ||
* Create a profile from a reference ESX, then apply to Cluster or ESX | * Create a profile from a reference ESX, then apply to Cluster or ESX | ||
** Reference ESX can be changed | ** Reference ESX can be changed | ||
** Profile can be refreshed (if reference ESX config has been updated) | ** Profile can be refreshed (if reference ESX config has been updated) | ||
* ESX must be in maintenance mode for a profile to be applied (resolve compliance discrepancies) | |||
* Can be imported/exported as .vpf files | |||
= Establish Service Levels = | = Establish Service Levels = | ||
Line 346: | Line 539: | ||
** ESX - All Service Console networks | ** ESX - All Service Console networks | ||
** ESXi - All VMkernel networks (not VMotion network if alternatives available) | ** ESXi - All VMkernel networks (not VMotion network if alternatives available) | ||
* Uses highest CPU and Memory reservation to generate a VM slot, which is used for capacity calculations | |||
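The slot idea can be sketched as follows. This is a simplified Python illustration of "highest reservation defines the slot", assuming every VM has a positive CPU (MHz) and memory (MB) reservation; the real HA calculation has further rules (for example for VMs with no reservation) that are not modelled here, and all names are hypothetical:

```python
from typing import Iterable, Tuple


def slot_size(reservations: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (cpu_mhz, memory_mb) of the slot: the highest of each reservation.

    reservations holds one (cpu_mhz, memory_mb) pair per VM.
    """
    pairs = list(reservations)
    if not pairs or any(cpu <= 0 or mem <= 0 for cpu, mem in pairs):
        raise ValueError("sketch assumes at least one VM and positive reservations")
    return max(cpu for cpu, _ in pairs), max(mem for _, mem in pairs)


def slots_per_host(host_cpu_mhz: int, host_mem_mb: int,
                   slot: Tuple[int, int]) -> int:
    """Number of whole slots a host can hold, limited by the scarcer resource."""
    slot_cpu, slot_mem = slot
    return min(host_cpu_mhz // slot_cpu, host_mem_mb // slot_mem)
```

With reservations of (500MHz, 1024MB) and (1000MHz, 512MB) the slot is (1000MHz, 1024MB), so a single VM with a large reservation reduces the slot count for the whole cluster.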
'''Distributed Power Management''' | '''Distributed Power Management''' | ||
Line 361: | Line 555: | ||
== Enable a Fault Tolerant Virtual Machine == | == Enable a Fault Tolerant Virtual Machine == | ||
* vLockstep - Keeps Primary and Secondary VM's in sync | * vLockstep - Keeps Primary and Secondary VM's in sync | ||
* On-Demand Fault Tolerance - Temporary FT, configured for a VM during a critical time | * vLockstep Interval - Time required for secondary to sync with primary (normally < 0.5 sec) | ||
* Log Bandwidth - Bandwidth required to keep VM's in sync across FT network | |||
* On-Demand Fault Tolerance - Temporary manually managed FT, configured for a VM during a critical time | |||
* Recommended max of 4 FT VM's per ESX (primary or secondary) | |||
'''Prerequisites''' | '''Prerequisites''' | ||
* Cluster | * Cluster | ||
** HA and host monitoring must be enabled | ** HA and host monitoring must be enabled (if monitoring isn't enabled new Secondary VM's aren't created) | ||
** Host certificate checking must be enabled | ** Host certificate checking must be enabled | ||
* ESX's | * ESX's | ||
Line 374: | Line 571: | ||
** Host BIOS must have Hardware Virtualisation (eg Intel VT) enabled | ** Host BIOS must have Hardware Virtualisation (eg Intel VT) enabled | ||
* VM's | * VM's | ||
** VMDK files must be thick provisioned with Cluster Features enabled | ** VMDK files must be thick provisioned with Cluster Features enabled and not Physical RDM | ||
** Run supported OS (generally all, may require reboot to enable FT) | ** Run supported OS (generally all, may require reboot to enable FT) | ||
Line 393: | Line 590: | ||
# Create HA Cluster and perform Profile Compliance | # Create HA Cluster and perform Profile Compliance | ||
# Turn on FT for appropriate VM's | # Turn on FT for appropriate VM's | ||
'''Not Protected''' caused by Secondary VM not running, because... | |||
* VM's are still starting up | |||
* Secondary VM cannot start, possible causes... | |||
** No suitable host on which to start secondary | |||
** A fail-over has occurred but FT network link down, so new secondary not started | |||
* Disabled - FT has been disabled by user or VC (because no suitable secondary host can be found) | |||
* Primary VM is not on, so status is ''Not Protected, VM not Running'' | |||
== Create and Configure Resource Pools == | == Create and Configure Resource Pools == | ||
Line 402: | Line 607: | ||
* '''VMotion''' - VM is powered on. Moves VM, config and disk files are static | * '''VMotion''' - VM is powered on. Moves VM, config and disk files are static | ||
* '''Storage VMotion''' - VM is powered on. VM is static, config and disk files move | * '''Storage VMotion''' - VM is powered on. VM is static, config and disk files move | ||
'''VMotion Priority''' | |||
* '''High''' - Resources are reserved on source and destination ESX's prior to move. Move may not proceed. | |||
* '''Low''' - No reservation made, just proceeds. Migrations likely to take longer and may cause VM to become unavailable for a period of time | |||
== Backup and Restore Virtual Machines == | == Backup and Restore Virtual Machines == | ||
Line 407: | Line 616: | ||
* Can quiesce guest file system (req VMTools) to ensure consistent disk state | * Can quiesce guest file system (req VMTools) to ensure consistent disk state | ||
* Independent disks are excluded from snapshots (Persistent writes to disk, Nonpersistent writes to redo log, discarded at power off) | * Independent disks are excluded from snapshots (Persistent writes to disk, Nonpersistent writes to redo log, discarded at power off) | ||
* '''Migrating a VM with Snapshots''' | |||
** Cannot use Storage VMotion | |||
** All VM files must reside in single directory if being moved by cold storage migration | |||
** Reversion after VMotion may cause VM to fail - only occurs if discrepancies in ESX hardware | |||
'''VMware Data Recovery''' | '''VMware Data Recovery''' | ||
Line 414: | Line 627: | ||
* Max 8 VM backups can run concurrently | * Max 8 VM backups can run concurrently | ||
* Max 2 backup destinations used concurrently | * Max 2 backup destinations used concurrently | ||
* Max 100 VM's per back appliance | * Max 100 VM's per back up appliance | ||
* Backup's won't start if ESX CPU usage >90% | |||
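Two of the limits above (8 concurrent backups, no start above 90% ESX CPU) can be combined into a simple pre-check. A Python sketch of that logic only; it does not talk to Data Recovery, and the function and constant names are illustrative:

```python
MAX_CONCURRENT_BACKUPS = 8   # max VM backups running at once
MAX_CPU_USAGE_PERCENT = 90   # backups won't start above this ESX CPU usage


def can_start_backup(running_backups: int, esx_cpu_percent: float) -> bool:
    """True if another backup could start under the two limits above."""
    return (running_backups < MAX_CONCURRENT_BACKUPS
            and esx_cpu_percent <= MAX_CPU_USAGE_PERCENT)
```

The destination and per-appliance limits are left out to keep the example short.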
'''VMware Data Recovery Setup''' | '''VMware Data Recovery Setup''' | ||
Line 426: | Line 640: | ||
* '''Physical switchport failover''' - Use PortFast to ensure a VM's MAC appearing on a different switchport is handled quickly | * '''Physical switchport failover''' - Use PortFast to ensure a VM's MAC appearing on a different switchport is handled quickly | ||
* '''Port Group Reconfiguration''' - Renaming a Port Group will mean connected VM's will lose their PortGroup config | * '''Port Group Reconfiguration''' - Renaming a Port Group will mean connected VM's will lose their PortGroup config | ||
* '''Hardware Health Service''' - VI Client plugin that uses an IE object to access the info on vCentre | |||
'''Export Diagnostic Data''' | |||
To generate a diagnostic data report... | |||
* Run <code> vm-support </code> script on ESX | |||
* Run '''Administrator | Export Diagnostic''' info on VI Client | |||
== Perform Basic Troubleshooting for VMware FT and Third-Party Clusters == | == Perform Basic Troubleshooting for VMware FT and Third-Party Clusters == | ||
'''Unexpected FT Failovers''' | |||
* Partial Hardware Failure Related to Storage - Caused by one ESX experiencing problems accessing VM's storage | |||
* Partial Hardware Failure Related to Network - Caused by FT logging NIC being congested or down | |||
* Insufficient Bandwidth on the Logging NIC Network - Caused by too many FT VM's on the same ESX | |||
* VMotion Failures Due to Virtual Machine Activity Level - VM is too active for VMotion to succeed | |||
* Too Much Activity on VMFS Volume Can Lead to Virtual Machine Failovers - Too many file system locking operations (VM power on/off's etc) | |||
* Lack of File System Space Prevents Secondary VM Startup | |||
'''Other FT Errors''' | |||
* Hardware Virtualization Must Be Enabled - HV (ie VT/AMD-V) must be enabled to allow FT | |||
* Compatible Secondary Hosts Must Be Available - No spare ESX's with HV, capacity, not in Maintenance mode etc | |||
* Very Large Virtual Machines Can Prevent Use of Fault Tolerance - If memory is large (>15GB) or changing too much, VMotion will not be able to keep in sync, can increase time-out value (def 8 sec -> 30 secs) <code> ft.maxSwitchoverSeconds = "30" </code> entered in VM's VMX file | |||
* Secondary VM CPU Usage Appears Excessive - Replaying some events can be more expensive than recording on Primary, normal operation | |||
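The <code>ft.maxSwitchoverSeconds</code> entry above is an ordinary <code>key = "value"</code> line in the VMX file. A minimal Python sketch that adds or replaces such a line in VMX text; it works on a string only and the helper name is hypothetical:

```python
def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return VMX text with key set to value, replacing an existing entry."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for index, line in enumerate(lines):
        # Compare only the part before the first '=' against the key.
        if line.split("=", 1)[0].strip() == key:
            lines[index] = new_line
            break
    else:
        # Key not present yet, so append it.
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

For example, `set_vmx_option(text, "ft.maxSwitchoverSeconds", "30")` yields text containing the line `ft.maxSwitchoverSeconds = "30"`. Take a copy of the VMX file before writing any change back.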
== Perform Basic Troubleshooting for Networking == | == Perform Basic Troubleshooting for Networking == | ||
== Perform Basic Troubleshooting for Storage == | == Perform Basic Troubleshooting for Storage == | ||
Line 437: | Line 672: | ||
[[Category:VMware]] | [[Category:VMware]] | ||
[[Category:VCP]] |
Latest revision as of 21:56, 16 June 2013
Other Resources
- VMware VCP Forum
- VCP4 Blueprint
- VMware vSphere Documentation: PDF HTML (HTML version is good for searching)
- Scott Vessey
- Simon Long blog
Plan, Install and Upgrade VMware ESX/ESXi
Install VMware ESX/ESXi on local storage
Minimum Hardware Requirements
- 64bit CPU (AMD Opteron, Intel Xeon [inc Nehalem])
- CPU Virtualisation features required to support 64bit VM's
- 2GB RAM
- 1+ NIC
- SCSI, Fibre Channel or Internal RAID controller
- LUN, SAS or SATA (SATA must be connected through a SAS controller)
Notes
- ESX's hardware clock should be set to UTC
- IPv6 not supported during installation
ESXi Specifics
- All blank internal disks are formatted with VMFS (except 4GB VFAT scratch/swap partition, used for vm-support dumps)
- Direct Console is used to
  - Configure host defaults
  - Set-up administrator access
  - Troubleshoot
- If no DHCP found during install, default 169.254.0.0 / 16 address assigned
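A host that fell back to the default address can be spotted by testing for membership of 169.254.0.0/16. A minimal Python sketch using the standard ipaddress module; the function name is illustrative only:

```python
import ipaddress

# Range assigned by default when no DHCP server is found during install.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def is_default_link_local(address: str) -> bool:
    """Return True if the address lies in 169.254.0.0/16."""
    return ipaddress.ip_address(address) in LINK_LOCAL
```

Running this over a list of management addresses is a quick way to find hosts that still need a static address configured.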
Required Partitions
Mount | Type | Size | Description |
---|---|---|---|
/boot | ext3 | 1100MB | Boot disk req 1.25GB free space, includes /boot and vmkcore partitions |
(none) | swap | 600MB | ESX swap, 600MB req, 1.6GB max (use default at install) |
/ | ext3 | 5 GB | ESX OS and services, 3rd party apps |
(none) | VMFS | 1200MB | Service Console (esxconsole.vmdk), not ESXi |
(none) | vmkcore | 1.25GB | Core debugging dumps |
Optional Partitions
Mount | Type | Size | Description |
---|---|---|---|
/home | ext3 | 512MB | ESX user accounts |
/tmp | ext3 | 1024MB | Temp files! |
/usr | ext3 | | User programs and data (3rd party apps) |
/var/log | ext3 | 2000MB | Log files |
vSphere Editions
Edition | Features |
---|---|
Essentials | 6 cores/CPU, 4way vSMP, 256GB/ESX, VC Agent, Update Manager, VMsafe, vStorage API's |
Essentials Plus | Essentials + Data Recovery |
Standard | Essentials + HA |
Advanced | Standard + 12 cores/CPU, Hot Add, FT, VMotion, vShield, Data Recovery |
Enterprise | Advanced + 6 cores/CPU, Storage vMotion, Data Recovery, DRS |
Enterprise Plus | 12 cores/CPU, 8way vSMP, 1TB/ESX, vNetwork Distributed Switch, Host Profiles, 3rd Party Multipathing |
vCentre Foundation | Fully featured, but limited to managing 3 ESX's |
Upgrade VMware ESX/ESXi
Prerequisites
- /boot partition must be at least 100 MB
Pre-Upgrade Backups
- Backup ESX Host Config
  - Back up the files in the /etc/passwd, /etc/groups, /etc/shadow, and /etc/gshadow directories (shadow dir's may not exist).
  - Backup any custom scripts
  - Backup any .vmx files
  - Backup any local images etc on local VMFS
- Backup ESXi Host Config
  - Use vSphere CLI to run vicfg-cfgbackup --server <ESXi-host-ip> --portnumber <port_number> --protocol <protocol_type> --username <username> --password <password> -s <backup-filename>
- VM backup
  - Snapshot before upgrade
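When backing up several ESXi hosts it can help to build the vicfg-cfgbackup command line programmatically. A Python sketch that only assembles the argument list from the options shown above; it assumes the vSphere CLI is installed wherever the list is eventually run, and the function name is made up for illustration:

```python
from typing import List


def cfgbackup_save_args(server: str, port: int, protocol: str,
                        username: str, password: str,
                        backup_file: str) -> List[str]:
    """Build the argument list for saving an ESXi host config with -s."""
    return [
        "vicfg-cfgbackup",
        "--server", server,
        "--portnumber", str(port),
        "--protocol", protocol,
        "--username", username,
        "--password", password,
        "-s", backup_file,
    ]
```

The returned list can be handed to `subprocess.run(args, check=True)`. Passing a list rather than a single string avoids shell quoting problems with passwords.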
Upgrade Scenarios
Method | Notes |
---|---|
with Host Clusters | Use Update Manager. Upgrade VC, Update Manager, ESX, VM, licenses |
without Host Clusters | Use vSphere Host Update Utility (good for estates < 10 ESX's), runs from VC Client |
vMotion | Migrate VM's from ESX v3 to v4, then perform required VM upgrade |
Upgrade vMotion | When upgrading from ESX v2, VM's are migrated from VMFS v2 to v3 and upgraded |
Cold migration (with VC) | Move VM's through VC to v4 ESX's and power-up, then upgrade VM |
Cold migration (without VC) | Manually move VM's to v4 ESX's and power-up, then upgrade VM |
VC on new machine | Backup DB, copy across SSL folder to new machine, run install |
ESX/ESXi Upgrade
- DHCP not recommended
- Limited support for v2.5.5, all later versions fully supported
- Need to specify a local VMFS for Service Console VM (not ESXi)
Rollback
- ESX
  - Run rollback-to-esx3 command in Service Console, delete ESX v4 Service Console following restart
  - Restore backed up files
- ESXi
  - During boot, press Shift + R to boot into the Standby (ESX3) build
  - Restore backup using vicfg-cfgbackup -l
Secure VMware ESX/ESXi
- ESX firewall - primary source of protection for Service Console
- Weak ciphers are disabled, all communications are secured by SSL certificates
- Tomcat Web service has been modified to limit functionality (to avoid general Tomcat vulnerabilities)
- Insecure services (eg FTP, Telnet) are not installed, and ports blocked by the firewall
- TCP 443 - Service Console, vmware-authd
- TCP 902 - VMkernel, vmkauthd
Install VMware ESX/ESXi on SAN Storage
Boot from SAN
- HBA must be located in lowest PCI bus and slot number
- HBA BIOS must designate the FC card as a boot controller
- The FC card must initiate a primitive connection to the boot LUN
- Each ESX must have its own boot LUN
  - SAN storage paths can be masked using esxcli corestorage claimrule (PSA claim) rules to select which available LUN's are claimed
- iSCSI must use a hardware initiator (impossible to boot using software iSCSI)
FC boot from SAN set-up
- Configure/create boot LUN
- Enable boot from HBA in system's BIOS and in HBA's BIOS
- Select the LUN to boot from in HBA BIOS
iSCSI boot from SAN set-up
- Configure storage ACL so that only correct ESX has access to correct boot LUN (must be LUN 0 or LUN 255)
- Enable boot from HBA in system's BIOS and in HBA's BIOS
- Configure target to boot from in HBA's BIOS
Identify vSphere Architecture and Solutions
Platforms
- vSphere 4
- Server
- ESXi (standalone, free)
vSphere Features etc
- VMsafe - API to enable 3rd party security products to control and protect
- Memory and CPU - Introspection of VM memory pages and CPU states
- Networking - Filtering of packets inside hypervisor (vSwitches)
- Process Execution - In guest (VM), in process API's effectively allowing monitoring and control of process execution (agent-less AV)
- Storage - VM disks can be mounted etc (agent-less AV)
- vShield - Appliance utilising VMsafe to provide security and compliance
Datacentre Solutions
- View - (VDI) Desktop virtualisation
- SRM - Site Recovery Manager, automate site fail-over/recovery, DR management
- Lab Manager - VM manager for developers, allows dev's to rapidly deploy VM images for testing etc
- Stage Manager - Being consolidated into Lab Manager
Configure ESX/ESXi Networking
Configure Virtual Switches
Nothing new !!
Configure vNetwork Distributed Switches
- dvSwitch - Distributed Virtual Switch (DVS) which spans numerous ESX's
- dvPort - A dvSwitch Service Console, VMkernel, or VM Port Group port
dvSwitch Advanced Settings...
- CDP (not set/overridable on uplink ports)
dvPortGroup Settings
- Port Binding
- Static - (default) Assign port when VM connects to switch
- Dynamic - Assign port when VM is powered on
- Ephemeral - No port binding (classic switch method)
- Live port moving - ??? Seems to be a CLI feature ???
- Config reset at disconnect - Discard per-port config when a VM is disconnected
- Binding on host allowed - Allows ESX to assign dvPorts when not connected to vCentre
VLAN Options
- None - Straight-through connected switch
- VLAN - Traditional single VLAN assignment to a port group
- VLAN Trunking - Multiple VLAN's can be assigned to a dv Port Group
- Private VLAN - Allows Private VLANs
- VLANs over a VLAN, the VLAN equivalent of subnetting. Hosts on differing subVLANs may be in same IP range, but need to go via router to communicate.
- Primary (promiscuous) VLAN uplinks to rest of network
- See http://blog.internetworkexpert.com/2008/07/14/private-vlans-revisited/
Traffic Shaping
- Can be applied to both inbound and outbound traffic
- Can be set per dvPort (dvPort Group must allow overrides)
Service Console ports
Options to create a SC port...
- Add a new Service Console virtual adapter
- Migrate an existing SC adapter to a dvPort Group or dvPort
Configure VMware ESX/ESXi Management Network
Configure ESX/ESXi Storage
Configure FC SAN Storage
Storage Device Naming
- Name - A friendly name based on storage type and manufacturer. User changeable, kept consistent across ESX's
- Identifier - Globally unique, human unintelligible. Persistent through reboot and consistent across ESX's
- Runtime Name - The first path to a device, created by host and not persistent. Of format `vmhba#:C#:T#:L#`
- vmhba - Storage Adapter number
- C - Storage Channel number (software iSCSI uses this to represent multiple paths to same target)
- T - Target
- L - LUN (provided by storage system; if only 1 LUN it's always L0)
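
The runtime name layout above can be picked apart mechanically. A minimal sketch, assuming the name always has all four fields in the `vmhba#:C#:T#:L#` order; `RuntimeName` and `parse_runtime_name` are names made up for this example, not part of any VMware tool.

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: int  # vmhba number (storage adapter)
    channel: int  # C - storage channel
    target: int   # T - target
    lun: int      # L - LUN


# Pattern follows the vmhba#:C#:T#:L# layout described in the notes.
_RUNTIME_NAME = re.compile(r"^vmhba(\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a runtime name into its four numeric components."""
    match = _RUNTIME_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a runtime name: {name!r}")
    return RuntimeName(*(int(part) for part in match.groups()))


# Example value only, not taken from a real host.
print(parse_runtime_name("vmhba1:C0:T3:L0"))
```

As the runtime name is not persistent, anything parsed this way should not be stored as a long-term device reference; the Identifier is the stable one.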
PSA - Pluggable Storage Architecture
- Manages storage multipathing
- Allows simultaneous operation of multiple multipathing plugins (MPPs)
- Native Multipathing Plugin (NMP) provided by default, can have sub-plugins (can be either VMware or 3rd party)
- Storage Array Type Plugin (SATP) - unique to a particular array (effectively an array driver, like a standard PC hardware driver)
- Path Selection Plugin (PSP) - default assigned by NMP based on the SATP
- Multipathing Plugin (MPP) - 3rd party, can run in addition to, or as a replacement for, the Native Multipathing Plugin
PSA operations
- Loads and unloads multipathing plugins
- Hides VM specifics from a particular plugin
- Routes I/O requests for a specific logical device to the MPP managing that device
- Handles I/O queuing to the logical devices
- Implements logical device bandwidth sharing between VM's
- Handles I/O queueing to the physical storage HBA's
- Handles physical path discovery and removal
- Provides logical device and physical path I/O stats
MPP / NMP operations
- Manage physical path (un)claiming
- Manage creation, and (de)registration of logical devices
- Associate physical paths with logical devices
- Process I/O requests to logical devices
- Select an optimal physical path for the request
- Depending on storage device, perform specific actions necessary to handle path failures and I/O cmd retries
- Support management tasks, EG abort or reset of logical devices
PSP types
Default (VMware) PSP Types (3rd party PSP's can be installed)...
- Most Recently Used - Good for either Active/Active or Active/Passive
- Fixed - Can cause path thrashing when used with Active/Passive
- Round Robin - Load balanced
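
A toy model of how the three default PSP types differ when picking a path, for contrast only. The failback behaviour shown (Fixed returns to the preferred path, Most Recently Used stays where it is) is the commonly described behaviour; the function name, arguments and path names are made up for this sketch, and it is not how the NMP is actually implemented.

```python
def select_path(policy, paths, alive, state):
    """Pick a path name for the next I/O.

    policy: "mru", "fixed" or "rr"
    paths:  ordered list of path names; paths[0] is treated as preferred
    alive:  set of path names currently usable
    state:  dict carrying the last-used path and the round robin counter
    """
    usable = [p for p in paths if p in alive]
    if not usable:
        raise RuntimeError("all paths dead")
    if policy == "fixed":
        # Always go back to the preferred path when it is usable.
        choice = paths[0] if paths[0] in alive else usable[0]
    elif policy == "mru":
        # Stay on the last-used path; only move when it has failed.
        last = state.get("last")
        choice = last if last in alive else usable[0]
    elif policy == "rr":
        # Rotate across every usable path.
        n = state.get("rr", 0)
        choice = usable[n % len(usable)]
        state["rr"] = n + 1
    else:
        raise ValueError(f"unknown policy: {policy}")
    state["last"] = choice
    return choice
```

With Fixed on an Active/Passive array, two hosts preferring different controllers would keep pulling the LUN back and forth, which is the path thrashing noted above.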
PSA Claim Rules
- Used to define which paths should be used by a particular plugin module
LUN Masking
- Used to prevent an ESX from seeing LUN's or using individual paths to a LUN
- Add and load a claim rule to apply
Configure iSCSI SAN Storage
Most of the FC SAN Storage info above is also applicable here
CHAP Authentication
- One-way CHAP - Unidirectional, iSCSI target authenticates the initiator (ESX) only
- Mutual CHAP - Bidirectional, ESX also authenticates the iSCSI target (Software iSCSI only)
Multipathing (software iSCSI)
- Set-up a vSwitch with two VMkernel ports and two uplinks
- For each VMkernel port, edit NIC Teaming | Override vSwitch failover order to bind one uplink each
- Connect the iSCSI initiator to each VMkernel port using `esxcli swiscsi nic add -n <vmk_port_name> -d <vmhba_no>`
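
The binding step has to be repeated once per VMkernel port. A small sketch that only builds the command lines from the syntax above (it does not run them); `swiscsi_bind_commands` is a made-up helper, and `vmk1`, `vmk2` and `vmhba33` are example values that will differ per host.

```python
import shlex


def swiscsi_bind_commands(vmk_ports, vmhba):
    """Return one 'esxcli swiscsi nic add' command line per VMkernel port."""
    return [
        f"esxcli swiscsi nic add -n {shlex.quote(port)} -d {shlex.quote(vmhba)}"
        for port in vmk_ports
    ]


# Example port and adapter names only.
for line in swiscsi_bind_commands(["vmk1", "vmk2"], "vmhba33"):
    print(line)
```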
Configure NFS Datastores
- ESX supports NFS v3 on TCP only
- ESX's manage exclusive access to files via `.lck-XXX` lock files
- To use jumbo frames, enable on the vSwitch and the VMkernel port(s)
- Frames up to 9kB are supported
Configure and Manage VMFS Datastores
- VMFS Datastore capacity can be increased on the fly whilst VM's are running (from that datastore)
Install and Configure vCenter Server
Install vCenter Server
Minimum Requirements
- 2x CPU's (2GHz)
- 3GB RAM
- 2GB disk
- Microsoft SQL2005 Express
Scale | VC | CPU | Memory |
---|---|---|---|
50 ESXs, 250 VMs | 32 bit | 2 | 4 GB |
200 ESXs, 2000 VMs | 64 bit | 4 | 4 GB |
300 ESXs, 3000 VMs | 64 bit | 4 | 8 GB |
- Database DSN must be 32bit, regardless of VC's OS (default DSN on a 64bit OS is 64bit)
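
The sizing table above can be read as a simple lookup. A sketch assuming each row is an "up to" limit on both hosts and VMs (the notes don't say this explicitly); `vc_sizing` is a made-up name.

```python
# (max ESXs, max VMs, VC OS, CPUs, memory in GB), copied from the sizing table.
SIZING = [
    (50, 250, "32 bit", 2, 4),
    (200, 2000, "64 bit", 4, 4),
    (300, 3000, "64 bit", 4, 8),
]


def vc_sizing(hosts, vms):
    """Return (VC OS, CPUs, memory GB) for the first row that fits both counts."""
    for max_hosts, max_vms, bits, cpus, mem_gb in SIZING:
        if hosts <= max_hosts and vms <= max_vms:
            return bits, cpus, mem_gb
    raise ValueError("beyond the largest row in the table")


print(vc_sizing(40, 200))
```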
Manage vSphere Client plug-ins
Plug-In | Description |
---|---|
Storage Monitoring | [Default] |
Service Status | [Default] Displays health of services on the VC |
Hardware Status | [Default] Displays ESX hardware health (CIM monitoring) |
Update Manager | |
Converter Enterprise | |
vShield Zones | App aware firewall, inspects client-server and inter-VM traffic to provide traffic analysis and app-aware firewall partitioning |
Orchestrator | Workflow engine to manage automated tasks/workflows |
Data Recovery | Backup and recovery. Centralised management of backup tasks (inc data de-duplication). |
Configure vCenter Server
Guest Customisation Requirements
- Source machine must have
- VMTools installed (latest version)
- Similar OS to intended new machine
- SCSI disks
- (Win) Guest OS cannot be a domain controller
- (Win) Sysprep must be installed on VC
- (Linux) Guest OS must have Perl installed
vCenter Maps
- Provide an overview of relationships for
- Host Resources
- VM Resources
- Datastore Resources
Configure Access Control
Role | Type | ESX / VC | Description |
---|---|---|---|
No Access | System | ESX & VC | No view or do. Can be used to stop permissions propagating. |
Read Only | System | ESX & VC | View all except Console, no do. |
Administrator | System | ESX & VC | Full rights |
VM User | Sample | VC only | VM start/stop, console, insert media (CD) |
VM Power User | Sample | VC only | As user plus hardware and snapshot operations |
Resource Pool Admin | Sample | VC Only | Akin to an OU admin, full rights for child objects. Cannot create new VM's without additional VM and datastore privileges. |
VCB User | Sample | VC Only | Expected to be used by VCB, do not modify! |
Datastore Consumer | Sample | VC Only | Allows creation of VMDK's or snapshots in datastore (additional VM privileges to action) |
Network Consumer | Sample | VC Only | Allows assignment of VM's to networks (additional VM privileges to action) |
Deploy and Manage Virtual Machines and vApps
Create and Deploy Virtual Machines
- VM Hardware v4 runs on ESX3 or ESX4, v7 runs on ESX4 only
- VM's running MS Windows should have SCSI TimeOutValue changed to 60 secs to allow Windows to tolerate delayed SAN I/O from path failovers
Disk Types
- Thick - traditional (can convert to Thin via Storage vMotion)
- Thin - minimal space usage (conversion to Thick requires VM downtime)
Can't specify for NFS stores (controlled by the NFS server itself)
Memory
- Minimum of 4MB, increments of 4MB
- Maximum for best performance - threshold over which a VM's performance will be degraded if memory size exceeded (varies dependent on load on ESX)
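
The 4MB minimum / 4MB increment rule is easy to check before setting a VM's memory. A minimal sketch; both function names are made up, and no upper limit is checked because the notes don't give one.

```python
def valid_vm_memory_mb(size_mb):
    """True if the size meets the 4MB minimum and is a multiple of 4MB."""
    return size_mb >= 4 and size_mb % 4 == 0


def round_up_vm_memory_mb(size_mb):
    """Round a requested size up to the next valid 4MB step (minimum 4MB)."""
    return max(4, -(-size_mb // 4) * 4)
```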
SCSI Controller Types
- BusLogic Parallel
- LSI Logic SAS
- LSI Logic Parallel
- VMware Paravirtual
- High performance to provide better throughput with lower ESX CPU usage
- Only VM h/ware v7 with Win2k3, Win2k8 or Red Hat Ent v5
- Not supported with
- Record/replay
- Fault Tolerance
- MSCS Clustering (so also SQL clusters)
- [Boot disks - not an issue since ESX4.0 Update 1]
N-port ID virtualization (NPIV)
- Provides VM's with RDM's unconstrained to an ESX (ie allows VMotion when using RDM's)
- Must be enabled on SAN switch
- ESX's HBA's must support NPIV
- NPIV enabled VM's are assigned 4 NPIV WWN's
- Storage vMotion is not supported
vNICs
- Flexible - Becomes VMXNET when on 32bit OS with VMTools installed (VMware optimised), otherwise vLANCE (old AMD LANCE 10Mbps NIC driver)
- e1000 - Default for 64bit OS's, emulates an Intel E1000 card
- VMXNET2 - Aka enhanced VMXNET, supports jumbo frames and TSO, limited OS support
- VMXNET3 - Performance driver, only supported on VM hardware v7, and limited OS's
VMDirectpath
Allows direct access to PCI devices (aka passthrough devices); using it inhibits
- VMotion
- Hot add
- Suspend and resume, Record and replay
- Fault Tolerance
- HA
An orange icon when trying to add a passthrough device indicates that the device has changed and the ESX must be bounced before it can be used.
VMI Paravirtualisation
Enables improved performance for supported VM (Linux only currently), by allowing VM to communicate with hypervisor
- Uses 1 of VM's 6 vPCI slots
- Must be supported by ESX (VM can be cold migrated to unsupported ESX, with perf hit)
vCenter Converter
Features/functionality...
- P2V
- Convert/import other format VM's (eg VMware Workstation, MS Virtual Server)
- Convert 3rd party backup or disk images
- Restore VCB backup images
- Export VM's to other VMware VM formats
- Make VM's bootable
- Customise existing VM's
Requires the following ports
- Windows: TCP 139, 443, 445, 902
- Linux: TCP 22, 443, 902, 903
Guided Consolidation
- Active Domains - Systems being analysed need to be a member of an active domain
- Add to Analysis to analyse new systems, max 100 concurrent, can take 1hr for new analysis to start
- Confidence - Degree to which VC was able to collect perf data, and how good a candidate the system is
- High confidence is shown after 24 hrs, if workload varies over greater interval, further analysis is required
- New VM's disk size = Amount used on physical x 1.25
- Convert manually to be able to specify new VM's settings
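
The disk sizing rule above as a one-line calculation. A sketch only; `consolidated_disk_gb` is a made-up name, and the unit is whatever the physical usage was measured in.

```python
def consolidated_disk_gb(used_gb):
    """New VM disk size = amount used on the physical machine x 1.25."""
    return used_gb * 1.25


# A physical server using 40 GB would get a 50 GB virtual disk.
print(consolidated_disk_gb(40))
```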
Manage Virtual Machines
VM hardware can be modified in-flight as long as
- The guest OS supports hot plug (eg Win2008)
- VM hardware version is v7
- vCPU's can only be added if "CPU Hot Plug" is enabled in the VM's options
Virtualized Memory Management Unit (MMU)
- Maintains mapping between VM's guest OS physical memory and the underlying host's machine memory
- Intercepts VM instructions that would manipulate memory, so that CPU's MMU is not updated directly.
Deploy vApps
vApp - An enhanced resource pool to run a contained group of VM's, can be created under the following conditions
- A host is selected in the inventory that is running ESX3 or later
- A DRS-enabled cluster is selected in the inventory
- Name up to 80 chars
Deploying an OVF template
- Non-OVF format appliances can be converted using the VMware vCentre Converter module
- During deployment IP allocation can be (if OVF template states this is configurable)
- Fixed
- Transient - vCentre manages a pool of available IP's
- DHCP
Manage Compliance
Install, Configure and Manage VMware vCenter Update Manager
- Update Manager can be installed on VC, recommended separate for large environments
- Requires its own db instance (can be on same server as VC database, recommended separate)
- Requires sysadmin or db_owner role
- VMware vCenter Update Manager Guest Agent is installed to Win or Linux guests on 1st patch scan or remediation run.
- Smart Rebooting - Update manager attempts to adhere to the startup dependencies stated in a vApp config
- Edit `vci-integrity.xml` to change
- `<patchStore>` - Location of downloaded patches (default - `C:\Documents and Settings\All Users\Application Data\VMware\VMware Update Manager\Data\`)
- `<PatchDepotUrl>` - URL used by ESX's to access patches (default - Update Manager server)
- Severity Levels
- Not Applicable
- Low
- Moderate
- Important
- Critical
- Host General
- Host Security
Establish and Apply ESX Host Profiles
- ESX 4 supported only
- Used to ensure consistent configuration across ESX's
- Create a profile from a reference ESX, then apply to Cluster or ESX
- Reference ESX can be changed
- Profile can be refreshed (if reference ESX config has been updated)
- ESX must be in maintenance mode for a profile to be applied (resolve compliance discrepancies)
- Can be imported/exported as .vpf files
Establish Service Levels
Create and Configure VMware Clusters
VM Monitoring
- HA monitors VM's to detect if they've hung / stopped responding, and resets a VM if both
- VM Tools heartbeat lost in interval
- No VM I/O in interval (default 120 secs, reconfig at cluster level using `das.iostatsInterval`)
- Default 60 secs no h/beat, max 3 resets in 24 hrs (High sensitivity 30 secs and 1hr, Low 120 secs and 7 days)
- VM Monitoring should be suspended during network changes
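
The reset decision above combines two conditions plus a reset cap. A minimal sketch using the figures from the notes; the preset label "medium" for the default, applying the cap of 3 resets to every preset, and all names here are assumptions for illustration.

```python
from dataclasses import dataclass

HOUR = 3600
DAY = 24 * HOUR


@dataclass(frozen=True)
class Sensitivity:
    failure_interval: int  # seconds without a VM Tools heartbeat
    reset_window: int      # seconds over which resets are counted
    max_resets: int = 3    # assumed to be the same for every preset


PRESETS = {
    "high": Sensitivity(30, 1 * HOUR),
    "medium": Sensitivity(60, 1 * DAY),  # the default in the notes
    "low": Sensitivity(120, 7 * DAY),
}

IO_STATS_INTERVAL = 120  # das.iostatsInterval default, in seconds


def should_reset(preset, secs_since_heartbeat, secs_since_io, reset_times, now):
    """True only if heartbeat AND I/O are both silent and the reset cap is not hit."""
    s = PRESETS[preset]
    recent = [t for t in reset_times if now - t < s.reset_window]
    return (
        secs_since_heartbeat >= s.failure_interval
        and secs_since_io >= IO_STATS_INTERVAL
        and len(recent) < s.max_resets
    )
```

The I/O check is what stops a VM being reset when only the VM Tools heartbeat is lost but the guest is still doing work.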
High Availability
- Uses the following networks for HA communication
- ESX - All Service Console networks
- ESXi - All VMkernel networks (not VMotion network if alternatives available)
- Uses highest CPU and Memory reservation to generate a VM slot, which is used for capacity calculations
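
A worked sketch of the slot idea above: the slot is the largest CPU and memory reservation, and a host's capacity is taken here as how many whole slots fit. The per-host division is an assumption about how the capacity calculation uses the slot, HA's own minimums for VM's without reservations are not modelled, and the function names are made up.

```python
def slot_size(reservations):
    """reservations: (cpu_mhz, mem_mb) pairs, one per VM.

    Returns the largest CPU and largest memory reservation as the slot.
    """
    pairs = list(reservations)
    if not pairs:
        raise ValueError("no VMs to size a slot from")
    cpu = max(c for c, _ in pairs)
    mem = max(m for _, m in pairs)
    if cpu <= 0 or mem <= 0:
        raise ValueError("sketch needs a positive reservation for each resource")
    return cpu, mem


def slots_per_host(host_cpu_mhz, host_mem_mb, slot):
    """Whole slots that fit on a host, limited by the scarcer resource."""
    cpu_slot, mem_slot = slot
    return min(host_cpu_mhz // cpu_slot, host_mem_mb // mem_slot)


# Example figures only.
slot = slot_size([(500, 1024), (1000, 512)])
print(slot, slots_per_host(8000, 16384, slot))
```

One VM with a large reservation inflates the slot for the whole cluster, which is why a single outsized reservation can sharply cut the number of VM's HA thinks it can fail over.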
Distributed Power Management
- Uses current load and VM resource reservation to calculate required number of powered-up ESXs
- ESX power-on achieved by WOL, IPMI or iLO
- IPMI or iLO: Must specify IP, MAC etc for each ESX
- WOL: VMotion NIC must support WOL, and VMotion switchport must be set to Auto (as WOL often not supported by NIC at 1Gb)
- Must test ESX in and out of Standby Mode before enabling DPM
Enhanced VMotion Compatibility
- Hides additional CPU features in a cluster (ie features one ESX in a cluster has but another doesn't)
- Requires no VM's to be running on the cluster (as the CPU type will effectively be changed)
- Generally works for similar manufacture make & model CPU's with different stepping levels
Enable a Fault Tolerant Virtual Machine
- vLockstep - Keeps Primary and Secondary VM's in sync
- vLockstep Interval - Time required for secondary to sync with primary (normally < 0.5 sec)
- Log Bandwidth - Bandwidth required to keep VM's in sync across FT network
- On-Demand Fault Tolerance - Temporary manually managed FT, configured for a VM during a critical time
- Recommended max of 4 FT VM's per ESX (primary or secondary)
Prerequisites
- Cluster
- HA and host monitoring must be enabled (if monitoring isn't enabled new Secondary VM's aren't created)
- Host certificate checking must be enabled
- ESX's
- Separate VMotion and FT Logging NIC(s) configured (should be different subnets for each)
- Same ESX software version and patch level (FT must be temporarily disabled during ESX software upgrades)
- FT-compatible processor
- Host certified by OEM as FT-capable
- Host BIOS must have Hardware Virtualisation (eg Intel VT) enabled
- VM's
- VMDK files must be thick provisioned with Cluster Features enabled and not Physical RDM
- Run supported OS (generally all, may require reboot to enable FT)
Unsupported
- Snapshots (must be removed/committed before FT enabled)
- Storage VMotion
- DRS
- SMP - Only single vCPU supported
- Physical RDM
- CD-ROM or Floppy media/ISO not on shared storage
- Paravirtualised guests
- NPIV
- NIC Passthrough
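
The VM-level items in the unsupported list can be turned into a pre-flight check before turning on FT. A sketch only: the dictionary keys are hypothetical field names invented for this example (not a vSphere API), and Storage VMotion and DRS are left out as they are operations rather than VM attributes.

```python
# Each entry: (test on a VM description dict, reason reported if it fails).
# Keys such as "snapshots" or "npiv" are made-up names for this sketch.
CHECKS = [
    (lambda vm: vm.get("snapshots", 0) > 0, "snapshots present"),
    (lambda vm: vm.get("vcpus", 1) > 1, "more than one vCPU"),
    (lambda vm: vm.get("physical_rdm", False), "physical RDM attached"),
    (lambda vm: vm.get("local_media", False), "CD-ROM/floppy media not on shared storage"),
    (lambda vm: vm.get("paravirtualised", False), "paravirtualised guest"),
    (lambda vm: vm.get("npiv", False), "NPIV enabled"),
    (lambda vm: vm.get("nic_passthrough", False), "NIC passthrough in use"),
]


def ft_blockers(vm):
    """Return the reasons, in list order, why FT could not be enabled for this VM."""
    return [reason for test, reason in CHECKS if test(vm)]


print(ft_blockers({"vcpus": 2, "snapshots": 1}))
```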
Setup
- Enable host certificate checking
- Configure VMkernel networking
- Create HA Cluster and perform Profile Compliance
- Turn on FT for appropriate VM's
Not Protected caused by Secondary VM not running, because...
- VM's are still starting up
- Secondary VM cannot start, possible causes...
- No suitable host on which to start secondary
- A fail-over has occurred but FT network link down, so new secondary not started
- Disabled - FT has been disabled by user or VC (because no suitable secondary host can be found)
- Primary VM is not on, so status is Not Protected, VM not Running
Create and Configure Resource Pools
Nothing new!
Migrate Virtual Machines
- Cold Migration - VM is powered off, can be migrated to another datacentre
- Suspended VM Migration - Config and disk files can be relocated, can be migrated to another datacentre
- VMotion - VM is powered on. Moves VM, config and disk files are static
- Storage VMotion - VM is powered on. VM is static, config and disk files move
VMotion Priority
- High - Resources are reserved on source and destination ESX's prior to move. Move may not proceed.
- Low - No reservation made, just proceeds. Migrations likely to take longer and may cause VM to become unavailable for a period of time
Backup and Restore Virtual Machines
Snapshots
- Can quiesce guest file system (req VMTools) to ensure consistent disk state
- Independent disks are excluded from snapshots (Persistent writes to disk, Nonpersistent writes to redo log, discarded at power off)
- Migrating a VM with Snapshots
- Cannot use Storage VMotion
- All VM files must reside in single directory if being moved by cold storage migration
- Reversion after VMotion may cause VM to fail - only occurs if discrepancies in ESX hardware
VMware Data Recovery
- Built on VMware vStorage API for Data Protection
- Can store backup on any ESX supported virtual disk, or SAN, NAS, or CIFS storage
- All stored in deduplicated store
- Max 8 VM backups can run concurrently
- Max 2 backup destinations used concurrently
- Max 100 VM's per back up appliance
- Backup's won't start if ESX CPU usage >90%
VMware Data Recovery Setup
- Install VI Client plugin (needs to be able to communicate with backup appliances on TCP 22024)
- Install/import VMware Data Recovery OVF/appliance
- Add VMDK to appliance (to be used as backup destination, network stores can be used, but VMDK's are faster)
Perform Basic Troubleshooting and Alarm Management
Perform Basic Troubleshooting for ESX/ESXi Hosts
- Service Console Networking - Use `esxcfg-vswif`, `esxcfg-vswitch`, `esxcfg-nics`
- Physical switchport failover - Use PortFast to ensure a VM's MAC appearing on a different switchport is handled quickly
- Port Group Reconfiguration - Renaming a Port Group will mean connected VM's will lose their PortGroup config
- Hardware Health Service - VI Client plugin that uses an IE object to access the info on vCentre
Export Diagnostic Data
To generate a diagnostic data report...
- Run `vm-support` script on ESX
- Run Administrator | Export Diagnostic info on VI Client
Perform Basic Troubleshooting for VMware FT and Third-Party Clusters
Unexpected FT Failovers
- Partial Hardware Failure Related to Storage - Caused by one ESX experiencing problems accessing VM's storage
- Partial Hardware Failure Related to Network - Caused by FT logging NIC being congested or down
- Insufficient Bandwidth on the Logging NIC Network - Caused by too many FT VM's on the same ESX
- VMotion Failures Due to Virtual Machine Activity Level - VM is too active for VMotion to succeed
- Too Much Activity on VMFS Volume Can Lead to Virtual Machine Failovers - Too many file system locking operations (VM power on/off's etc)
- Lack of File System Space Prevents Secondary VM Startup
Other FT Errors
- Hardware Virtualization Must Be Enabled - HV (ie VT/AMD-V) must be enabled to allow FT
- Compatible Secondary Hosts Must Be Available - No spare ESX's with HV, capacity, not in Maintenance mode etc
- Very Large Virtual Machines Can Prevent Use of Fault Tolerance - If memory is large (>15GB) or changing too much, VMotion will not be able to keep in sync, can increase time-out value (def 8 sec -> 30 secs) by entering `ft.maxSwitchoverSeconds = "30"` in VM's VMX file
- Secondary VM CPU Usage Appears Excessive - Replaying some events can be more expensive than recording on Primary, normal operation