VCP4
Other Resources
- VMware VCP Forum
- VCP4 Blueprint
- VMware vSphere Documentation: PDF HTML
- Simon Long blog
Plan, Install and Upgrade VMware ESX/ESXi
Install VMware ESX/ESXi on local storage
Minimum Hardware Requirements
- 64bit CPU (AMD Opteron, Intel Xeon [inc Nehalem])
- CPU Virtualisation features required to support 64bit VM's
- 2GB RAM
- 1+ NIC
- SCSI, Fibre Channel or Internal RAID controller
- LUN, SAS or SATA (SATA must be connected through a SAS controller)
Notes
- ESX's hardware clock should be set to UTC
- IPv6 not supported during installation
ESXi Specifics
- All blank internal disks are formatted with VMFS (except 4GB VFAT scratch/swap partition, used for vm-support dumps)
- Direct Console is used to
- Configure host defaults
- Set-up administrator access
- Troubleshoot
- If no DHCP found during install, default 169.254.0.0 / 16 address assigned
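A quick way to spot a host that fell back to the default range is to test its address against 169.254.0.0/16. The following is a minimal sketch using only the Python standard library; the function name and sample addresses are illustrative, not part of any VMware tooling.

```python
import ipaddress

# The fallback range named in the notes above.
FALLBACK_NET = ipaddress.ip_network("169.254.0.0/16")


def is_fallback_address(addr: str) -> bool:
    """Return True if addr lies inside 169.254.0.0/16."""
    return ipaddress.ip_address(addr) in FALLBACK_NET


if __name__ == "__main__":
    # Sample addresses are made up for illustration.
    for candidate in ("169.254.10.20", "192.168.1.50"):
        print(candidate, is_fallback_address(candidate))
```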
Required Partitions
Mount | Type | Size | Description |
---|---|---|---|
/boot | ext3 | 1100MB | Boot disk req 1.25GB free space, includes /boot and vmkcore partitions |
| | swap | 600MB | ESX swap, 600MB req, 1.6GB max (use default at install) |
/ | ext3 | 5 GB | ESX OS and services, 3rd party apps |
| | VMFS | 1200MB | Service Console (esxconsole.vmdk), not ESXi |
| | vmkcore | 1.25GB | Core debugging dumps |
vSphere Editions
Edition | Features |
---|---|
Essentials | 6 cores/CPU, 4way vSMP, 256GB/ESX, VC Agent, Update Manager, VMsafe, vStorage API's |
Essentials Plus | Essentials + Data Recovery |
Standard | Essentials + HA |
Advanced | Standard + 12 cores/CPU, Hot Add, FT, vShield, VMotion, Data Recovery |
Enterprise | Advanced + 6 cores/CPU, Storage vMotion, Data Recovery, DRS |
Enterprise Plus | 12 cores/CPU, 8way vSMP, maxGB/ESX, vNetwork Distributed Switch, Host Profiles, 3rd Party Multipathing |
vCenter Foundation | Fully featured, but limited to managing 3 ESX's |
Upgrade VMware ESX/ESXi
Pre-Upgrade Backups
- Backup ESX Host Config
- Back up the /etc/passwd, /etc/group, /etc/shadow, and /etc/gshadow files (the shadow files may not exist)
- Backup any custom scripts
- Backup any .vmx files
- Backup any local images etc on local VMFS
- Backup ESXi Host Config
- Use vSphere CLI to run
vicfg-cfgbackup --server <ESXi-host-ip> --portnumber <port_number> --protocol <protocol_type> --username <username> --password <password> -s <backup-filename>
- VM backup
- Snapshot before upgrade
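If several ESXi hosts need backing up, the vicfg-cfgbackup invocation above can be assembled in a script. This is a sketch only: it builds the argument list using the options shown in the notes and does not run anything. The host address, credentials and file name are hypothetical.

```python
import shlex


def build_cfgbackup_cmd(server, port, protocol, username, password, backup_file):
    """Assemble the vicfg-cfgbackup save (-s) command as an argument list."""
    return [
        "vicfg-cfgbackup",
        "--server", server,
        "--portnumber", str(port),
        "--protocol", protocol,
        "--username", username,
        "--password", password,
        "-s", backup_file,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_cfgbackup_cmd("192.0.2.10", 443, "https", "root", "secret", "esxi01.cfg")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list rather than one string avoids quoting problems if it is later passed to subprocess.run.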
Upgrade Scenarios
Method | Notes |
---|---|
with Host Clusters | Use Update Manager. Upgrade VC, Update Manager, ESX, VM, licenses |
without Host Clusters | Use vSphere Host Update Utility (good for estates < 10 ESX's), runs from VC Client |
vMotion | Migrate VM's from ESX v3 to v4, then perform required VM upgrade |
Upgrade vMotion | When upgrading from ESX v2, VM's are migrated from VMFS v2 to v3 and upgraded |
Cold migration (with VC) | Move VM's through VC to v4 ESX's and power-up, then upgrade VM |
Cold migration (without VC) | Manually move VM's to v4 ESX's and power-up, then upgrade VM |
VC on new machine | Backup DB, copy across SSL folder to new machine, run install |
ESX/ESXi Upgrade
- DHCP not recommended
- Limited support for v2.5.5, all later versions fully supported
- Need to specify a local VMFS for Service Console VM (not ESXi)
Rollback
- ESX
- Run rollback-to-esx3 command in Service Console, delete ESX v4 Service Console following restart
- Restore backed up files
- ESXi
- During boot, press Shift + R to boot into the Standby (ESX3) build
- Restore backup using vicfg-cfgbackup -l
Secure VMware ESX/ESXi
- ESX firewall - primary source of protection for Service Console
- Weak ciphers are disabled, all communications are secured by SSL certificates
- Tomcat Web service has been modified to run only limited functionality (to avoid general Tomcat vulnerabilities)
- Insecure services (eg FTP, Telnet) are not installed, and ports blocked by the firewall
- TCP 443 - Service Console, vmware-authd
- TCP 902 - VMkernel, vmkauthd
Install VMware ESX/ESXi on SAN Storage
To boot from SAN...
- HBA BIOS must designate the FC card as a boot controller
- The FC card must initiate a primitive connection to the boot LUN
- Each ESX must have its own boot LUN
- SAN storage paths can be masked using esxcli corestorage claimrule (PSA claim) rules to select which available LUN's are claimed
- iSCSI must use a hardware initiator (impossible to boot using software iSCSI)
To setup FC boot from SAN...
- Configure/create boot LUN
- Enable boot from HBA in system's BIOS and in HBA's BIOS
- Select the LUN to boot from in HBA BIOS
To setup iSCSI boot from SAN...
- Configure storage ACL so that only correct ESX has access to correct boot LUN (must be LUN 0 or LUN 255)
- Enable boot from HBA in system's BIOS and in HBA's BIOS
- Configure target to boot from in HBA's BIOS
Identify vSphere Architecture and Solutions
Platforms
- vSphere 4
- Server
- ESXi (standalone, free)
Datacentre Solutions
- View - (VDI) Desktop virtualisation
- SRM - Site Recovery Manager, automate site fail-over/recovery, DR management
- Lab Manager - VM manager for developers, allows dev's to rapidly deploy VM images for testing etc
- Stage Manager - Being consolidated into Lab Manager
Configure ESX/ESXi Networking
Configure Virtual Switches
Nothing new !!
Configure vNetwork Distributed Switches
- dvSwitch - Distributed Virtual Switch (DVS) which spans numerous ESX's
- dvPort - A dvSwitch Service Console, VMkernel, or VM Port Group port
dvSwitch Advanced Settings...
- CDP (not set/overridable on uplink ports)
dvPortGroup Settings
- Port Binding
- Static - (default) Assign port when VM connects to switch
- Dynamic - Assign port when VM is powered on
- Ephemeral - No port binding (classic switch method)
- Live port moving - ??? Seems to be a CLI feature ???
- Config reset at disconnect - Discard per-port config when a VM is disconnected
- Binding on host allowed - Allows ESX to assign dvPorts when not connected to vCenter
VLAN Options
- None - Straight-through connected switch
- VLAN - Traditional single VLAN assignment to a port group
- VLAN Trunking - Multiple VLAN's can be assigned to a dv Port Group
- Private VLAN - Allows Private VLANs (see http://en.wikipedia.org/wiki/Private_VLAN)
Service Console ports
Options to create a SC port...
- Add a new Service Console virtual adapter
- Migrate an existing SC adapter to a dvPort Group or dvPort
Configure VMware ESX/ESXi Management Network
Configure ESX/ESXi Storage
Configure FC SAN Storage
Storage Device Naming
- Name - A friendly name based on storage type and manufacturer. User changeable, kept consistent across ESX's
- Identifier - Globally unique, human unintelligible. Persistent through reboot and consistent across ESX's
- Runtime Name - The name of the first path to a device, created by the host and not persistent, in the format vmhba#:C#:T#:L#
- vmhba - Storage Adapter number
- C - Storage Channel number (software iSCSI uses this to represent multiple paths to same target)
- T - Target
- L - LUN (provided by storage system; if only 1 LUN it's always L0)
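The runtime name format can be split into its four fields with a regular expression. A minimal sketch, assuming the fields are plain decimal numbers; the class and function names are illustrative.

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: int   # vmhba number
    channel: int   # C
    target: int    # T
    lun: int       # L


_RUNTIME_NAME = re.compile(r"^vmhba(\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a vmhba#:C#:T#:L# string into its numeric parts."""
    match = _RUNTIME_NAME.match(name)
    if match is None:
        raise ValueError(f"not a runtime name: {name!r}")
    return RuntimeName(*(int(group) for group in match.groups()))


if __name__ == "__main__":
    print(parse_runtime_name("vmhba1:C0:T3:L12"))
```

Because the runtime name is not persistent, a parsed value is only useful for the current boot; the Identifier is the field to store.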
PSA - Pluggable Storage Architecture
- Manages storage multipathing
- Allows simultaneous operation of multiple multipathing plugins (MPPs)
- Native Multipathing Plugin (NMP) provided by default, can have sub-plugins (can be either VMware or 3rd party)
- Storage Array Type Plugin (SATP) - unique to a particular array (effectively an array driver, like a standard PC hardware driver)
- Path Selection Plugin (PSP) - default assigned by NMP based on the SATP
- Multipathing Plugin (MPP) - 3rd party, can run alongside or in addition to Native Multipathing Plugin
PSA operations
- Loads and unloads multipathing plugins
- Hides VM specifics from a particular plugin
- Routes I/O requests for a specific logical device to the MPP managing that device
- Handles I/O queuing to the logical devices
- Implements logical device bandwidth sharing between VM's
- Handles I/O queueing to the physical storage HBA's
- Handles physical path discovery and removal
- Provides logical device and physical path I/O stats
MPP / NMP operations
- Manage physical path (un)claiming
- Manage creation, and (de)registration of logical devices
- Associate physical paths with logical devices
- Process I/O requests to logical devices
- Select an optimal physical path for the request
- Depending on storage device, perform specific actions necessary to handle path failures and I/O cmd retries
- Support management tasks, EG abort or reset of logical devices
PSP types
Default (VMware) PSP Types (3rd party PSP's can be installed)...
- Most Recently Used - Good for either Active/Active or Active/Passive
- Fixed - Can cause path thrashing when used with Active/Passive
- Round Robin - Load balanced
PSA Claim Rules
Used to define which paths should be used by a particular plugin module
LUN Masking
Used to prevent an ESX from seeing LUN's or using individual paths to a LUN
Add and load a claim rule to apply
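The add and load steps can be sketched as a pair of esxcli corestorage claimrule commands. The option letters below (-r rule number, -t location, -A adapter, -C channel, -T target, -L LUN, -P MASK_PATH) are as recalled from the vSphere 4 documentation and should be treated as an assumption to verify against the vSphere CLI reference. The rule number and path values are hypothetical. The sketch only builds the command lines and covers just the two steps named above.

```python
def mask_path_commands(rule_id, adapter, channel, target, lun):
    """Build the 'add' and 'load' claim rule commands that mask one path.

    Option letters are an assumption recalled from vSphere 4 docs; verify
    them before use.
    """
    add = [
        "esxcli", "corestorage", "claimrule", "add",
        "-r", str(rule_id),
        "-t", "location",
        "-A", adapter,
        "-C", str(channel),
        "-T", str(target),
        "-L", str(lun),
        "-P", "MASK_PATH",
    ]
    load = ["esxcli", "corestorage", "claimrule", "load"]
    return [add, load]


if __name__ == "__main__":
    # Hypothetical rule number and path.
    for command in mask_path_commands(120, "vmhba2", 0, 0, 20):
        print(" ".join(command))
```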
Configure iSCSI SAN Storage
Most of the FC SAN Storage info above is also applicable here
Configure NFS Datastores
- ESX's manage exclusive access to files via .lc-XXX lock files
Configure and Manage VMFS Datastores
- VMFS Datastore capacity can be increased on the fly whilst VM's are running (from that datastore)
Install and Configure vCenter Server
Install vCenter Server
Minimum Requirements
- 2x CPU's (2GHz)
- 3GB RAM
- 2GB disk
- Microsoft SQL2005 Express
Scale | VC | CPU | Memory |
---|---|---|---|
50 ESXs, 250 VMs | 32 bit | 2 | 4 GB |
200 ESXs, 2000 VMs | 64 bit | 4 | 4 GB |
300 ESXs, 3000 VMs | 64 bit | 4 | 8 GB |
- Database DSN (ODBC) must be 32bit, regardless of VC's OS (the default DSN on a 64bit OS is 64bit)
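The scale figures above can be read as a lookup: pick the first tier whose host and VM limits both cover the estate. A minimal sketch; the tier values are copied from the notes, while the function name and return shape are illustrative.

```python
# (max ESX hosts, max VMs, VC OS bitness, CPUs, memory in GB), from the scale figures.
SIZING_TIERS = [
    (50, 250, 32, 2, 4),
    (200, 2000, 64, 4, 4),
    (300, 3000, 64, 4, 8),
]


def recommend_vc_size(hosts, vms):
    """Return the smallest tier covering both counts, or None if none does."""
    for max_hosts, max_vms, os_bits, cpus, memory_gb in SIZING_TIERS:
        if hosts <= max_hosts and vms <= max_vms:
            return {"os_bits": os_bits, "cpus": cpus, "memory_gb": memory_gb}
    return None


if __name__ == "__main__":
    print(recommend_vc_size(60, 200))
```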
Manage vSphere Client plug-ins
Plug-In | Description |
---|---|
Update Manager | |
Converter Enterprise | |
vShield Zones | App aware firewall, inspects client-server and inter-VM traffic to provide traffic analysis and app-aware firewall partitioning |
Orchestrator | Workflow engine to manage automated tasks/workflows |
Data Recovery | Backup and recovery. Centralised management of backup tasks (inc data de-duplication). |
Configure vCenter Server
Guest Customisation Requirements
- Source machine must have
- VMTools installed (latest version)
- Similar OS to intended new machine
- SCSI disks
- (Win) Guest OS cannot be a domain controller
- (Win) Sysprep must be installed on VC
- (Linux) Guest OS must have Perl installed
Configure Access Control
Role | Type | ESX / VC | Description |
---|---|---|---|
No Access | System | ESX & VC | No view or do. Can be used to stop permissions propagating. |
Read Only | System | ESX & VC | View all except Console, no do. |
Administrator | System | ESX & VC | Full rights |
VM User | Sample | VC only | VM start/stop, console, insert media (CD) |
VM Power User | Sample | VC only | As user plus hardware and snapshot operations |
Resource Pool Admin | Sample | VC only | Akin to an OU admin, full rights for child objects. Cannot create new VM's without additional VM and datastore privileges. |
VCB User | Sample | VC only | Expected to be used by VCB, do not modify! |
Datastore Consumer | Sample | VC only | Allows creation of VMDK's or snapshots in datastore (additional VM privileges to action) |
Network Consumer | Sample | VC only | Allows assignment of VM's to networks (additional VM privileges to action) |
Deploy and Manage Virtual Machines and vApps
Create and Deploy Virtual Machines
- VM Hardware v4 runs on ESX3 or ESX4, v7 runs on ESX4 only
- VM's running MS Windows should have SCSI TimeoutValue changed to 60 secs to allow Windows to tolerate delayed SAN I/O from path failovers
Disk Types
- Thick - traditional (can convert to Thin via Storage vMotion)
- Thin - minimal space usage (conversion to Thick is manual process)
Memory
- Minimum of 4MB, increments of 4MB
- Maximum for best performance - threshold above which a VM's performance will be degraded (varies dependent on load on ESX)
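The 4MB minimum and 4MB increment rule can be checked with simple arithmetic. A minimal sketch with illustrative function names; it does not model any upper limit.

```python
def is_valid_vm_memory(size_mb: int) -> bool:
    """True if size_mb is at least 4MB and a multiple of 4MB."""
    return size_mb >= 4 and size_mb % 4 == 0


def round_up_vm_memory(size_mb: int) -> int:
    """Round size_mb up to the next 4MB increment, with a floor of 4MB."""
    if size_mb <= 4:
        return 4
    return ((size_mb + 3) // 4) * 4


if __name__ == "__main__":
    print(is_valid_vm_memory(1026), round_up_vm_memory(1026))
```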
SCSI Controller Types
- BusLogic Parallel
- LSI Logic SAS
- LSI Logic Parallel
- VMware Paravirtual
- High performance to provide better throughput with lower ESX CPU usage
- Only VM h/ware v7 with Win2k3, Win2k8 or Red Hat Ent v5
- Not supported with
- Boot disks (use a standard adapter for VM's OS/boot disk)
- Record/replay
- Fault Tolerance
- MSCS Clustering (so also SQL clusters)
N-port ID virtualization (NPIV)
- Provides VM's with RDM's unconstrained to an ESX (ie allows VMotion when using RDM's)
- Must be enabled on SAN switch
- ESX's HBA's must support NPIV
- NPIV enabled VM's are assigned 4 NPIV WWN's
vNICs
- Flexible - Becomes VMXNET when on 32bit OS with VMTools installed (VMware optimised), otherwise vLANCE (old AMD LANCE 10Mbps NIC driver)
- e1000 - Default for 64bit OS's, emulates an Intel E1000 card
- VMXNET2 - Aka enhanced VMXNET, supports jumbo frames and TSO, limited OS support
- VMXNET3 - Performance driver, only supported on VM hardware v7, and limited OS's
vCenter Converter
- Requires the following ports
- Windows: TCP 139, 443, 445, 902
- Linux: TCP 22, 443, 902, 903
Guided Consolidation
- Active Domains - Systems being analysed need to be a member of an active domain
- Add to Analysis to analyse new systems, max 100 concurrent
- Confidence - Degree to which VC collected perf data, and how good a candidate
- High confidence is shown after 24 hrs, if workload varies over greater interval, further analysis is required
- New VM's disk size = Amount used on physical x 1.25
- Convert manually to be able to specify new VM's settings
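The disk sizing rule above is a single multiplication. A tiny sketch; the function name and sample figure are illustrative.

```python
def consolidated_disk_size_gb(used_gb: float) -> float:
    """New VM disk size = space used on the physical machine x 1.25."""
    return used_gb * 1.25


if __name__ == "__main__":
    # A physical server using 40GB (made-up figure) gets a 50GB virtual disk.
    print(consolidated_disk_size_gb(40))
```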
Manage Virtual Machines
VM hardware can be modified in-flight as long as
- The guest OS supports hot plug (eg Win2008)
- VM hardware version is v7
- vCPU's can only be added if "CPU Hot Plug" is enabled in the VM's options
Virtualized Memory Management Unit (MMU)
- Maintains the mapping between the VM's guest OS physical memory and the underlying host's machine memory
- Intercepts VM instructions that would manipulate memory, so that CPU's MMU is not updated directly.
Deploy vApps
vApp - An enhanced resource pool to run a contained group of VM's, can be created under the following conditions
- A host is selected in the inventory that is running ESX3 or later
- A DRS-enabled cluster is selected in the inventory
Deploying an OVF template
- Non-OVF format appliances can be converted using the VMware vCenter Converter module
- During deployment IP allocation can be (if the OVF template states this is configurable)
- Fixed
- Transient - vCenter manages a pool of available IP's
- DHCP
Manage Compliance
Install, Configure and Manage VMware vCenter Update Manager
- Update Manager can be installed on VC, recommended separate for large environments
- Requires its own db instance (can be on same server as VC database, recommended separate)
- Requires sysadmin or db_owner role
- VMware vCenter Update Manager Guest Agent is installed to Win or Linux guests on 1st patch scan or remediation run.
- Smart Rebooting - Update manager attempts to adhere to the startup dependencies stated in a vApp config
- Edit vci-integrity.xml to change
- <patchStore> - Location of downloaded patches (default - C:\Documents and Settings\All Users\Application Data\VMware\VMware Update Manager\Data\)
- <PatchDepotUrl> - URL used by ESX's to access patches (default - Update Manager server)
- Severity Levels
- Not Applicable
- Low
- Moderate
- Important
- Critical
- Host General
- Host Security
Establish and Apply ESX Host Profiles
- Used to ensure consistent configuration across ESX's
- Create a profile from a reference ESX, then apply to Cluster or ESX
- Reference ESX can be changed
- Profile can be refreshed (if reference ESX config has been updated)
- ESX must be in maintenance mode for a profile to be applied (resolve compliance discrepancies)
Establish Service Levels
Create and Configure VMware Clusters
VM Monitoring
- HA monitors VM's to detect if they've hung / stopped responding, and resets a VM if both
- VM Tools heartbeat lost in interval
- No VM I/O in interval (default 120 secs, reconfigure at cluster level with das.iostatsInterval)
- Default 60 secs no h/beat, max 3 resets in 24 hrs (High sensitivity 30 secs and 1hr, Low 120 secs and 7 days)
- VM Monitoring should be suspended during network changes
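The reset decision described above combines two conditions (no heartbeat and no I/O) with a cap on resets per window. The following is a sketch of that logic, not VMware's implementation. The intervals come from the notes; applying the limit of 3 resets to the High and Low levels is an assumption, as the notes only state it for the default level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensitivity:
    failure_interval_s: int  # seconds without a VM Tools heartbeat
    reset_window_s: int      # window over which resets are counted
    max_resets: int = 3      # assumption: same cap applied to every level


HIGH = Sensitivity(30, 60 * 60)
MEDIUM = Sensitivity(60, 24 * 60 * 60)   # the default per the notes
LOW = Sensitivity(120, 7 * 24 * 60 * 60)

IO_STATS_INTERVAL_S = 120  # das.iostatsInterval default per the notes


def should_reset(level, secs_since_heartbeat, secs_since_io, reset_times, now):
    """Return True if the VM should be reset under the given sensitivity."""
    recent = [t for t in reset_times if now - t < level.reset_window_s]
    if len(recent) >= level.max_resets:
        return False  # reset budget for this window is used up
    heartbeat_lost = secs_since_heartbeat >= level.failure_interval_s
    io_idle = secs_since_io >= IO_STATS_INTERVAL_S
    return heartbeat_lost and io_idle


if __name__ == "__main__":
    print(should_reset(MEDIUM, 61, 130, [], 1000.0))
```

Requiring both conditions is what stops a VM that is still doing disk or network I/O from being reset just because VM Tools went quiet.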
High Availability
- Uses the following networks for HA communication
- ESX - All Service Console networks
- ESXi - All VMkernel networks (not VMotion network if alternatives available)
- Uses highest CPU and Memory reservation to generate a VM slot, which is used for capacity calculations
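The slot calculation can be sketched as taking the largest CPU and memory reservation across the VM's, then seeing how many such slots fit on a host. This is a simplified illustration: the per-host division and the floor values are assumptions of the sketch, and it ignores memory overhead and any defaults HA applies when no reservation is set.

```python
def slot_size(vms, min_cpu_mhz=1, min_mem_mb=1):
    """Return (cpu_mhz, mem_mb) for one slot.

    vms is a non-empty list of (cpu_reservation_mhz, mem_reservation_mb).
    The floors are placeholders that only guard against a zero-sized slot.
    """
    vms = list(vms)
    cpu = max(min_cpu_mhz, max(c for c, _ in vms))
    mem = max(min_mem_mb, max(m for _, m in vms))
    return cpu, mem


def slots_on_host(host_cpu_mhz, host_mem_mb, slot):
    """Number of whole slots a host can hold (the scarcer resource wins)."""
    slot_cpu, slot_mem = slot
    return min(host_cpu_mhz // slot_cpu, host_mem_mb // slot_mem)


if __name__ == "__main__":
    # Made-up reservations and host capacity.
    slot = slot_size([(500, 1024), (1000, 512)])
    print(slot, slots_on_host(8000, 16384, slot))
```

One VM with a large reservation inflates the slot for the whole cluster, which is why a single oversized reservation can sharply reduce calculated failover capacity.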
Distributed Power Management
- Uses current load and VM resource reservation to calculate required number of powered-up ESXs
- ESX power-on achieved by WOL, IPMI or iLO
- IPMI or iLO: Must specify IP, MAC etc for each ESX
- WOL: VMotion NIC must support WOL, and VMotion switchport must be set to Auto (as WOL often not supported by NIC at 1Gbps)
- Must test ESX in and out of Standby Mode before enabling DPM
Enhanced VMotion Compatibility
- Hides additional CPU features in a cluster (ie features one ESX in a cluster has but another doesn't)
- Requires no VM's to be running on the cluster (as the CPU type will effectively be changed)
- Generally works for similar manufacture make & model CPU's with different stepping levels
Enable a Fault Tolerant Virtual Machine
- vLockstep - Keeps Primary and Secondary VM's in sync
- vLockstep Interval - Time required for secondary to sync with primary (normally < 0.5 sec)
- Log Bandwidth - Bandwidth required to keep VM's in sync across FT network
- On-Demand Fault Tolerance - Temporary manually managed FT, configured for a VM during a critical time
- Recommended max of 4 FT VM's per ESX (primary or secondary)
Prerequisites
- Cluster
- HA and host monitoring must be enabled
- Host certificate checking must be enabled
- ESX's
- Separate VMotion and FT Logging NIC(s) configured (should be different subnets for each)
- Same ESX software version and patch level (FT must be temporarily disabled during ESX software upgrades)
- FT-compatible processor
- Host certified by OEM as FT-capable
- Host BIOS must have Hardware Virtualisation (eg Intel VT) enabled
- VM's
- VMDK files must be thick provisioned with Cluster Features enabled
- Run supported OS (generally all, may require reboot to enable FT)
Unsupported
- Snapshots (must be removed/committed before FT enabled)
- Storage VMotion
- DRS
- SMP - Only single vCPU supported
- Physical RDM
- CD-ROM or Floppy media/ISO not on shared storage
- Paravirtualised guests
- NPIV
- NIC Passthrough
Setup
- Enable host certificate checking
- Configure VMkernel networking
- Create HA Cluster and perform Profile Compliance
- Turn on FT for appropriate VM's
Not Protected caused by...
- VM's are still starting up
- Secondary VM is not started, possible causes...
- No suitable host on which to start secondary
- A fail-over has occurred but FT network link down, so new secondary not started
- Disabled - FT has been disabled by user or VC (because no suitable secondary host can be found)
- Primary VM is not on, so status is Not Protected, VM not Running
Create and Configure Resource Pools
Nothing new!
Migrate Virtual Machines
- Cold Migration - VM is powered off, can be migrated to another datacentre
- Suspended VM Migration - Config and disk files can be relocated, can be migrated to another datacentre
- VMotion - VM is powered on. Moves VM, config and disk files are static
- Storage VMotion - VM is powered on. VM is static, config and disk files move
VMotion Priority
- High - Resources are reserved on source and destination ESX's prior to move. Move may not proceed.
- Low - No reservation made, just proceeds. Migrations likely to take longer and may cause VM to become unavailable for a period of time
Backup and Restore Virtual Machines
Snapshots
- Can quiesce guest file system (req VMTools) to ensure consistent disk state
- Independent disks are excluded from snapshots (Persistent writes to disk, Nonpersistent writes to redo log, discarded at power off)
- Migrating a VM with Snapshots
- Cannot use Storage VMotion
- All VM files must reside in single directory if being moved by cold storage migration
- Reversion after VMotion may cause VM to fail - only occurs if discrepancies in ESX hardware
VMware Data Recovery
- Built on VMware vStorage API for Data Protection
- Can store backup on any ESX supported virtual disk, or SAN, NAS, or CIFS storage
- All stored in deduplicated store
- Max 8 VM backups can run concurrently
- Max 2 backup destinations used concurrently
- Max 100 VM's per backup appliance
VMware Data Recovery Setup
- Install VI Client plugin (needs to be able to communicate with backup appliances on TCP 22024)
- Install/import VMware Data Recovery OVF/appliance
- Add VMDK to appliance (to be used as backup destination, network stores can be used, but VMDK's are faster)
Perform Basic Troubleshooting and Alarm Management
Perform Basic Troubleshooting for ESX/ESXi Hosts
- Service Console Networking - Use esxcfg-vswif, esxcfg-vswitch, esxcfg-nics
- Physical switchport failover - Use PortFast to ensure a VM's MAC appearing on a different switchport is handled quickly
- Port Group Reconfiguration - Renaming a Port Group will mean connected VM's will lose their PortGroup config
- Hardware Health Service - VI Client plugin that uses an IE object to access the info on vCenter
Export Diagnostic Data
To generate a diagnostic data report...
- Run vm-support script on ESX
- Run Administrator | Export Diagnostic info on VI Client
Perform Basic Troubleshooting for VMware FT and Third-Party Clusters
Unexpected FT Failovers
- Partial Hardware Failure Related to Storage - Caused by one ESX experiencing problems accessing VM's storage
- Partial Hardware Failure Related to Network - Caused by FT logging NIC being congested or down
- Insufficient Bandwidth on the Logging NIC Network - Caused by too many FT VM's on the same ESX
- VMotion Failures Due to Virtual Machine Activity Level - VM is too active for VMotion to succeed
- Too Much Activity on VMFS Volume Can Lead to Virtual Machine Failovers - Too many file system locking operations (VM power on/off's etc)
- Lack of File System Space Prevents Secondary VM Startup
Other FT Errors
- Hardware Virtualization Must Be Enabled - HV (ie VT/AMD-V) must be enabled to allow FT
- Compatible Secondary Hosts Must Be Available - No spare ESX's with HV, capacity, not in Maintenance mode etc
- Very Large Virtual Machines Can Prevent Use of Fault Tolerance - If memory is large (>15GB) or changing too much, VMotion will not be able to keep in sync. Can increase the time-out value (default 8 secs -> 30 secs) by entering ft.maxSwitchoverSeconds = "30" in the VM's VMX file
- Secondary VM CPU Usage Appears Excessive - Replaying some events can be more expensive than recording on Primary, normal operation
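The ft.maxSwitchoverSeconds entry mentioned above is a plain key = "value" line in the VMX file. The following sketch updates or appends such a line in VMX text held in memory. The helper name is illustrative, keys are matched exactly, and it works on a string rather than touching a real VM's file.

```python
def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with key set to value, replacing or appending the line."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for index, line in enumerate(lines):
        # Compare only the part before the first '=' so values are ignored.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[index] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = 'memsize = "2048"\n'  # made-up VMX content
    print(set_vmx_option(sample, "ft.maxSwitchoverSeconds", "30"), end="")
```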
Perform Basic Troubleshooting for Networking
Perform Basic Troubleshooting for Storage
Perform Basic Troubleshooting for HA/DRS and VMotion
Create and Respond to vCenter Connectivity Alarms
Create and Respond to vCenter Utilization Alarms
Monitor vSphere ESX/ESXi and Virtual Machine Performance