Virtual Machines: Difference between revisions

From vwiki
m (→‎Procedure: Added link to export probs section)
(Removed GoogleAdBanner)
 
(53 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Depreciated|category=Virtual Machine}}
== Basic Virtual Machine Tasks ==
== Basic Virtual Machine Tasks ==
=== Start / Stop / Bounce a VM ===
=== Start / Stop / Bounce a VM ===
Line 37: Line 39:
==== To create an ISO image ====
==== To create an ISO image ====
You'll need to download an ISO creator.  There are many freeware utilities available, but one that's tried and tested is [http://isorecorder.alexfeinman.com/isorecorder.htm ISORecorder].  Generally you can create ISO images from either a physical CD or the contents of a folder (if you have ISORecorder installed, right-click over the disk or folder and select "Create ISO image").
You'll need to download an ISO creator.  There are many freeware utilities available, but one that's tried and tested is [http://isorecorder.alexfeinman.com/isorecorder.htm ISORecorder].  Generally you can create ISO images from either a physical CD or the contents of a folder (if you have ISORecorder installed, right-click over the disk or folder and select "Create ISO image").


=== Change Network Connection ===
=== Change Network Connection ===
Line 47: Line 48:
# Highlight the appropriate '''Network Adapter''', and select the new Network Connection
# Highlight the appropriate '''Network Adapter''', and select the new Network Connection
# Change takes effect as soon as '''OK''' is hit
# Change takes effect as soon as '''OK''' is hit


=== Add an Additional Network Connection ===
=== Add an Additional Network Connection ===
Line 60: Line 60:
# Select the appropriate network connection and hit '''Next''', and then '''Finish'''
# Select the appropriate network connection and hit '''Next''', and then '''Finish'''
# [[#Start / Stop / Bounce a VM|Power on the virtual machine]]  
# [[#Start / Stop / Bounce a VM|Power on the virtual machine]]  


=== Change Physical Memory / CPU's Allocation ===
=== Change Physical Memory / CPU's Allocation ===
# Shut down the Application and OS of the virtual machine
# Shut down the Application and OS of the virtual machine
# Log into the Virtual Infrastructure - [[#Management Access|Management Access]]
# Log into the Virtual Infrastructure - [[#Management Access|Management Access]]
Line 72: Line 70:
# Apply the change by hitting '''OK'''
# Apply the change by hitting '''OK'''
# [[#Start / Stop / Bounce a VM|Power on the virtual machine]]
# [[#Start / Stop / Bounce a VM|Power on the virtual machine]]
=== Install/Upgrade VM Tools on Citrix/Terminal Services VM ===
In a Terminal Services VM you can't install VM Tools in the normal automated fashion.
# Go into install mode
#* <code>change user /install</code>
# Install/upgrade VM Tools as normal
# Reboot, if you don't reboot you must go back into normal exec mode
#* <code>change user /exec</code>


== Config Settings ==
== Config Settings ==
=== Performance Settings ===
Additionally, see the following resources
* http://www.vmguru.nl/wordpress/2010/07/how-to-optimize-guests-for-vmware-view/
* http://www.vmware.com/Files/pdf/VMware-View-OptimizationGuideWindows7-EN.pdf
==== Windows 2003 ====
To squeeze the most out of a virtual infrastructure it's imperative that VM's are configured for best performance.
* '''System Performance''' - Set the OS for best performance
** Go to System Properties | Advanced
*** Visual Effects - Disable/Adjust for best performance
*** Advanced - Tailor for the application/service the server is running
*** Virtual memory/page file - Should be set to a fixed size, and ideally as small as possible so as to conserve disk space usage.  You need to know what you're doing when you set this.  Do you expect your systems to be starved of memory and need to page?  Personally for a VM with 4GB RAM assigned, I'd keep the page file down to 1GB in size.
==== Windows XP ====
The following changes are performance based, intended to ensure XP isn't unnecessarily wasteful
# Display simplifications
#* Right-click over the '''desktop''' and go to '''Properties'''
#*# Switch to Classic theme
#*# Disable screen-saver
#*# Disable monitor power save
# Disable Indexing Services
#* Go to '''Control Panel | Add or Remove Programs | Add/Remove Windows Components''' and uncheck '''Index services'''
# Disable paging of the executive
#* <code> HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management</code> set <code>DisablePagingExecutive</code> to <code>1</code>
# Set for best performance
#* My Computer | Properties go to Advanced Tab, Performance Settings, and '''Adjust for best performance'''
# Disable all sounds
# Defragment prefetch
#* <code> defrag c: -b </code>
# Turn off disk perf counters
#* <code> diskperf -n </code>
==== Windows 7 ====
# Set for best performance
#* Go to Control Panel\All Control Panel Items\Performance Information and Tools, then Adjust visual effects, and select '''Adjust for best performance'''
# Disable IPv6 on NIC
# Disable sounds
# Change power plan to '''High Performance''' and change ''Turn off the display'' to '''Never'''
# Disable the following services
#* BitLocker Drive Encryption Service
#* Block Level Backup Engine Service
#* Desktop Window Manager Session Manager
#* Diagnostic Policy Service
#* HomeGroup Listener
#* HomeGroup Provider
#* IP Helper
#* Microsoft iSCSI Initiator Service
#* Microsoft Software Shadow Copy Provider
#* Secure Socket Tunneling Protocol Service
#* Security Center
#* SSDP Discovery
#* Superfetch
#* Tablet PC Input Service
#* Themes
#* UPnP Device Host
#* Volume Shadow Copy
#* Windows Backup
#* Windows Firewall
#* Windows Media Center Receiver Service
#* Windows Media Center Scheduler Service
#* Windows Search
#* WLAN AutoConfig
=== Disable Shutdown Event Tracker ===
=== Disable Shutdown Event Tracker ===
If the ESX servers are running as an HA cluster then they MUST be able to fully start up automatically after a re-boot. The Windows OS Shutdown Event Tracker asks why you're shutting down or rebooting a system or, following an unexpected shutdown, halts the starting of a system pending information from the user.  This is not a problem for servers where all applications run as a service, but it impedes VMware HA where (GUI) applications need to start, because it stops systems from restarting fully.
If the ESX servers are running as an HA cluster then they MUST be able to fully start up automatically after a re-boot. The Windows OS Shutdown Event Tracker asks why you're shutting down or rebooting a system or, following an unexpected shutdown, halts the starting of a system pending information from the user.  This is not a problem for servers where all applications run as a service, but it impedes VMware HA where (GUI) applications need to start, because it stops systems from restarting fully.
Line 91: Line 161:
# Specify the low extensions as <code> .bat;.exe </code>
# Specify the low extensions as <code> .bat;.exe </code>


== Increase Disk Size ==
=== Console Clipboard Integration ===
In the good old days this was always enabled by default, so you could copy and paste between your desktop and a VM's console.  Since ESX 4.1, it's been disabled by default.  There are two ways to enable it again
 
* '''Enable for VM''' - Permanent, requires VM downtime
*# Shutdown the VM
*# Go to '''Edit Settings''', '''Options''' tab, then Advanced / General and hit '''Configuration Parameters'''
*# Use '''Add Row''' to add the following new config items
*#* Set <code>isolation.tools.copy.disable</code> to <code>false</code>
*#* Set <code>isolation.tools.paste.disable</code> to <code>false</code>
*# Restart the VM
* '''Enable for VM's on an ESX''' - Temporary (won't survive ESX software upgrade), VM's can be vMotioned on to allow clipboard integration funkiness
*# SSH to the ESX and then edit the <code> /etc/vmware/config </code> file, appending the following...
*#* <code>isolation.tools.copy.disable="FALSE"</code>
*#* <code>isolation.tools.paste.disable="FALSE"</code>
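
As a rough illustration of the config edits above, the following Python sketch sets an option in the text of a <code>.vmx</code> or <code>/etc/vmware/config</code> file, replacing an existing entry or appending a new one.  The function name <code>set_config_option</code> is made up for this example, and the <code>key = "value"</code> line layout is an assumption about the file format.  It is not a VMware tool, so take a copy of the file before trying anything like it.

```python
def set_config_option(text, key, value):
    """Return config text with `key` set to `value`.

    An existing entry for the key is replaced in place, otherwise a new
    line is appended.  Assumes one `key = "value"` pair per line.
    """
    new_line = '{} = "{}"'.format(key, value)
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # The option name is whatever sits before the first "=" on the line
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            lines[i] = new_line
            break
    else:
        # No existing entry was found, so add one at the end
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    config = 'displayName = "ServerA"\n'
    for option in ("isolation.tools.copy.disable", "isolation.tools.paste.disable"):
        config = set_config_option(config, option, "FALSE")
    print(config)
```

The sketch works on text rather than on the file directly, so the result can be checked by eye before anything is written back.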
 
=== Disable Balloon Driver ===
'''You should not disable the balloon driver - there are generally other ways to achieve whatever it is you're trying to achieve'''
 
Certain applications (such as MS SQL and Java) can lock the amount of 'physical' memory they want, and stop the OS being able to page it out to disk, in order to protect that application's performance.  However, if a virtual environment becomes memory constrained, VMTools may attempt to force the OS to page memory to disk, but the OS will be unable to do so, and so will page itself out to disk, impacting the entire machine's performance.
 
By using memory reservations it's possible to ring-fence physical RAM for a machine, meaning that if contention for memory occurs, the ESX will reclaim memory from other virtual machines 1st in order to reduce overall physical memory usage.  Ideally, some dynamic memory should still be left, eg allocate a VM 4 GB of RAM with 3 GB reserved, 3 GB being enough to allow the OS and memory locking application to stay in RAM, with an additional 1 GB available to be used should there be available capacity.
 
To disable the balloon driver
# Shutdown the VM
# Go to '''Edit Settings''', '''Options''' tab, then Advanced / General and hit '''Configuration Parameters'''
# Use '''Add Row''' to add a new config item
# Set <code>sched.mem.maxmemct</code> to <code>0</code>
# Restart the VM
 
'''<code>sched.mem.maxmemct</code>''' configures the amount of memory (in MB) that the balloon driver can expand to in order to conserve ESX RAM usage.
 
== Files Information ==
{|cellpadding="4" cellspacing="0" border="1"
|- style="background-color:#bbddff;"
! File                      !! Purpose            !! Notes
|-
| <code> *.vmx </code>      || VM config          || Contains the full config of the virtual machine
|-
| <code> *.vmsd </code>      || Snapshot metadata  ||
|-
| <code> *.vmsn </code>      || Snapshot state      || Stores delta info that wouldn't be otherwise written to VMDK file (eg Power State, Hardware changes, etc)
|-
| <code> *.vmss </code>      || Suspend state      || Created when a VM is suspended to contain its ''physical'' memory contents
|-
| <code> *.vmxf </code>      || Team config        ||
|-
| <code> *.vmdk </code>      || Virtual hard-drive  ||
|-
| <code> *.nvram </code>    || vBIOS              || Can be deleted, gets recreated on VM start (BIOS settings will be defaulted)
|-
| <code> *.vswp </code>      || VM memory swap      || Can be deleted, but you need to remove the reference to it from the VMX file
|-
| <code> *.hlog </code>      || VMotion helper      || Can be deleted, as long as there's not a VMotion in progress
|}
 
== Procedures ==
=== Creating MSCS Machines ===
There are various different configuration options for creating MS clusters.  For production standard clusters you must use RDM, otherwise you're forced to host both your VM's on the same ESX (and you can't vMotion them!).
* You can't snapshot disks that are configured to use SCSI Bus Sharing.
 
{|cellpadding="4" cellspacing="0" border="1"
|- style="background-color:#bbddff;"
! Disk Type        !! SCSI Bus Sharing !! Cluster in a Box<br>2 VM's / 1 fixed ESX  !! Cluster Across Boxes<br>2 VM's / multi ESX  !! Virtual/Physical Hybrid<br>1 VM + 1 Physical
|-
| '''Normal VMDK''' || '''Virtual'''    || '''Yes'''            || No                  || No
|-
| '''RDM'''        || '''Virtual'''    || Yes                  || Yes - Win2k3 only  || No
|-
| '''RDM'''        || '''Physical'''  || No                    || '''Yes'''          || Yes
|}
* Windows 2003 servers must use LSI Logic Parallel SCSI controller for shared disks
* Windows 2008 servers must use LSI Logic SAS SCSI controller for shared disks
 
See the following for further info
* [http://kb.vmware.com/kb/1004617 VMware KB 1004617 - Microsoft Cluster Service (MSCS) support on ESX] - Links to PDF docs for each flavour of ESX
* [http://kb.vmware.com/kb/1037959 VMware KB 1037959 - Microsoft Clustering on VMware vSphere: Guidelines for Supported Configurations] - Supported infrastructure config matrices
 
==== Cluster in a Box ====
Procedure assumes you're creating a VM cluster using 2 VM's on the same server sharing standard VMDK disks (not RDM's)
# Create a new private vSwitch on the ESX to host the VM's called "MSCS Heartbeat"
# Create two VM's...
#* Don't create the shared disks yet
## With 2 NIC's, 1st attached to normal (externally accessible) network, 2nd attached to private "MSCS Heartbeat" network
## Boot up and ensure they're working as expected (not as a cluster yet), then shutdown
# On the 1st VM...
## Create the required shared disks on a new SCSI Bus ID
##* Tick '''Support clustering features such as Fault Tolerance''', which ensures the disks are created in eagerzeroedthick format
##* Select a new SCSI Bus ID in the '''Virtual Device Node''' drop-down box, which creates the disks on a new SCSI Controller
## Change the new SCSI Controller's config...
##* '''SCSI Controller Type''' should be set to '''LSI Logic SAS''' (Win2k8) or LSI Logic Parallel (Win2k3)
##* '''SCSI Bus Sharing''' should be set to '''Virtual'''
# On the 2nd VM...
## Create the required shared disks on a new SCSI Bus ID, selecting '''Use an existing virtual disk''' (you'll need to locate the shared disks already created)
## Change the SCSI Bus Sharing mode of the new SCSI Controller to Virtual
# Boot the VM's up, disks should be visible from Disk Management from both VM's
#* Only format from one machine, NTFS doesn't support access from more than one host, MSCS needs to manage volume access/ownership
 
=== Increase Disk Size ===
Increasing the virtual disk size provided to a VM is straightforward (though be aware that snapshots need to be deleted 1st, if any exist)...
Increasing the virtual disk size provided to a VM is straightforward (though be aware that snapshots need to be deleted 1st, if any exist)...
# Go into the VM's settings
# Go into the VM's settings
Line 99: Line 266:
The trick is to extend the logical partition within the OS.  Depending on the original partition type and the OS, the options vary.
The trick is to extend the logical partition within the OS.  Depending on the original partition type and the OS, the options vary.


=== Increase Logical Partition ===
''In case of problems, see - [[Virtual_Machines#Can't Increase a VM's Disk|Can't Increase a VM's Disk]], or [[Windows_2008#Extend_Partition_Fails|Extend Partition Fails]]''
 
==== Increase Logical Partition ====
Generally boot or system disks cannot be extended whilst the OS is up, whereas normal data disks can be in later OS's, but this is still not ideal.  It's generally most reliable to plan for system down time, and use a utility to extend the partition whilst it's offline. Especially in a virtual environment there is no excuse for not making a backup of the partition 1st.
Generally boot or system disks cannot be extended whilst the OS is up, whereas normal data disks can be in later OS's, but this is still not ideal.  It's generally most reliable to plan for system down time, and use a utility to extend the partition whilst it's offline. Especially in a virtual environment there is no excuse for not making a backup of the partition 1st.


Line 111: Line 280:
| System      || Either  || Cannot be extended
| System      || Either  || Cannot be extended
|-
|-
| Data        || Basic  || Cannot be extended, can convert to Dynamic, but this will require a brief IO interruption.
| Data        || Basic  || Cannot be extended, can convert to Dynamic, but this will require at least a brief IO interruption, and possibly up to two reboots!
|-
|-
| Data        || Dynamic || Can be extended on the fly, but a new volume is tagged onto the end of the existing partition to create a larger one made up of two volumes
| Data        || Dynamic || Can be extended on the fly, but a new volume is tagged onto the end of the existing partition to create a larger one made up of two volumes
Line 129: Line 298:
# Turn off snapshotting
# Turn off snapshotting


=== VM's With Lots Of Disks ===
==== VM's With Lots Of Disks ====
It can be very difficult to identify the correct disk within VMware to increase when a VM has a large number of VMDK's.
It can be very difficult to identify the correct disk within VMware to increase when a VM has a large number of VMDK's.


Line 137: Line 306:
* Windows drive letters are useless, never assume D: is disk 2 for example
* Windows drive letters are useless, never assume D: is disk 2 for example


== Rename a VM ==
=== Rename a VM ===
'''''Renaming a virtual machine just by right-clicking over the machine and renaming does not alter the underlying file and folder names.'''''  To ensure that these changes take place you must move the VM to another datastore, ie
'''''Renaming a virtual machine just by right-clicking over the machine and renaming does not alter the underlying file and folder names.'''''   
 
This is much easier than it was since the advent of storage vMotion.  Bear in mind that vCenter uses the VM's current name in order to determine the naming of its files/folders at its destination, so to completely rename a VM...
# Rename the VM within its OS and restart
# Rename the VM within vCenter
# Storage vMotion the VM (can be moved back after)
 
 
'''''Legacy methods...'''''
 
To ensure that these changes take place you must move the VM to another datastore, ie
# Shutdown the VM
# Shutdown the VM
# Rename the VM in vCenter
# Rename the VM in vCenter
Line 155: Line 334:
The above was taxed from http://www.yellow-bricks.com/2008/02/10/howto-rename-a-vm/
The above was taxed from http://www.yellow-bricks.com/2008/02/10/howto-rename-a-vm/


== Clone a VM ==
VMware have now published an article on this [http://kb.vmware.com/kb/1029513 VM KB 1029513]
 
=== Clone a VM ===
This can be done as
This can be done as
* '''Hot clone''' - Source VM is left running, its disks are quiesced, and cloned.  Can cause problems as new machine behaves as if it was ungracefully shutdown when first started, but normally successful.  Source machine needs to be relatively quiet.
* '''Hot clone''' - Source VM is left running, its disks are quiesced, and cloned.  Can cause problems as new machine behaves as if it was ungracefully shutdown when first started, but normally successful.  Source machine needs to be relatively quiet.
* '''Cold clone''' - Source VM is shutdown 1st, preferable to a hot clone if possible.
* '''Cold clone''' - Source VM is shutdown 1st, preferable to a hot clone if possible.


=== Snapshots and Cloning ===
==== Snapshots and Cloning ====
Snapshots are deleted during a clone, in that cloning a machine that has existing snapshots results in the post-snapshot changes being merged into the new machine.
Snapshots are deleted during a clone, in that cloning a machine that has existing snapshots results in the post-snapshot changes being merged into the new machine.


Line 169: Line 350:
# Power on (you'll get an IP conflict if it's on the same portgroup as the original)
# Power on (you'll get an IP conflict if it's on the same portgroup as the original)


== Shutdown VM via Service Console ==
=== Shutdown VM via Service Console ===
* To determine the state of a Virtual Machine running from the local ESX
* To determine the state of a Virtual Machine running from the local ESX
** <code> vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx getstate </code>
** <code> vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx getstate </code>
** <code> getstate() = on </code>
** <code> getstate() = on </code>
* Shutdown a Virtual Machine running from the local ESX gracefully
** <code> vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop trysoft </code>
** <code> stop(trysoft) = 1 </code>
* Shutdown a Virtual Machine running from the local ESX forcefully
* Shutdown a Virtual Machine running from the local ESX forcefully
** <code> vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop hard </code>
** <code> vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop hard </code>
** <code> stop(hard) = 1 </code>
** <code> stop(hard) = 1 </code>


== Upgrade ESX3 to ESX4 ==
The <code> vmware-cmd </code> command isn't available in ESXi, though it is available via the RCLI, in the following format...
=== Preparation ===
* <code> vmware-cmd.pl /path/to/My_VM.vmx start --server MyESX --username root --password "RootPassword" </code>
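
If you're scripting around these commands, the result lines shown above (eg <code> getstate() = on </code>) are easy to pick apart.  The following is a minimal Python sketch, assuming the output always takes the <code>operation(argument) = value</code> shape seen in the examples; the function name <code>parse_vmware_cmd_result</code> is hypothetical and is not part of any VMware tooling.

```python
import re

# Matches lines such as "getstate() = on" or "stop(hard) = 1"
_RESULT_RE = re.compile(r"^\s*(\w+)\((.*?)\)\s*=\s*(.+?)\s*$")


def parse_vmware_cmd_result(line):
    """Split a vmware-cmd result line into (operation, argument, value)."""
    match = _RESULT_RE.match(line)
    if match is None:
        raise ValueError("unrecognised vmware-cmd output: {!r}".format(line))
    return match.groups()


if __name__ == "__main__":
    operation, argument, value = parse_vmware_cmd_result("getstate() = on")
    print(operation, repr(argument), value)
```

Anything that doesn't match the expected shape (an error message, for instance) raises a <code>ValueError</code> rather than being guessed at.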
 
=== Upgrade ESX3 to ESX4 ===
==== Preparation ====
* Clean up the VM
* Clean up the VM
*# Stop any snapshots, and ensure there's no remnant snapshot files (*.vmsd, *-0000x.vmdk, *-delta.vmdk)
*# Stop any snapshots, and ensure there's no remnant snapshot files (*.vmsd, *-0000x.vmdk, *-delta.vmdk)
Line 193: Line 380:
* Shut the VM down
* Shut the VM down
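
To help with the remnant snapshot file check in the clean-up step above, the following Python sketch picks out suspect names from a directory listing.  It assumes snapshot disks carry a six digit suffix (as in <code>MyVM-000001.vmdk</code>), which is a guess at what <code>*-0000x.vmdk</code> means in practice, and the function name <code>find_snapshot_remnants</code> is made up for this example.

```python
import re

# *.vmsd, *-00000N.vmdk and *-delta.vmdk, as listed in the clean-up step
_REMNANT_RE = re.compile(r"(\.vmsd|-\d{6}\.vmdk|-delta\.vmdk)$")


def find_snapshot_remnants(filenames):
    """Return the names that look like snapshot files, sorted."""
    return sorted(name for name in filenames if _REMNANT_RE.search(name))


if __name__ == "__main__":
    listing = [
        "MyVM.vmx",
        "MyVM.vmdk",
        "MyVM-flat.vmdk",
        "MyVM.vmsd",
        "MyVM-000001.vmdk",
        "MyVM-000001-delta.vmdk",
    ]
    for name in find_snapshot_remnants(listing):
        print(name)
```

The sketch only reports names, it doesn't delete anything; whether a reported file is really a remnant still needs checking by hand.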


=== Procedure ===
==== Procedure ====
Procedure assumes you're migrating machines from a VI3 infrastructure to a new VI4/vSphere infrastructure.
Procedure assumes you're migrating machines from a VI3 infrastructure to a new VI4/vSphere infrastructure. Note that you can use VMware Converter to copy machines between vCenters if preferred.
 
# Export machine as a Virtual Appliance from VI3 infrastructure
# Export machine as a Virtual Appliance from VI3 infrastructure
#* In the VI Client, select the VM and go to '''File | Virtual Appliance | Export...'''
#* In the VI Client, select the VM and go to '''File | Virtual Appliance | Export...'''
Line 200: Line 388:
# Import machine into new vSphere infrastructure
# Import machine into new vSphere infrastructure
#* In the VI Client, select the VM and go to '''File | Deploy OVF Template...''', and select the appropriate options in the resulting wizard
#* In the VI Client, select the VM and go to '''File | Deploy OVF Template...''', and select the appropriate options in the resulting wizard
# Start the virtual machine to verify it's survived the export/import then shutdown
# Take a snapshot (if you make an irreversible mistake it's quicker to revert to snapshot than reimport)
# Take a snapshot (if you make an irreversible mistake it's quicker to revert to snapshot than reimport)
# Check the VM's settings, particularly Guest OS (which sometimes gets set to ''Other'')
# Start the virtual machine, update VM Tools then shutdown
# Upgrade the virtual hardware
# Upgrade the virtual hardware
#* Right-click and select '''Upgrade Virtual Hardware'''
#* Right-click and select '''Upgrade Virtual Hardware'''
Line 207: Line 396:
#* Remove existing network adapters (note the networks they're connected to!), then add the same quota of VMXNET3 adapters (connected to the same networks in the same order)
#* Remove existing network adapters (note the networks they're connected to!), then add the same quota of VMXNET3 adapters (connected to the same networks in the same order)
# Upgrade the SCSI controller - part 1 (if required)
# Upgrade the SCSI controller - part 1 (if required)
#* Add a new temporary disk, on the next bus (eg SCSI node 1:x), and change the type to ''VMware Paravirtual''
#* Add a new temporary disk, on the next bus (eg SCSI node 1:x)
#* Then change the new SCSI controller type to ''VMware Paravirtual''
# Restore network config
# Restore network config
#* Restart VM, and re-apply recorded network config (answer Yes when asked whether to remove duplicate config on non-existent adapter)
#* Restart VM, and re-apply recorded network config (answer Yes when asked whether to remove duplicate config on non-existent adapter)
# Upgrade the SCSI controller - part 2 (if required)
# Upgrade the SCSI controller - part 2 (if required)
#* Shutdown the VM, and remove the temporary disk added, and change the original SCSI controller to ''VMware Paravirtual''. Restart the machine.
#* Shutdown the VM, and remove the temporary disk added, and change the original SCSI controller to ''VMware Paravirtual'' (the other controller will automatically get removed from the config)
#* Restart the machine.
# Delete/Commit the snapshot
# Delete/Commit the snapshot
=== Windows 2008 Install ===
Use VMXNET3 network adapter.  Only use Paravirtual SCSI interface if you're running at least ESX v4.1. You need to boot with the drivers on a floppy
http://www.virtualinsanity.com/index.php/2009/12/01/more-bang-for-your-buck-with-pvscsi-part-2/
=== Convert Hardware v7 to v3 ===
You'll need to download and install VMware Converter Standalone if you haven't already got it installed (it's free). The local installation will suffice (client-server not required).  It's possible that you could get away with using the inbuilt vCenter version as long as you're not trying to import a v7 VM on ESX4 to a v4 VM on ESX3 (which you probably are!).
# Start up VMware Converter
# Hit the '''Convert machine''' button, top left
# On the resultant ''Source System'' page...
#* Change ''Select source type:'' to '''VMware Infrastructure virtual machine'''
#* Enter login details for the vCentre or ESX your v7 VM is on
# On the ''Source Machine'' page locate your v7 VM
# On the ''Destination System'' page, enter the login details of the vCentre or ESX where you want your v4 VM to be
# On the ''Destination Virtual Machine'' page, edit the VM name (if required) and select the destination folder
#* If you're migrating to a vSphere VC/ESX you must set the Virtual Machine Version on this page
# On the ''Destination Location'' page select an appropriate datastore
# On the ''Options'' page you can make any tweaks you might want to
# Finally confirm
# Once the machine is imported, boot it up to ensure all is OK
#* VM Tools will need to be explicitly uninstalled and then reinstalled
#* Especially if the VM is a template, the OS may want to adjust its drivers given that it'll be running on different (virtual) hardware


== Troubleshooting ==
== Troubleshooting ==
See also [[Virtual_Centre#Troubleshooting|Virtual Centre Troubleshooting]]
See also [[Virtual_Centre#Troubleshooting|Virtual Centre Troubleshooting]]
If all else fails you can always raise a [[VMware Service Request]]


=== Can't Connect to VM Console ===
=== Can't Connect to VM Console ===
'''Error connecting: Cannot connect to host...'''
'''Error connecting: Cannot connect to host...''' or '''Can't connect to MKS...'''
* This is caused by a TCP connection failure to the ESX server the VM is hosted on. Using telnet or a port test utility, confirm you can connect on both TCP 902 and 903 from your machine to the ESX server.  
* This is caused by a TCP connection failure to the ESX server the VM is hosted on. Using telnet or a port test utility, confirm you can connect on both TCP 902 and 443 from your machine to the ESX server.
* If the problem is affecting a single ESX that previously worked, [[ESX#VMware_Management_Agent_Restart|restart the management services]] on that ESX
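
If you don't have telnet or a port test utility to hand, the check above can be approximated with a few lines of Python.  This is only a sketch: the helper names <code>tcp_port_open</code> and <code>check_console_ports</code> and the host name <code>MyESX</code> are placeholders, and a successful TCP connect only proves the port is reachable, not that the console service behind it is healthy.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable host
        return False


def check_console_ports(host, ports=(902, 443)):
    """Map each port to whether it accepted a connection."""
    return {port: tcp_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # "MyESX" is a placeholder for the ESX server hosting the VM
    for port, is_open in check_console_ports("MyESX").items():
        print("TCP {}: {}".format(port, "open" if is_open else "FAILED"))
```

Run it from the machine you launch the VI Client on, as it's the path from your desktop to the ESX that matters.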


=== Can't Deploy VM ===
=== Can't Deploy VM ===
Line 227: Line 444:
'''A general system error occurred: Failed to create journal file provider'''
'''A general system error occurred: Failed to create journal file provider'''
* Check ESX disks are not full
* Check ESX disks are not full
'''Customization of the guest operating system 'winLonghornGuest' is not supported in this configuration. Microsoft Vista (TM) and Linux guests with Logical Volume Manager are supported only for recent ESX host and VMware Tools versions.'''
* Caused by you trying to deploy a guest customised Windows 2008 template, where the OS of the source template is set to Windows 2008(!).  Essentially Win2008 is only barely supported in ESX3.5.  Setting the source machine to '''Vista''' should resolve this issue.
* With Windows 2008 R2 templates the above fix has been seen to not work, in which case
*# Deploy a clone (with no guest customisation)
*# Perform a [[Windows_2008#Sysprep|Sysprep]]


=== Can't Start VM ===
=== Can't Start VM ===
'''HA Admission Control'''
==== HA Admission Control ====
* Can't start VM as doing so wouldn't leave enough failover capacity in order to be able to restart failed VM's should an ESX fail.  Options are to
* Can't start VM as doing so wouldn't leave enough failover capacity in order to be able to restart failed VM's should an ESX fail.  Options are to
** Reduce resource usage of VM's that are already running
** Reduce resource usage of VM's that are already running
Line 238: Line 461:
*# Run the '''Reconfigure for HA''' command, this will re-install the HA agent on the ESX
*# Run the '''Reconfigure for HA''' command, this will re-install the HA agent on the ESX


'''Failed to relocate virtual machine'''
==== Failed to relocate virtual machine ====
* DRS is attempting to relocate a VM at power up, and this relocation is failing
* DRS is attempting to relocate a VM at power up, and this relocation is failing
** Reattempt to power on machine
** Reattempt to power on machine
** Manually migrate to a less loaded ESX and reattempt power on
** Manually migrate to a less loaded ESX and reattempt power on


'''Access to VMFS storage'''
==== Access to VMFS storage ====
* ESX may have lost connectivity to VMFS partition on which VM resides
* ESX may have lost connectivity to VMFS partition on which VM resides


'''VMFS full'''
==== VMFS full ====
* If VMFS is full, the ESX won't be able to write to the VM's logs when it starts it up, causing VM start-up to fail
* If VMFS is full, the ESX won't be able to write to the VM's logs when it starts it up, causing VM start-up to fail


'''ESX licensing'''
==== ESX licensing ====
* Either ESX isn't licensed, or has lost contact with the license server (VI3) for a long period of time
* Either ESX isn't licensed, or has lost contact with the license server (VI3) for a long period of time


'''Waiting for question to be answered'''
==== Waiting for question to be answered ====
* Generally after changes (such as cold migrations or new deployments), a VM may need to have a question answered before it can continue to power on
* Generally after changes (such as cold migrations or new deployments), a VM may need to have a question answered before it can continue to power on


'''Could not power on VM: No swap file. Failed to power on VM'''
==== Could not power on VM: No swap file. Failed to power on VM ====
* The ESX your starting the VM up on can't get proper access the VM's files, either because
* The ESX you're starting the VM up on can't get proper access to the VM's files, either because
** The VM is already powered up on another ESX
** The VM is already powered up on another ESX
** The VM's files have been corrupted
** The VM is already powered up (but shows as down on the VI Client)
* If the ESX the virtual is/was on has ''failed'' then its likely that only the ESX's network connections have failed, the virtual machines are still running on the ESX, but are isolated from the network.   
** The VM's files have been corrupted / locked
*# To cause a full HA failover, pull the power cables out of the ESX to kill it completely
 
*# Alternatively, attempt to restore network connectivity to allow the VM's to br reachable again
 
* If there are no ESX failures, then the VM's files are probably corrupted.  The VM needs to be re-registered by removing and re-adding it to the inventory.
# '''Is the VM actually powered off?'''
#* If the VM responds to ping and RDP/VNC/SSH etc (as appropriate) then proceed to [[#VM is Powered On, but appears Powered Off|VM is Powered On, but appears Powered Off]]
# '''Has an ESX recently failed?'''
#* If the ESX the virtual is/was on has recently ''failed'' and HA's isolation response is set to leave powered-on then it's possible that only the ESX's network connections have failed, and the virtual machines are still running on the ESX, but are isolated from the network.   
#** To cause a full HA failover, pull the power cables out of the ESX to kill it completely
#** Alternatively, attempt to restore network connectivity to allow the VM's to be reachable again
#* If the ESX the virtual is/was on has recently ''failed'' it's possible that the file lock times have not yet expired (or are being kept updated).
#** If you're able to get a console onto the failed ESX, ensure it has fully failed (powered off or PSOD).  If not, power it off, in case it has failed enough to stop VM's running but not enough to stop updating the file locks.  HA will restart the VM if it's still a very recent failure, else restart the VM manually.
 
If there have been no ESX failures, then the VM's files may be corrupted.  The VM can be re-registered by removing and re-adding it to the inventory, but the re-add may fail if the wrong files are corrupted. To investigate corruption further...
 
* To test whether the ESX should be able to lock the VM's files use <code> touch </code>. Within the VM's directory, do <code> touch *.vswp </code>
** If success, retry power on
** If <code> device or resource busy </code> then the VM is probably owned by another ESX - find that ESX!
** If <code> Invalid argument </code> then the file can't be accessed (eg corrupt or other storage problem)
* It's also worth doing a <code> touch </code> on the following files; if they are accessible then the VM may be recoverable.  To work around the <code> .vswp </code> issue, remove the reference to the file in the <code> .vmx </code> config file
** <code> touch *.vmx </code>
** <code> touch *flat.vmdk </code>
** <code> touch *delta.vmdk </code>
** <code> touch vmware.log </code>
 
For further info see - [http://kb.vmware.com/kb/10051 VMware KB10051 - Virtual machine does not power on because of missing or locked files]
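
The <code> touch </code> checks above can be sketched in Python as a rough illustration. This is only a sketch: it assumes the two error strings correspond to the standard <code>EBUSY</code> and <code>EINVAL</code> error numbers, and the function name <code>classify_touch</code> is made up for this example rather than being part of any VMware tool.

```python
import errno
import os


def classify_touch(path):
    """Update a file's timestamps (as touch does on an existing file)
    and map the outcome to the advice given in the list above."""
    try:
        os.utime(path, None)  # same effect as touching an existing file
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            # "Device or resource busy"
            return "locked: probably owned by another ESX"
        if exc.errno == errno.EINVAL:
            # "Invalid argument"
            return "inaccessible: corrupt file or other storage problem"
        return "error: %s" % exc
    return "ok: retry power on"


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, "->", classify_touch(name))
```

Run it from within the VM's directory, passing the files of interest (eg the <code> .vswp </code> and <code> .vmx </code> files) as arguments.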
 
==== Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on... ====
'''Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on.  Reason: The parent virtual disk has been modified since the child was created'''
 
* The ESX can't work out the chain of vmdk's that make up the VM's disks, most likely because
** Snapshot CID chain is corrupted
 
# You need to establish the chain of files, start by looking at the <code> vmx </code> file to work out the top <code> vmdk</code>, then track back through them until you get to the base disk.
#* Any <code> vmdk </code> files not referenced in this chain are erroneous and can be deleted (or better, moved to a temporary sub-folder)
#* Any <code> delta </code> file <= 16MB is effectively empty and can be skipped
# Now display the CID's stored and then work out their correct order
#* <code> grep CID My-VM.vmdk My-VM-00000[1-9].vmdk </code>
# You then need to edit the <code> vmdk </code> files to correct the CID chain
# Start the VM and confirm it's working as expected
# Create a new temporary snapshot, then remove it to clear them up
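
As an illustration of what a correct CID chain looks like, here is a minimal Python sketch. It assumes each descriptor <code> vmdk </code> contains a <code>CID=</code> line and a <code>parentCID=</code> line, and that each child's <code>parentCID</code> should equal its parent's <code>CID</code>. The function names and the sample CID values are invented for the example.

```python
import re


def read_cids(descriptor_text):
    """Return (CID, parentCID) from the text of a vmdk descriptor file."""
    cid = re.search(r"^CID=(\w+)", descriptor_text, re.MULTILINE)
    parent = re.search(r"^parentCID=(\w+)", descriptor_text, re.MULTILINE)
    if cid is None or parent is None:
        raise ValueError("descriptor has no CID/parentCID lines")
    return cid.group(1), parent.group(1)


def broken_links(chain):
    """chain is a list of (file name, descriptor text), base disk first.

    Returns the names of the disks whose parentCID does not match the
    CID of the disk immediately before them in the chain."""
    broken = []
    for (_, parent_text), (child_name, child_text) in zip(chain, chain[1:]):
        parent_cid, _ = read_cids(parent_text)
        _, child_parent_cid = read_cids(child_text)
        if child_parent_cid != parent_cid:
            broken.append(child_name)
    return broken
```

Any disk reported by <code>broken_links</code> is a candidate for having its <code>parentCID</code> corrected by hand; take copies of the descriptor files before editing them.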
 
==== General system error occurred... ====
'''A general system error occurred: The system returned an error. Communication with the virtual machine might have been interrupted.'''
 
* This error generally seems to occur when the ESX is having trouble launching the VM's processes, sometimes because it's having trouble reading the VM's VMX file.
** If the problem is erratically affecting one or more VM's, it's likely that the ESX's hostd process is struggling a bit - in which case restart the ESX management agents
** If the problem is continually affecting one (or possibly more) VM's, the VM('s) config file may be corrupted, or storage may be experiencing problems.
 
=== Can't Stop / Power-Off a VM ===
This normally occurs because you've lost management (VI Client) access to the ESX, or the ESX doesn't appear to be aware that it's running the VM, but it is (so the VM appears ''Inaccessible'' via the VI Client).  If you have access to the VM via the VI Client but can't power off, it'll probably be a permissions issue.  There is no way to gracefully shut down a VM without access via the VI Client (or direct access to the VM via RDP, VNC, etc).
 
# SSH to the ESX you believe the VM is running on
# Find the path to the VM's config file
#* EG <code> vmware-cmd -l | grep VM_Name </code>
#* If the VM is not listed, the VM isn't registered to that ESX
# Instruct the ESX to power off the VM using the VMX path already found
#* EG <code> vmware-cmd /path/to/VM_Name.vmx stop </code>
 
If the above fails, you'll need to get a bit more forceful...
# Find the PID of the VM
#* EG <code> ps -auxwww | grep VM_Name </code>
# Kill the VM using the PID found ''(make sure you've got the right PID, you could kill the ESX by mistake!)''
#* EG <code> kill -9 1234 </code>
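
Given the warning about killing the wrong PID, a small helper that refuses to guess can be a useful safety net. The sketch below is Python and assumes the usual eleven-column <code> ps aux </code> layout (USER, PID, ... COMMAND); the function name and the sample process lines in the usage note are hypothetical.

```python
def find_vm_pid(ps_output, vm_name):
    """Return the single PID whose command line mentions vm_name.

    Raises ValueError when there is no match or more than one, so an
    ambiguous result never reaches kill."""
    matches = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # the 11th field is the full command
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unexpected
        command = fields[10]
        if vm_name in command and "grep" not in command:
            matches.append(int(fields[1]))
    if len(matches) != 1:
        raise ValueError("expected exactly one match, found %d" % len(matches))
    return matches[0]
```

Feed it the text captured from the <code> ps </code> command above; only pass the result on to <code> kill </code> once you've eyeballed the matching line yourself.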
 
=== VM is Powered On, but appears Powered Off ===
The VM responds to ping and RDP/VNC/SSH etc (as appropriate) but is showing as down in the VI Client.  Also see [[#Confirm VM's Status on ESX|Confirm VM's Status on ESX]]
 
# Restart the management agents on the ESX and recheck
 
If that doesn't improve matters...
# Find the location of the vmx file for the VM (so it can be re-added to the inventory)
# Connect a VI Client to the ESX and unregister the VM (remove from inventory)
# Restart the management agents on the ESX
# Re-add the VM to the inventory
 
If running ESX4i see [http://kb.vmware.com/kb/1033591 VMware KB 1033591 - Virtual machine appears powered off after restarting the management services on the host], but note that...
* vMotion all powered-on VM's off the affected ESX first
* Recover 1 VM at a time, and vMotion it off as soon as it is recovered (it may disappear when recovering the next VM)
* Recovered VM's may end up with a state of ''Unknown'' on vCentre and ESX, in which case, remove from ESX inventory and re-add
* Restart the ESX once all recovered


=== Can't VMotion a VM ===
'''VM is connected to CD-ROM/ISO'''
* VM's CD-ROM is connecting to an ISO file via the host ESX, tying it to that ESX
=== Can't Increase a VM's Disk ===
'''A general system error occurred: Internal error'''
* Can be caused by existing snapshots running on a VM
* Check the ESX logs / available disk space etc


=== Can't Snapshot ===
** Sysprep can't customise the machine because it doesn't have administrator rights, this can occur where a DC's users have been offloaded to [[Acronyms#L|LDS]]


=== VMTools Automatic Cursor Release Not Working ===
The console automatic cursor release (which allows you to seamlessly switch focus from a VM console to your desktop by moving your mouse, avoiding having to use CTRL+ALT) sometimes doesn't work.  It seems to be more common with VM's deployed from templates/cloned from VM's.
To resolve...
# Uninstall VM Tools
# Reboot
# Install VM Tools
# Reboot
=== Confirm VM's Status on ESX ===
The following commands take you through confirming the status of a VM, as determined by the ESX
# Get list of VM's registered to ESX to check the ESX believes it's hosting the VM
#* <code> vm-support -x </code>
# Get the VM's ID (Vmid)
#* <code> vim-cmd vmsvc/getallvms | grep <VM name> </code>
# Get the state of VM (as the ESX believes)
#* <code> vim-cmd vmsvc/power.getstate <vmid> </code>
# Check if the ESX has any running processes for the VM (in which case it's powered on, regardless of the above)
#* <code> ps | grep <VM name></code>
To check that a VM is being locked by the ESX you're on
# Get the lock info for the VM's disk (use the 1st if there's numerous)
#* <code> vmkfstools -D <VM-name>-flat.vmdk </code>
# Pick out the MAC address from the lock info (78e7d192a548 in example below)
# List the NIC info for the ESX
#* <code> esxcfg-vmknic -l </code>
# If the MAC address from the lock info matches one of the ESX's NICs, the lock is held by the ESX you're on


 Lock [type 10c00001 offset 72968192 v 470, hb offset 3985408
 gen 583, mode 1, owner 4d2dcc7b-20fb6d90-2b80-'''78e7d192a548''' mtime 25711553]
 Addr <4, 151, 197>, gen 299, links 1, type reg, flags 0, uid 0, gid 0, mode 600
 len 37580963840, nb 17688 tbz 0, cow 0, zla 3, bs 2097152
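
Picking the MAC address out of the lock info by eye is error prone, so here is a rough Python sketch of the same step. It assumes the owner field always has the 8-8-4-12 hex layout seen in the example above, with the last group being the MAC; the function names are invented for the illustration.

```python
import re

_OWNER = re.compile(
    r"owner\s+[0-9a-f]{8}-[0-9a-f]{8}-[0-9a-f]{4}-([0-9a-f]{12})",
    re.IGNORECASE,
)


def lock_owner_mac(lock_info):
    """Return the MAC address (aa:bb:cc:dd:ee:ff) from the owner field
    of the lock info text, or None if no owner field is found."""
    match = _OWNER.search(lock_info)
    if match is None:
        return None
    raw = match.group(1).lower()
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def held_by_this_host(lock_info, host_macs):
    """True if the lock owner's MAC is one of this host's NIC MACs."""
    owner = lock_owner_mac(lock_info)
    return owner is not None and owner in {mac.lower() for mac in host_macs}
```

For the example lock info this gives <code>78:e7:d1:92:a5:48</code>, which can then be compared against the MAC addresses in the NIC listing.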

Latest revision as of 13:32, 26 September 2016

This page is now deprecated, and is no longer being updated.
The page was becoming too large - all content from this page, and newer updates, can be found via the Category page link below.

This page and its contents will not be deleted.

See Virtual Machine

Basic Virtual Machine Tasks

Start / Stop / Bounce a VM

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want to affect
  4. Either right-click or use the commands in the right hand pane to Power off, Power on, Reset as required

This is the same as using the Power or Reset buttons on the front of a physical server. It's possible to send Windows shut down etc commands to the VM; right click over the VM and select the appropriate Shut Down Guest or Restart Guest command. This tells VM Tools to attempt to perform the required action, though open applications etc can inhibit the successful shutdown of an OS.


Remote Console (KVM like) Access

If possible, it's preferable to use normal remote access software (eg RDP, or VNC). This ensures that load caused by remote access is contained within the VM, rather than the ESX.

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want and either right click Open Console or use the Open Console command in the right hand pane


CD-ROM Access

There are essentially two ways to present a CD-ROM image to a VM; using an ISO image is by far and away the most flexible. Even if you only have a physical CD and expect to use it once, it's still recommended that you create an ISO image from the CD and use that instead. The alternative is to put the physical media into the ESX hosting the VM (use Host Device when adding the CD to the VM).

To present an ISO image to a VM

  1. If it's not already there, copy the ISO image to an NFS share or other ESX accessible datastore
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to attach the ISO image to
  5. Right-click and select Edit Settings...
  6. Highlight the CD/DVD Drive, and select the Datastore ISO file
  7. Hit Browse and go into the appropriate datastore
  8. Select the required ISO file
  9. Tick the Connected check box
  10. Hit OK, the ISO will be attached to the VM's CDROM drive as if you'd inserted a CD into a physical drive
  • Once you've finished using the ISO, go back into the VM's settings and untick the Connected check box
  • To boot a VM to a CDROM ISO, check the "Connected at power on" checkbox and restart the VM's OS

To create an ISO image

You'll need to download an ISO creator, there are many freeware utilities available, however one that's tried and tested is ISORecorder. Generally you can create ISO images from both a physical CD, or just the contents of a folder (if you have ISORecorder installed, right-click over the disk or folder and select "Create ISO image")

Change Network Connection

In similar fashion to being able to swap over a network cable for a physical server, the network connection of a virtual machine can be changed on the fly

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want to change the network connection on
  4. Right-click and select Edit Settings...
  5. Highlight the appropriate Network Adapter, and select the new Network Connection
  6. Change takes effect as soon as OK is hit

Add an Additional Network Connection

When adding additional network connections to any system you must consider network security, for example no system should ever be given access to both Private and Public networks.

  1. Shut down the Application and OS of the virtual machine
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to add the network connection to
  5. Right-click and select Edit Settings...
  6. Hit the Add... button and select Ethernet Adapter, and hit Next
  7. Select the appropriate network connection and hit Next, and then Finish
  8. Power on the virtual machine

Change Physical Memory / CPU's Allocation

  1. Shut down the Application and OS of the virtual machine
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to change the memory / CPU allocation on
  5. Right-click and select Edit Settings...
  6. Highlight the appropriate setting, Memory or CPUs, and edit as required.
  7. Apply the change by hitting OK
  8. Power on the virtual machine

Install/Upgrade VM Tools on Citrix/Terminal Services VM

In a Terminal Services VM you can't install VM Tools in the normal automated fashion.

  1. Go into install mode
    • change user /install
  2. Install/upgrade VM Tools as normal
  3. Reboot; if you don't reboot you must go back into normal exec mode
    • change user /exec

Config Settings

Performance Settings

Additionally, see the following resources

Windows 2003

In order to be able to squeeze the most out of a virtual infrastructure it's imperative that VM's are configured for best performance.

  • System Performance - Set the OS for best performance
    • Go to System Properties | Advanced
      • Visual Effects - Disable/Adjust for best performance
      • Advanced - Tailor for the application/service the server is running
      • Virtual memory/page file - Should be set to a fixed size, and ideally as small as possible so as to conserve disk space usage. You need to know what you're doing when you set this. Do you expect your systems to be starved of memory and need to page? Personally for a VM with 4GB RAM assigned, I'd keep the page file down to 1GB in size.

Windows XP

The following changes are performance based, intended to ensure XP isn't unnecessarily wasteful

  1. Display simplifications
    • Right-click over the desktop and go to Properties
      1. Switch to Classic theme
      2. Disable screen-saver
      3. Disable monitor power save
  2. Disable Indexing Services
    • Go to Control Panel | Add or Remove Programs | Add/Remove Windows Components and uncheck Index services
  3. Disable paging of the executive
    • HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management set DisablePagingExecutive to 1
  4. Set for best performance
    • My Computer | Properties go to Advanced Tab, Performance Settings, and Adjust for best performance
  5. Disable all sounds
  6. Defragment prefetch
    • defrag c: -b
  7. Turn off disk perf counters
    • diskperf -n

Windows 7

  1. Set for best performance
    • Go to Control Panel\All Control Panel Items\Performance Information and Tools, then Adjust visual effects, and select Adjust for best performance
  2. Disable IPv6 on NIC
  3. Disable sounds
  4. Change power plan to High Performance and change Turn off the display to Never
  5. Disable the following services
    • BitLocker Drive Encryption Service
    • Block Level Backup Engine Service
    • Desktop Window Manager Session Manager
    • Diagnostic Policy Service
    • HomeGroup Listener
    • HomeGroup Provider
    • IP Helper
    • Microsoft iSCSI Initiator Service
    • Microsoft Software Shadow Copy Provider
    • Secure Socket Tunneling Protocol Service
    • Security Center
    • SSDP Discovery
    • Superfetch
    • Tablet PC Input Service
    • Themes
    • UPnP Device Host
    • Volume Shadow Copy
    • Windows Backup
    • Windows Firewall
    • Windows Media Center Receiver Service
    • Windows Media Center Scheduler Service
    • Windows Search
    • WLAN AutoConfig

Disable Shutdown Event Tracker

If the ESX servers are running as an HA cluster then the VM's hosted on them MUST be able to start up fully and automatically after a reboot. The Windows Shutdown Event Tracker asks why you're shutting down or rebooting a system, and following an unexpected shutdown it halts the startup of the system pending information from the user. This isn't a problem for servers where all applications run as a service, but where (GUI) applications need to start it stops systems being restarted fully, which impedes VMware HA operating effectively.

To disable...

  1. Start Group Policy Object Editor (Start | Run | gpedit.msc)
  2. Go to Computer Configuration\Administrative Templates\System
  3. Set Display Shutdown Event Tracker to Disabled

Set Low Risk File Types

If mapped drives are being used, .bat and .exe files need to be declared as low risk file types to stop Open file - Security Warning prompts being displayed when trying to run from mapped drives. This is particularly a problem if software is set to auto-start by placing shortcuts in the StartUp directory, as the software won't auto start.

To disable...

  1. Start Group Policy Object Editor (Start | Run | gpedit.msc)
  2. Go to User Configuration\Administrative Templates\Windows Components\Attachment Manager
  3. Set the "Default risk level for file types" to Enabled
  4. Specify the low extensions as .bat;.exe

Console Clipboard Integration

In the good old days this was always enabled by default, so you could copy and paste between your desktop and VM's console. Since ESX 4.1, it's been disabled by default. There are two ways to enable it again

  • Enable for VM - Permanent, requires VM downtime
    1. Shutdown the VM
    2. Go to Edit Settings, Options tab, then Advanced / General and hit Configuration Parameters
    3. Use Add Row to add the following new config items
      • Set isolation.tools.copy.disable to false
      • Set isolation.tools.paste.disable to false
    4. Restart the VM
  • Enable for VM's on an ESX - Temporary (won't survive ESX software upgrade), VM's can be vMotioned on to allow clipboard integration funkiness
    1. SSH to the ESX and then edit the /etc/vmware/config file, appending the following...
      • isolation.tools.copy.disable="FALSE"
      • isolation.tools.paste.disable="FALSE"
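
If you need to make the ESX-wide change on several hosts, editing by hand gets tedious. The Python sketch below shows one way the append could be made repeatable, so that running it twice doesn't duplicate the lines. It assumes the config file is a simple list of key="value" lines, as in the entries above; the function name <code>ensure_settings</code> is made up for this example.

```python
def ensure_settings(config_text, settings):
    """Return config_text with each key in settings set to the given value.

    Existing lines for those keys are rewritten in place, missing keys are
    appended at the end, and all other lines are left untouched."""
    remaining = dict(settings)
    out = []
    for line in config_text.splitlines():
        key = line.split("=", 1)[0].strip() if "=" in line else None
        if key in remaining:
            out.append('%s="%s"' % (key, remaining.pop(key)))
        else:
            out.append(line)
    for key, value in remaining.items():
        out.append('%s="%s"' % (key, value))
    return "\n".join(out) + "\n"


CLIPBOARD_SETTINGS = {
    "isolation.tools.copy.disable": "FALSE",
    "isolation.tools.paste.disable": "FALSE",
}
```

Read the existing file, pass its text through <code>ensure_settings</code> with <code>CLIPBOARD_SETTINGS</code>, and write the result back (keeping a backup copy of the original).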

Disable Balloon Driver

You should not disable the balloon driver - there are generally other ways to achieve whatever it is you're trying to achieve

Certain applications (such as MS SQL and Java) can lock the amount of 'physical' memory they want, and stop the OS being able to page it out to disk, in order to protect that application's performance. However, if a virtual environment becomes memory constrained, VMTools may attempt to force the OS to page memory to disk, but the OS will be unable to do so, and so will page itself out to disk, impacting the entire machine's performance.

By using memory reservations it's possible to ring-fence physical RAM for a machine, meaning that if contention for memory occurs, the ESX will prioritise on other virtual machines 1st in order to reduce overall physical memory usage. Ideally, some dynamic memory should still be left, eg allocate a VM with 4 GB of RAM with 3 GB reserved, 3GB being enough to allow the OS and memory locking application to stay in RAM, with an additional 1GB available to be used should there be available capacity.

To disable the balloon driver

  1. Shutdown the VM
  2. Go to Edit Settings, Options tab, then Advanced / General and hit Configuration Parameters
  3. Use Add Row to add a new config item
  4. Set sched.mem.maxmemctl to 0
  5. Restart the VM

sched.mem.maxmemctl configures the amount of memory (in MB) that the balloon driver can expand to in order to conserve ESX RAM usage.

Files Information

File      Purpose             Notes
*.vmx     VM config           Contains the full config of the virtual machine
*.vmsd    Snapshot metadata
*.vmsn    Snapshot state      Stores delta info that wouldn't be otherwise written to VMDK file (eg Power State, Hardware changes, etc)
*.vmss    Suspend state       Created when a VM is suspended to contain its physical memory contents
*.vmxf    Team config
*.vmdk    Virtual hard-drive
*.nvram   vBIOS               Can be deleted, gets recreated on VM start (BIOS settings will be defaulted)
*.vswp    VM memory swap      Can be deleted, but you need to remove the reference to it from the VMX file
*.hlog    VMotion helper      Can be deleted, as long as there's not a VMotion in progress

Procedures

Creating MSCS Machines

There are various different configuration options for creating MS clusters. For production standard clusters you must use RDM, otherwise you're forced to host both your VM's on the same ESX (and you can't vMotion them!).

  • You can't snapshot disks that are configured to use SCSI Bus Sharing.
Disk Type    SCSI Bus Sharing  Cluster in a Box      Cluster Across Boxes  Virtual/Physical Hybrid
                               2 VM's / 1 fixed ESX  2 VM's / multi ESX    1 VM + 1 Physical
Normal VMDK  Virtual           Yes                   No                    No
RDM          Virtual           Yes                   Yes - Win2k3 only     No
RDM          Physical          No                    Yes                   Yes
  • Windows 2003 servers must use LSI Logic Parallel SCSI controller for shared disks
  • Windows 2008 servers must use LSI Logic SAS SCSI controller for shared disks

See the following for further info

Cluster in a Box

Procedure assumes you're creating a VM cluster using 2 VM's on the same server sharing standard VMDK disks (not RDM's)

  1. Create a new private vSwitch on the ESX to host the VM's called "MSCS Heartbeat"
  2. Create two VM's...
    • Don't create the shared disks yet
    1. With 2 NIC's, 1st attached to normal (externally accessible) network, 2nd attached to private "MSCS Heartbeat" network
    2. Boot up and ensure they're working as expected (not as a cluster yet), then shutdown
  3. On the 1st VM...
    1. Create the required shared disks on a new SCSI Bus ID
      • Tick Support clustering features such as Fault Tolerance, which ensures the disks are created in eagerzeroedthick format
      • Select a new SCSI Bus ID in the Virtual Device Node drop-down box, which creates the disks on a new SCSI Controller
    2. Change the new SCSI Controller's config...
      • SCSI Controller Type should be set to LSI Logic SAS (Win2k8) or LSI Logic Parallel (Win2k3)
      • SCSI Bus Sharing should be set to Virtual
  4. On the 2nd VM...
    1. Create the required shared disks on a new SCSI Bus ID, selecting Use an existing virtual disk (you'll need to locate the shared disks already created)
    2. Change the SCSI Bus Sharing mode of the new SCSI Controller to Virtual
  5. Boot the VM's up, disks should be visible from Disk Management from both VM's
    • Only format from one machine, NTFS doesn't support access from more than one host, MSCS needs to manage volume access/ownership

Increase Disk Size

Increasing the virtual disk size provided to a VM is straightforward (though be aware that snapshots need to be deleted 1st, if any exist)...

  1. Go into the VM's settings
  2. Increase the size of the disk and apply
  3. Within the VM's OS, rescan the disk, and the new space will be visible

The trick is to extend the logical partition within the OS. Depending on the original partition type and the OS, the options vary.

In case of problems, see - Can't Increase a VM's Disk, or Extend Partition Fails

Increase Logical Partition

Generally boot or system disks cannot be extended whilst the OS is up, whereas normal data disks can be in later OS's, but this is still not ideal. It's generally most reliable to plan for system down time, and use a utility to extend the partition whilst it's offline. Especially in a virtual environment there is no excuse for not making a backup of the partition 1st.

For Windows 2008 machines this isn't a problem.

For Windows 2003 machines...

Partition  Type     Options
System     Either   Cannot be extended
Data       Basic    Cannot be extended, can convert to Dynamic, but this will require at least a brief IO interruption, but up to two reboots!
Data       Dynamic  Can be extended on the fly, but a new volume is tagged onto the end of the existing partition to create a larger one made up of two volumes

Download a copy of the GParted Live CD - http://gparted.sourceforge.net/livecd.php, this will need to be booted to by the VM

  • Note: There is a bug in some recent versions of GParted (v0.5.0-3 and v0.5.1-1 are known to have issues, v0.4.6-1 is known to work), whereby the boot fails with the following error
    • Unable to find a medium containing a live file system
  1. Increase the relevant VMDK size through the VM's options
  2. Start snapshotting (or take a full backup of the machine)
  3. Attach GParted ISO to VM and restart
    • If VM doesn't boot to the ISO, force the VM to boot to BIOS (Options | Advanced | Boot Options in VM Settings) and change the VM's boot order
  4. Boot into GParted Live (accepting the default options, except setting language to English UK)
  5. Once in GParted, follow the interface, and apply changes to action
  6. Restart VM and verify all is good
  7. Turn off snapshotting

VM's With Lots Of Disks

It can be very difficult to identify the correct disk within VMware to increase when a VM has a large number of VMDK's.

  • Disk numbering behaves differently, with Windows starting at Disk 0, and VMware starting at Disk 1
  • SCSI ID's will match, but Windows SCSI bus numbers are normally 0, whereas VMware bus numbers will increment (so VM disk 35 (Win disk 34), could be 2:4 in VMware, but 0:4 within the OS)
  • Disk size can be a useful method of validation (if differing disk sizes are used)
  • Windows drive letters are useless, never assume D: is disk 2 for example
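
The numbering rules above are easy to get wrong under pressure, so here they are as a tiny Python sketch. It assumes only what the bullets say: Windows disk numbers are one lower than VMware's, and the SCSI ID (the part after the colon) matches while the bus number may not. The function names are invented for the example.

```python
def windows_to_vmware_disk(windows_disk_number):
    """Windows counts disks from 0, VMware from 1."""
    return windows_disk_number + 1


def scsi_ids_match(vmware_node, windows_node):
    """Compare only the SCSI ID part of 'bus:id' strings, since the
    bus number normally differs between VMware and Windows."""
    return vmware_node.split(":")[1] == windows_node.split(":")[1]
```

So Windows disk 34 should be VM disk 35, and a VMware node of 2:4 is consistent with 0:4 inside the OS. Treat a match as a hint only, and confirm with the disk size where you can.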

Rename a VM

Renaming a virtual machine just by right-clicking over the machine and renaming does not alter the underlying file and folder names.

This is much easier than it was since the advent of storage vMotion. Bear in mind that the vCenter uses the VM's current name in order to determine the naming of its files/folders at its destination, so to completely rename a VM...

  1. Rename the VM within its OS and restart
  2. Rename the VM within vCentre
  3. Storage vMotion the VM (can be moved back after)


Legacy methods...

To ensure that these changes take place you must move the VM to another datastore, ie

  1. Shutdown the VM
  2. Rename the VM in vCenter
  3. Migrate the VM and move it to another Datastore
  4. Restart the VM

If you can't move the VM to another datastore then it gets much more complicated, requiring faffing around in the service console.

  1. Shutdown the VM
  2. vmware-cmd -s unregister /vmfs/volumes/datastore/vm-old/vm-old.vmx
  3. mv /vmfs/volumes/datastore/vm-old /vmfs/volumes/datastore/vm-new
  4. cd /vmfs/volumes/datastore/vm-new
  5. vmkfstools -E vm-old.vmdk vm-new.vmdk
  6. find . -name '*.vmx*' -print -exec sed -i -e 's/vm-old/vm-new/g' {} \;
  7. For every file that hasn’t been renamed (.vmsd etc.) mv vm-old.vmx vm-new.vmx
  8. vmware-cmd -s register /vmfs/volumes/datastore/vm-new/vm-new.vmx

The above was taken from http://www.yellow-bricks.com/2008/02/10/howto-rename-a-vm/

VMware have now published an article on this VM KB 1029513
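
To preview what the manual rename would touch before doing it for real, something like the following Python sketch can help. It is only an illustration of the steps above: it assumes the <code>.vmdk</code> files are renamed by <code>vmkfstools -E</code> (so it leaves them out of the plan), and the function names are hypothetical.

```python
def rename_plan(filenames, old, new):
    """Return (old name, new name) pairs for files that still carry the
    old VM name. Files ending in .vmdk are skipped, as those are
    renamed with vmkfstools -E rather than mv."""
    plan = []
    for name in filenames:
        if name.startswith(old) and not name.endswith(".vmdk"):
            plan.append((name, new + name[len(old):]))
    return plan


def rewrite_vmx(vmx_text, old, new):
    """Replace every reference to the old VM name in the vmx text."""
    return vmx_text.replace(old, new)
```

Print the plan and the rewritten config first and check them by eye; nothing here changes any files.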

Clone a VM

This can be done as

  • Hot clone - Source VM is left running, its disks are quiesced, and cloned. Can cause problems as new machine behaves as if it was ungracefully shutdown when first started, but normally successful. Source machine needs to be relatively quiet.
  • Cold clone - Source VM is shutdown 1st, preferable to a hot clone if possible.

Snapshots and Cloning

Snapshots are deleted during a clone, in that cloning a machine that has existing snapshots results in the post-snapshot changes being merged into the new machine.

In order to retain the snapshots, the virtual machine needs to be cloned manually (untested procedure!!)...

  1. Copy all of the VMs files into a new directory (using vmkfstools --nosparse option).
  2. Correct the .vmx file to match new paths, update VM name, and delete the UUID line (VMware will prompt to generate a new one when the VM is started).
  3. Register the new VM in vCentre and double check the VM is as expected.
  4. Power on (you'll get an IP conflict if it's on the same portgroup as the original)

Shutdown VM via Service Console

  • To determine state of a Virtual Machine running from the local ESX
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx getstate
    • getstate() = on
  • Shutdown a Virtual Machine running from the local ESX gracefully
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop trysoft
    • stop(trysoft) = 1
  • Shutdown a Virtual Machine running from the local ESX forcefully
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop hard
    • stop(hard) = 1

The vmware-cmd command isn't available in ESXi, though it is available via the RCLI, in the following format...

  • vmware-cmd.pl /path/to/My_VM.vmx start --server MyESX --username root --password "RootPassword"
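
If you're scripting around these commands, the result lines shown above are simple to pick apart. The Python sketch below assumes the output always has the form operation(arguments) = value, as in the examples; the function name <code>parse_result</code> is invented for the illustration.

```python
import re

_RESULT = re.compile(r"^(\w+)\((.*?)\)\s*=\s*(.+)$")


def parse_result(line):
    """Split a result line such as 'getstate() = on' into
    (operation, arguments, value)."""
    match = _RESULT.match(line.strip())
    if match is None:
        raise ValueError("unrecognised result line: %r" % line)
    return match.group(1), match.group(2), match.group(3)
```

A script could, for example, only attempt a hard stop once the parsed value of getstate is still on after a trysoft.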

Upgrade ESX3 to ESX4

Preparation

  • Clean up the VM
    1. Stop any snapshots, and ensure there's no remnant snapshot files (*.vmsd, *-0000x.vmdk, *-delta.vmdk)
    2. No CD/floppy file attached
  • Clean up the guest OS
    1. Delete unnecessary files
    2. Ensure VM Tools is up to date
    3. Perform a reboot (without any changes)
    4. Check logs to ensure machine started without any significant errors
  • Record IP settings (they will get lost!)
    1. ipconfig /all
    2. route print if there might be static/persistent routes
  • Ensure you know the machines admin account (inc domain if on domain)
  • Shut the VM down

Procedure

Procedure assumes you're migrating machines from a VI3 infrastructure to a new VI4/vSphere infrastructure. Note that you can use VMware Converter to copy machines between vCentre's if preferred.

  1. Export machine as a Virtual Appliance from VI3 infrastructure
  2. Import machine into new vSphere infrastructure
    • In the VI Client, select the VM and go to File | Deploy OVF Template..., and select the appropriate options in the resulting wizard
  3. Take a snapshot (if you make an irreversible mistake it's quicker to revert to snapshot than reimport)
  4. Check the VM's settings, particularly Guest OS (which sometimes gets set to Other)
  5. Start the virtual machine, update VM Tools then shutdown
  6. Upgrade the virtual hardware
    • Right-click and select Upgrade Virtual Hardware
  7. Upgrade the network adapter to VMXNET3
    • Remove existing network adapters (note the networks they're connected to!), then add the same quota of VMXNET3 adapters (connected to the same networks in the same order)
  8. Upgrade the SCSI controller - part 1 (if required)
    • Add a new temporary disk, on the next bus (eg SCSI node 1:x)
    • Then change the new SCSI controller type to VMware Paravirtual
  9. Restore network config
    • Restart VM, and re-apply recorded network config (answer Yes when asked whether to remove duplicate config on non-existent adapter)
  10. Upgrade the SCSI controller - part 2 (if required)
    • Shutdown the VM, and remove the temporary disk added, and change the original SCSI controller to VMware Paravirtual (the other controller will automatically get removed from the config)
    • Restart the machine.
  11. Delete/Commit the snapshot

Windows 2008 Install

Use VMXNET3 network adapter. Only use Paravirtual SCSI interface if you're running at least ESX v4.1. You need to boot with the drivers on a floppy http://www.virtualinsanity.com/index.php/2009/12/01/more-bang-for-your-buck-with-pvscsi-part-2/

Convert Hardware v7 to v3

You'll need to download and install VMware Converter Standalone if you haven't already got it installed (it's free). The local installation will suffice (client-server not required). It's possible that you could get away with using the inbuilt vCentre version as long as you're not trying to import a v7 VM on ESX4 to a v4 VM on ESX3 (which you probably are!).

  1. Start up VMware Converter
  2. Hit the Convert machine button, top left
  3. On the resultant Source System page...
    • Change Select source type: to VMware Infrastructure virtual machine
    • Enter login details for the vCentre or ESX your v7 VM is on
  4. On the Source Machine page locate your v7 VM
  5. On the Destination System page, enter the login details of the vCentre or ESX where you want your v4 VM to be
  6. On the Destination Virtual Machine page, edit the VM name (if required) and select the destination folder
    • If you're migrating to a vSphere VC/ESX you must set the Virtual Machine Version on this page
  7. On the Destination Location page select an appropriate datastore
  8. On the Options page you can make any tweaks you might want to
  9. Finally confirm
  10. Once the machine is imported, boot it up to ensure all is OK
    • VM Tools will need to be explicitly uninstalled and then reinstalled
    • Especially if the VM is a template, the OS may want to adjust its drivers given that it'll be running on different (virtual) hardware

Troubleshooting

See also Virtual Centre Troubleshooting

If all else fails you can always raise a VMware Service Request

Can't Connect to VM Console

Error connecting: Cannot connect to host... or Can't connect to MKS...

  • This is caused by a TCP connection failure to the ESX server the VM is hosted on. Using telnet or a port test utility, confirm you can connect on both TCP 902 and 443 from your machine to the ESX server.
  • If the problem is affecting a single ESX that previously worked, restart the management services on that ESX
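
If your workstation has bash available, the port test above can be sketched as follows. This is an illustration rather than a definitive tool: the function name check_port and the host name are placeholders, and it assumes both bash and the coreutils timeout command are installed.

```shell
# check_port HOST PORT
# Attempts a TCP connection using bash's built-in /dev/tcp redirection,
# giving up after 5 seconds. Requires bash and the coreutils "timeout" command.
check_port() {
    if timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 reachable"
    else
        echo "$1:$2 NOT reachable"
    fi
}

# Example, run from your own machine (esx01.example.com is a placeholder):
# check_port esx01.example.com 902
# check_port esx01.example.com 443
```

Both ports need to report as reachable for the console to connect.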

Can't Deploy VM

The VirtualCenter server is unable to decrypt passwords stored in the customization specification

  • Bizarrely caused by the Virtual Centre running out of disk space, free up some space and all will be well.

A general system error occurred: Failed to create journal file provider

  • Check ESX disks are not full
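
A quick way to spot a full partition is to filter the output of df. The sketch below assumes the standard POSIX df -P column layout and mount points without spaces; the function name full_filesystems and the 90% default threshold are arbitrary choices for illustration.

```shell
# full_filesystems [THRESHOLD]
# Reads "df -P" output on stdin and prints the mount point and usage of every
# filesystem at or above THRESHOLD percent used (default 90).
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 {
        used = $5; sub(/%/, "", used)
        if (used + 0 >= limit + 0) print $6, $5
    }'
}

# Example, run on the ESX service console:
# df -P | full_filesystems 90
```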

Customization of the guest operating system 'winLonghornGuest' is not supported in this configuration. Microsoft Vista (TM) and Linux guests with Logical Volume Manager are supported only for recent ESX host and VMware Tools versions.

  • Caused by you trying to deploy a guest customised Windows 2008 template, where the OS of the source template is set to Windows 2008(!). Essentially Win2008 is only barely supported in ESX3.5. Setting the source machine to Vista should resolve this issue.
  • With Windows 2008 R2 templates the above fix has been seen to not work, in which case
    1. Deploy a clone (with no guest customisation)
    2. Perform a Sysprep

Can't Start VM

HA Admission Control

  • Can't start VM as doing so wouldn't leave enough failover capacity in order to be able to restart failed VM's should an ESX fail. Options are to
    • Reduce resource usage of VM's that are already running
    • Increase cluster capacity
    • Reduce the cluster's failover capacity, or allow constraints violations
  • If no VM's have been recently added to the cluster, it's likely that the HA agent on one of the ESX's has stopped functioning, in which case, within the cluster, one of the ESX's will have a red warning/exclamation triangle. If so you can restart HA on that ESX;
    1. Highlight this ESX, on the Summary tab you should see a notice regarding HA problems
    2. Run the Reconfigure for HA command, this will re-install the HA agent on the ESX

Failed to relocate virtual machine

  • DRS is attempting to relocate a VM at power up, and this relocation is failing
    • Reattempt to power on machine
    • Manually migrate to a less loaded ESX and reattempt power on

Access to VMFS storage

  • ESX may have lost connectivity to VMFS partition on which VM resides

VMFS full

  • If VMFS is full, the ESX won't be able to write to the VM's logs when it starts it up, causing VM start-up to fail

ESX licensing

  • Either ESX isn't licensed, or has lost contact with the license server (VI3) for a long period of time

Waiting for question to be answered

  • Generally after changes (such as cold migrations or new deployments), a VM may need to have a question answered before it can continue to power on

Could not power on VM: No swap file. Failed to power on VM

  • The ESX you're starting the VM up on can't get proper access to the VM's files, either because
    • The VM is already powered up on another ESX
    • The VM is already powered up (but shows as down on the VI Client)
    • The VM's files have been corrupted / locked


  1. Is the VM actually powered off?
  2. Has an ESX recently failed?
    • If the ESX the VM is/was on has recently failed and HA's isolation response is set to leave powered-on, then it's possible that only the ESX's network connections have failed, and the virtual machines are still running on the ESX, but are isolated from the network.
      • To cause a full HA failover, pull the power cables out of the ESX to kill it completely
      • Alternatively, attempt to restore network connectivity to allow the VM's to be reachable again
    • If the ESX the VM is/was on has recently failed, it's possible that the file lock times have not yet expired (or are being kept updated).
      • If you're able to get a console onto the failed ESX, ensure it has fully failed (powered off or PSOD). If not, power it off, so that it isn't left in a half-failed state where the VM's have stopped running but the file locks are still being updated. HA will restart the VM if it's still a very recent failure, otherwise restart the VM manually.

If there have been no ESX failures, then the VM's files may be corrupted. The VM can be re-registered by removing and re-adding it to the inventory, but the re-add may fail if the wrong files are corrupted. To investigate corruption further...

  • To test whether the ESX should be able to lock the VM's files use touch. Within the VM's directory, run touch *.vswp
    • If success, retry power on
    • If device or resource busy then the VM is probably owned by another ESX - find that ESX!
    • If Invalid argument then the file can't be accessed (eg corrupt or other storage problem)
  • It's also worth doing a touch on the following files; if they are accessible then the VM may be recoverable. To work around the .vswp issue, remove the reference to the file in the .vmx config file
    • touch *.vmx
    • touch *flat.vmdk
    • touch *delta.vmdk
    • touch vmware.log
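
The touch tests above can be run in one pass with a small loop. This is a sketch only: the function name touch_vm_files is made up for illustration, and the error text reported is simply whatever touch returns on your ESX.

```shell
# touch_vm_files VM_DIR
# Tries to touch each of the VM's key files and reports the result.
# A failure message such as "Device or resource busy" or "Invalid argument"
# is printed as returned by touch.
touch_vm_files() {
    for f in "$1"/*.vswp "$1"/*.vmx "$1"/*flat.vmdk "$1"/*delta.vmdk "$1"/vmware.log; do
        [ -e "$f" ] || continue          # skip patterns that matched nothing
        if err=$(touch "$f" 2>&1); then
            echo "OK      $f"
        else
            echo "FAILED  $f ($err)"
        fi
    done
}

# Example, run from the VM's directory on the ESX:
# touch_vm_files .
```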

For further info see - VMware KB10051 - Virtual machine does not power on because of missing or locked files

Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on...

Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on. Reason: The parent virtual disk has been modified since the child was created

  • The ESX can't work out the chain of vmdk's that make up the VM's disks, most likely because
    • Snapshot CID chain is corrupted
  1. You need to establish the chain of files, start by looking at the vmx file to work out the top vmdk, then track back through them until you get to the base disk.
    • Any vmdk files not referenced in this chain are erroneous and can be deleted (or better, moved to a temporary sub-folder)
    • Any delta file <= 16MB is effectively empty and can be skipped
  2. Now display the CID's stored and then work out their correct order
    • grep CID My-VM.vmdk My-VM-00000[1-9].vmdk
  3. You then need to edit the vmdk files to correct the CID chain
  4. Start the VM and confirm it's working as expected
  5. Create a new temporary snapshot, then remove it to clear them up
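
Once you've established the order of the files, checking the CID chain by eye gets tedious with more than a couple of snapshots. The sketch below automates the comparison, assuming the small text descriptor files (not the -delta or -flat extents) each contain a CID= and a parentCID= line; the function name check_cid_chain is hypothetical, and you must pass the files in chain order yourself.

```shell
# check_cid_chain BASE.vmdk SNAP1.vmdk SNAP2.vmdk ...
# Takes the descriptor files in chain order (base disk first) and checks that
# each file's parentCID equals the CID of the file before it.
check_cid_chain() {
    prev_cid=""
    for f in "$@"; do
        cid=$(sed -n 's/^CID=//p' "$f")
        parent=$(sed -n 's/^parentCID=//p' "$f")
        echo "$f: CID=$cid parentCID=$parent"
        if [ -n "$prev_cid" ] && [ "$parent" != "$prev_cid" ]; then
            echo "BROKEN: $f should have parentCID=$prev_cid"
            return 1
        fi
        prev_cid=$cid
    done
    echo "CID chain OK"
}

# Example:
# check_cid_chain My-VM.vmdk My-VM-000001.vmdk My-VM-000002.vmdk
```

The function stops at the first broken link, which is the descriptor whose parentCID needs correcting.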

General system error occurred...

A general system error occurred: The system returned an error. Communication with the virtual machine might have been interrupted.

  • This error generally seems to occur when the ESX is having trouble launching the VM's processes, sometimes because it's having trouble reading the VM's VMX file.
    • If the problem is erratically affecting one or more VM's, it's likely that the ESX's hostd process is struggling a bit - in which case restart the ESX management agents
    • If the problem is continually affecting one (or possibly more) VM's, the VM('s) config file may be corrupted, or storage may be experiencing problems.

Can't Stop / Power-Off a VM

This normally occurs because you've lost management (VI Client) access to the ESX, or because the ESX doesn't appear to be aware that it's running the VM when in fact it is (so the VM appears Inaccessible via the VI Client). If you have access to the VM via the VI Client but can't power off, it'll probably be a permissioning issue. There is no way to gracefully shut down a VM without access via the VI Client (or direct access to the VM via RDP, VNC, etc).

  1. SSH to the ESX you believe the VM is running on
  2. Find the path to the VM's config file
    • EG vmware-cmd -l | grep VM_Name
    • If the VM is not listed, the VM isn't registered to that ESX
  3. Instruct the ESX to power off the VM using the VMX path already found
    • EG vmware-cmd /path/to/VM_Name.vmx stop

If the above fails, you'll need to get a bit more forceful...

  1. Find the PID of the VM
    • EG ps -auxwww | grep VM_Name
  2. Kill the VM using the PID found (make sure you've got the right PID, you could kill the ESX by mistake!)
    • EG kill -9 1234
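
Because a plain grep also matches its own entry in the process list, it's easy to pick up the wrong PID. The following sketch filters that out and prints only the PID column; it assumes the VM's process command line contains the path to VM_Name.vmx, and the function name vm_pids is hypothetical.

```shell
# vm_pids VM_NAME
# Reads "ps auxwww" output on stdin and prints the PID (second column) of each
# process whose command line mentions /VM_NAME.vmx, ignoring grep itself.
vm_pids() {
    grep -F "/$1.vmx" | grep -v grep | awk '{print $2}'
}

# Example, run on the ESX service console:
# ps auxwww | vm_pids VM_Name
```

Check that exactly one PID is returned, and that it belongs to the VM you expect, before passing it to kill.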

VM is Powered On, but appears Powered Off

The VM responds to ping and RDP/VNC/SSH etc (as appropriate) but is showing as down in the VI Client. Also see Confirm VM's Status on ESX

  1. Restart the management agents on the ESX and recheck

If that doesn't improve matters...

  1. Find the location of the vmx file for the VM (so it can be re-added to the inventory)
  2. Connect a VI Client to the ESX and unregister the VM (remove from inventory)
  3. Restart the management agents on the ESX
  4. Re-add the VM to the inventory

If running ESX4i see VMware KB 1033591 - Virtual machine appears powered off after restarting the management services on the host, but note that...

  • vMotion all powered-on VM's off the affected ESX first
  • Recover 1 VM at a time, and vMotion it off as soon as it is recovered (it may disappear when recovering the next VM)
  • Recovered VM's may end up with a state of Unknown on vCentre and ESX, in which case, remove from ESX inventory and re-add
  • Restart the ESX once all recovered

Can't VMotion a VM

VM network doesn't exist at destination

  • VM is using a particular port group which doesn’t exist on the destination ESX

ESX / network too busy

  • VMotion can’t copy across the VM's memory contents/changes quickly enough. An alternative is to use a Low Priority VMotion, which is more likely to succeed, but may result in the VM experiencing temporary freezes (avoids full OS downtime, but not without impact to hosted applications)

ESXs can't communicate

  • ESXs need to be able to communicate via VMotion network. DNS problems and FQDN inaccuracies can also cause problems

VM is connected to CD-ROM/ISO

  • The VM's CD-ROM is connected to an ISO file via the host ESX, tying it to that ESX

Can't Increase a VM's Disk

A general system error occurred: Internal error

  • Can be caused by existing snapshots running on a VM
  • Check the ESX logs / available disk space etc

Can't Snapshot

Cannot create a quiesced snapshot because the create snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine

Can't Commit Snapshot

If snapshot files are large then patience is of the essence, and if possible, shut the VM down 1st, or at the very least limit activity on the VM. To commit a snapshot in a running VM, first a new snapshot is started, then the original redo files are merged with the base disk(s), then the extra redo file is merged.

Operation timed-out

  • Not unusual for large (>10GB) redo files, the process continues and it's just vCentre reporting it as a time-out
    • Check the VM's files for any activity (changes in disk sizes/timestamps), speed is dependent on redo size, storage speed, ESX load, VM activity (if possible shut the VM down before removing the snapshot)
    • Also see Snapshot Still Active?
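
To check for file activity without watching a directory listing by hand, you can compare two listings taken a short time apart. This is a rough sketch: the function name vmdk_activity is made up, and because ls -l shows timestamps to the minute, a short interval mostly detects size changes.

```shell
# vmdk_activity VM_DIR [SECONDS]
# Lists the VM's .vmdk files twice, SECONDS apart (default 60), and reports
# whether any size or timestamp changed in between.
vmdk_activity() {
    before=$(ls -l "$1"/*.vmdk 2>/dev/null)
    sleep "${2:-60}"
    after=$(ls -l "$1"/*.vmdk 2>/dev/null)
    if [ "$before" = "$after" ]; then
        echo "no change seen"
    else
        echo "files are changing"
    fi
}

# Example, run from the VM's directory on the ESX:
# vmdk_activity . 60
```

A single "no change seen" result isn't proof that the commit has stalled, so repeat the check a few times before drawing conclusions.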

No Snapshots Exist in Snapshot Manager (but still exist)

  • Can happen if a snapshot Delete (All) fails to complete properly (eg ESX pseudo-hangs and you restart the management agents)
    1. Backup and then delete the VM's VMSD file
    2. Start a new snapshot
    3. In snapshot manager use Delete All (not Delete!)
  • If this fails, check the ESX log to see what went wrong

Snapshot Still Active?

  1. Check Snapshot Manager, if there's snapshots listed then there are still active snapshots
  2. Open up Datastore Browser to the VM's folder, and see if any snapshot files exist, if not then there are no active snapshots
  3. Check the VM's VMX file, the VMDK filename(s) will be either a snapshot or normal flat base disk file
    • EG scsi0:0.fileName = "MyVM-000001.vmdk" ← Snapshot file (snapshot running)
    • EG scsi0:0.fileName = "MyVM-000001-delta.vmdk" ← Snapshot file (snapshot running)
    • EG scsi0:0.fileName = "MyVM.vmdk" ← Base disk file (no snapshot running)
    • EG scsi0:0.fileName = "MyVM-flat.vmdk" ← Base disk file (no snapshot running)
  4. If there's no snapshots running, but snapshot files exist then the files can be deleted (if you're sure!)
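
The VMX check in step 3 can be scripted when a VM has several disks. The sketch below labels each referenced disk using the naming pattern from the examples above (a six digit suffix means a snapshot file); the function name vmx_disk_state is hypothetical, and it assumes the usual key = "value" layout of the VMX file.

```shell
# vmx_disk_state VMX_FILE
# Prints each .vmdk referenced by a fileName entry in the VMX file, labelled
# "snapshot" if the name carries a -NNNNNN suffix (optionally followed by
# -delta), otherwise "base".
vmx_disk_state() {
    sed -n 's/^[^ ]*\.fileName = "\(.*\.vmdk\)"$/\1/p' "$1" |
    while read -r disk; do
        case "$disk" in
            *-[0-9][0-9][0-9][0-9][0-9][0-9].vmdk|*-[0-9][0-9][0-9][0-9][0-9][0-9]-delta.vmdk)
                echo "snapshot  $disk" ;;
            *)
                echo "base      $disk" ;;
        esac
    done
}

# Example:
# vmx_disk_state MyVM.vmx
```

If every disk is reported as base, no snapshot is running according to the VMX file.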

Can't Customise

Windows setup could not configure Windows to run on this computer's hardware
Windows could not complete the installation. To install Windows on this computer, restart the installation.

  • The guest customisation is failing because either
    • The virtual hardware has changed (especially disk type) since the original machine was created
    • Sysprep can't customise the machine because it doesn't have administrator rights, this can occur where a DC's users have been offloaded to LDS

VMTools Automatic Cursor Release Not Working

The console automatic cursor release (which allows you to seamlessly switch focus from a VM console to your desktop by moving your mouse, avoiding having to use CTRL+ALT) sometimes doesn't work. This seems to be more common with VM's deployed from templates or cloned from other VM's.

To resolve...

  1. Uninstall VM Tools
  2. Reboot
  3. Install VM Tools
  4. Reboot

Confirm VM's Status on ESX

The following commands take you through confirming the status of a VM, as determined by the ESX

  1. Get the list of VM's registered to the ESX, to check the ESX believes it's hosting the VM
    • vm-support -x
  2. Get the VM's ID (the vmid, shown in the first column of the output)
    • vim-cmd vmsvc/getallvms | grep <VM name>
  3. Get the state of VM (as the ESX believes)
    • vim-cmd vmsvc/power.getstate <vmid>
  4. Check if the ESX has any running processes for the VM (in which case it's powered on, regardless of the above)
    • ps | grep <VM name>
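
Steps 2 and 3 can be chained so that the vmid doesn't have to be copied by hand. This is a sketch under the assumption that the getallvms output has the Vmid in the first column and the VM name in the second, and that the name contains no spaces; the function name vmid_for is hypothetical.

```shell
# vmid_for VM_NAME
# Reads "vim-cmd vmsvc/getallvms" output on stdin and prints the Vmid (first
# column) of the row whose Name (second column) matches exactly.
# Assumes the VM name contains no spaces.
vmid_for() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Example, run on the ESX (MyVM is a placeholder):
# vmid=$(vim-cmd vmsvc/getallvms | vmid_for MyVM)
# vim-cmd vmsvc/power.getstate "$vmid"
```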

To check that a VM is being locked by the ESX you're on

  1. Get the lock info for the VM's disk (use the 1st if there are several)
    • vmkfstools -D <VM-name>-flat.vmdk
  2. Pick out the MAC address from the lock info (78e7d192a548 in example below)
  3. List the NIC info for the ESX
    • esxcfg-vmknic -l
Lock [type 10c00001 offset 72968192 v 470, hb offset 3985408
gen 583, mode 1, owner 4d2dcc7b-20fb6d90-2b80-78e7d192a548 mtime 25711553]
Addr <4, 151, 197>, gen 299, links 1, type reg, flags 0, uid 0, gid 0, mode 600
len 37580963840, nb 17688 tbz 0, cow 0, zla 3, bs 2097152
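
Picking the MAC address out of the owner field can be done with sed. The sketch below assumes the lock info is available as text in the format shown above (on some ESX versions it may be written to the vmkernel log rather than to the terminal) and prints the MAC in colon-separated form to make comparison with the NIC listing easier; the function name lock_owner_mac is hypothetical.

```shell
# lock_owner_mac
# Reads "vmkfstools -D" output on stdin and prints the last 12 hex digits of
# the lock owner field as a colon-separated MAC address.
lock_owner_mac() {
    sed -n 's/.*owner [0-9a-f]*-[0-9a-f]*-[0-9a-f]*-\([0-9a-f]\{12\}\).*/\1/p' |
    sed 's/../&:/g; s/:$//'
}

# Example, assuming the lock info is printed to the terminal:
# vmkfstools -D <VM-name>-flat.vmdk 2>&1 | lock_owner_mac
```

For the example lock info above this gives 78:e7:d1:92:a5:48, which can then be looked for in the NIC listing of each ESX.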