Virtual Machines

From vwiki

Basic Virtual Machine Tasks

Start / Stop / Bounce a VM

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want to affect
  4. Either right-click or use the commands in the right-hand pane to Power Off, Power On or Reset as required

This is the same as using the Power or Reset buttons on the front of a physical server. It's also possible to send OS shutdown and restart commands to the VM: right-click the VM and select Shut Down Guest or Restart Guest as appropriate. This tells VM Tools to attempt the required action; be aware that open applications etc can prevent the OS from shutting down successfully.


Remote Console (KVM like) Access

If possible, it's preferable to use normal remote access software (eg RDP or VNC). This ensures that load caused by remote access is contained within the VM, rather than falling on the ESX.

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want and either right click Open Console or use the Open Console command in the right hand pane


CD-ROM Access

There are essentially two ways to present a CD-ROM to a VM; using an ISO image is by far the most flexible. Even if you only have a physical CD and expect to use it once, it's still recommended that you create an ISO image from the CD and use that instead. The alternative is to put the physical media into the ESX hosting the VM (use Host Device when adding the CD to the VM).

To present an ISO image to a VM

  1. If it's not already there, copy the ISO image to an NFS share or other ESX-accessible datastore
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to attach the ISO image to
  5. Right-click and select Edit Settings...
  6. Highlight the CD/DVD Drive, and select the Datastore ISO file
  7. Hit Browse and go into the appropriate datastore
  8. Select the required ISO file
  9. Tick the Connected check box
  10. Hit OK. The ISO will be attached to the VM's CD-ROM drive as if you'd inserted a CD into a physical drive
  • Once you've finished using the ISO, go back into the VM's settings and untick the Connected check box
  • To boot a VM to a CDROM ISO, check the "Connected at power on" checkbox and restart the VM's OS

To create an ISO image

You'll need to download an ISO creator. There are many freeware utilities available; one that's tried and tested is ISORecorder. Generally you can create ISO images either from a physical CD or from the contents of a folder (if you have ISORecorder installed, right-click the disk or folder and select "Create ISO image").


Change Network Connection

In a similar fashion to swapping over a network cable on a physical server, the network connection of a virtual machine can be changed on the fly.

  1. Log into the Virtual Infrastructure - Management Access
  2. Under the Inventory button, ensure Hosts and Clusters is ticked
  3. Highlight the VM you want to change the network connection on
  4. Right-click and select Edit Settings...
  5. Highlight the appropriate Network Adapter, and select the new Network Connection
  6. Change takes effect as soon as OK is hit


Add an Additional Network Connection

When adding additional network connections to any system you must consider network security. For example, no system should ever be given access to both Private and Public networks.

  1. Shut down the Application and OS of the virtual machine
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to add the network connection to
  5. Right-click and select Edit Settings...
  6. Hit the Add... button and select Ethernet Adapter, and hit Next
  7. Select the appropriate network connection and hit Next, and then Finish
  8. Power on the virtual machine


Change Physical Memory / CPU Allocation

  1. Shut down the Application and OS of the virtual machine
  2. Log into the Virtual Infrastructure - Management Access
  3. Under the Inventory button, ensure Hosts and Clusters is ticked
  4. Highlight the VM you want to change the memory or CPU allocation on
  5. Right-click and select Edit Settings...
  6. Highlight the appropriate setting, Memory or CPUs, and edit as required
  7. Apply the change by hitting OK
  8. Power on the virtual machine

Config Settings

Performance Settings

In order to squeeze the most out of a virtual infrastructure it's imperative that VMs are configured for best performance.

  • System Performance - Set the OS for best performance
    • In Win2003 go to System Properties | Advanced
      • Visual Effects - Disable/Adjust for best performance
      • Advanced - Tailor for the application/service the server is running
      • Virtual memory/page file - Should be set to a fixed size, and ideally as small as possible so as to conserve disk space usage. You need to know what you're doing when you set this. Do you expect your systems to be starved of memory and need to page? Personally for a VM with 4GB RAM assigned, I'd keep the page file down to 1GB in size.

Disable Shutdown Event Tracker

If the ESX servers are running as an HA cluster then the VMs MUST be able to start up fully and automatically after a reboot. The Windows Shutdown Event Tracker asks why you're shutting down or rebooting a system and, following an unexpected shutdown, halts the start-up of the system pending information from the user. This isn't a problem for servers where all applications run as services, but where (GUI) applications need to start it stops systems restarting fully, which impedes VMware HA operating effectively.

To disable...

  1. Start Group Policy Object Editor (Start | Run | gpedit.msc)
  2. Go to Computer Configuration\Administrative Templates\System
  3. Set Display Shutdown Event Tracker to Disabled

Set Low Risk File Types

If mapped drives are being used, .bat and .exe files need to be declared as low risk file types to stop Open file - Security Warning prompts being displayed when trying to run from mapped drives. This is particularly a problem if software is set to auto-start by placing shortcuts in the StartUp directory, as the software won't auto start.

To configure...

  1. Start Group Policy Object Editor (Start | Run | gpedit.msc)
  2. Go to User Configuration\Administrative Templates\Windows Components\Attachment Manager
  3. Set "Inclusion list for low file types" to Enabled
  4. Specify the low risk extensions as .bat;.exe

Disable Balloon Driver

You should not disable the balloon driver. There are generally other ways to achieve whatever it is you're trying to achieve.

Certain applications (such as MS SQL and Java) can lock the amount of 'physical' memory they want and stop the OS from paging it out to disk, in order to protect the application's performance. However, if a virtual environment becomes memory constrained, VM Tools may attempt to force the OS to page memory to disk. The OS is unable to page out the locked memory, so it pages itself out to disk instead, impacting the entire machine's performance.

By using memory reservations it's possible to ring-fence physical RAM for a machine, meaning that if contention for memory occurs, the ESX will target other virtual machines first in order to reduce overall physical memory usage. Ideally, some dynamic memory should still be left, eg allocate a VM 4 GB of RAM with 3 GB reserved: 3 GB being enough to allow the OS and memory-locking application to stay in RAM, with an additional 1 GB available to be used should there be spare capacity.

To disable the balloon driver

  1. Shutdown the VM
  2. Go to Edit Settings, Options tab, then Advanced / General and hit Configuration Parameters
  3. Use Add Row to add a new config item
  4. Set sched.mem.maxmemctl to 0
  5. Restart the VM
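
The row added in steps 3 and 4 ends up as a single key = "value" line in the VM's .vmx file, so it can be checked from the service console. The following is a minimal sketch rather than a VMware tool: balloon_limit is a hypothetical helper name, the example path is an assumption, and the parameter is spelled sched.mem.maxmemctl as in VMware's documentation.

```shell
# Print the value of sched.mem.maxmemctl from a .vmx file, or "unset" if
# the parameter is not present. balloon_limit is a hypothetical helper.
balloon_limit() {
    vmx_file="$1"
    # .vmx entries have the form: key = "value"
    value=$(sed -n 's/^sched\.mem\.maxmemctl *= *"\([^"]*\)".*/\1/p' "$vmx_file")
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset"
    fi
}

# Example usage (path is an assumption):
#   balloon_limit /vmfs/volumes/datastore/MyVM/MyVM.vmx
```

A result of 0 means the balloon driver has been disabled for that VM; "unset" means the VM is using the default behaviour.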

Windows XP

The following changes are performance based, intended to ensure XP isn't unnecessarily wasteful.

  1. Display simplifications
    • Right-click over the desktop and go to Properties
      1. Switch to Classic theme
      2. Disable screen-saver
      3. Disable monitor power save
  2. Disable Indexing Service
    • Go to Control Panel | Add or Remove Programs | Add/Remove Windows Components and uncheck Indexing Service
  3. Disable paging of the executive
    • HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management set DisablePagingExecutive to 1
  4. Set for best performance
    • My Computer | Properties go to Advanced Tab, Performance Settings, and Adjust for best performance
  5. Disable all sounds
  6. Defragment prefetch
    • defrag c: -b
  7. Turn off disk perf counters
    • diskperf -n

Files Information

File     Purpose             Notes
*.vmx    VM config           Contains the full config of the virtual machine
*.vmsd   Snapshot metadata
*.vmsn   Snapshot state      Stores delta info that wouldn't be otherwise written to VMDK file (eg Power State, Hardware changes, etc)
*.vmss   Suspend state       Created when a VM is suspended to contain its physical memory contents
*.vmxf   Team config
*.vmdk   Virtual hard-drive
*.nvram  vBIOS               Can be deleted, gets recreated on VM start (BIOS settings will be defaulted)
*.vswp   VM memory swap      Can be deleted, but you need to remove the reference to it from the VMX file
*.hlog   VMotion helper      Can be deleted, as long as there's not a VMotion in progress

Procedures

Creating MSCS Machines

There are various configuration options for creating MS clusters. For production standard clusters you must use RDMs, otherwise you're forced to host both your VMs on the same ESX (and you can't vMotion them!).

  • You can't snapshot disks that are configured to use SCSI Bus Sharing.

Disk Type    SCSI Bus Sharing  Cluster in a Box        Cluster Across Boxes  Virtual/Physical Hybrid
                               (2 VM's / 1 fixed ESX)  (2 VM's / multi ESX)  (1 VM + 1 Physical)
Normal VMDK  Virtual           Yes                     No                    No
RDM          Virtual           Yes                     Yes - Win2k3 only     No
RDM          Physical          No                      Yes                   Yes

  • Windows 2003 servers must use the LSI Logic Parallel SCSI controller for shared disks
  • Windows 2008 servers must use the LSI Logic SAS SCSI controller for shared disks

See the following for further info

Cluster in a Box

Procedure assumes you're creating a VM cluster using 2 VM's on the same server sharing standard VMDK disks (not RDM's)

  1. Create a new private vSwitch on the ESX to host the VM's called "MSCS Heartbeat"
  2. Create two VM's...
    • Don't create the shared disks yet
    1. With 2 NIC's, 1st attached to normal (externally accessible) network, 2nd attached to private "MSCS Heartbeat" network
    2. Boot up and ensure they're working as expected (not as a cluster yet), then shutdown
  3. On the 1st VM...
    1. Create the required shared disks on a new SCSI Bus ID
      • Tick Support clustering features such as Fault Tolerance, which ensures the disks are created in eagerzeroedthick format
      • Select a new SCSI Bus ID in the Virtual Device Node drop-down box, which creates the disks on a new SCSI Controller
    2. Change the new SCSI Controller's config...
      • SCSI Controller Type should be set to LSI Logic SAS (Win2k8) or LSI Logic Parallel (Win2k3)
      • SCSI Bus Sharing should be set to Virtual
  4. On the 2nd VM...
    1. Create the required shared disks on a new SCSI Bus ID, selecting Use an existing virtual disk (you'll need to locate the shared disks already created)
    2. Change the SCSI Bus Sharing mode of the new SCSI Controller to Virtual
  5. Boot the VMs up; the disks should be visible in Disk Management from both VMs
    • Only format from one machine. NTFS doesn't support access from more than one host, so MSCS needs to manage volume access/ownership

Increase Disk Size

Increasing the virtual disk size provided to a VM is straightforward (though be aware that any existing snapshots need to be deleted first)...

  1. Go into the VM's settings
  2. Increase the size of the disk and apply
  3. Within the VM's OS, rescan the disk, and the new space will be visible

The trick is to extend the logical partition within the OS. Depending on the original partition type and the OS, the options vary.

In case of problems, see Can't Increase a VM's Disk

Increase Logical Partition

Generally boot or system disks cannot be extended whilst the OS is up, whereas normal data disks can be in later OSs, though this is still not ideal. It's generally most reliable to plan for system downtime and use a utility to extend the partition whilst it's offline. Especially in a virtual environment, there is no excuse for not making a backup of the partition first.

For Windows 2008 machines this isn't a problem.

For Windows 2003 machines...

Partition  Type     Options
System     Either   Cannot be extended
Data       Basic    Cannot be extended. Can be converted to Dynamic, which requires at least a brief IO interruption and possibly up to two reboots!
Data       Dynamic  Can be extended on the fly, but a new volume is tagged onto the end of the existing partition to create a larger one made up of two volumes

Download a copy of the GParted Live CD (http://gparted.sourceforge.net/livecd.php), which the VM will need to boot from.

  • Note: there is a bug in some versions of GParted (v0.5.0-3 and v0.5.1-1 are known to have issues) whereby the boot fails with the following error. v0.4.6-1 is known to work
    • Unable to find a medium containing a live file system
  1. Increase the relevant VMDK size through the VM's options
  2. Take a snapshot (or take a full backup of the machine)
  3. Attach the GParted ISO to the VM and restart
    • If the VM doesn't boot from the ISO, force the VM to boot to BIOS (Options | Advanced | Boot Options in VM Settings) and change the VM's boot order
  4. Boot into GParted Live (accepting the default options, except setting language to English UK)
  5. Once in GParted, follow the interface to resize the partition, and apply the changes to action them
  6. Restart the VM and verify all is good
  7. Delete the snapshot

VM's With Lots Of Disks

It can be very difficult to identify the correct disk within VMware to increase when a VM has a large number of VMDK's.

  • Disk numbering behaves differently, with Windows starting at Disk 0 and VMware starting at Disk 1
  • SCSI IDs will match, but Windows SCSI bus numbers are normally 0, whereas VMware bus numbers will increment (so VM disk 35 (Win disk 34) could be 2:4 in VMware, but 0:4 within the OS)
  • Disk size can be a useful method of validation (if differing disk sizes are used)
  • Windows drive letters are useless for this; never assume D: is disk 2, for example

Rename a VM

Renaming a virtual machine just by right-clicking over the machine and renaming does not alter the underlying file and folder names.

Since the advent of Storage vMotion this is much easier than it used to be. Bear in mind that vCenter uses the VM's current name to determine the naming of its files/folders at its destination, so to completely rename a VM...

  1. Rename the VM within its OS and restart
  2. Rename the VM within vCenter
  3. Storage vMotion the VM (can be moved back after)


Legacy methods...

To ensure that these changes take place you must move the VM to another datastore, ie

  1. Shutdown the VM
  2. Rename the VM in vCenter
  3. Migrate the VM and move it to another Datastore
  4. Restart the VM

If you can't move the VM to another datastore then it gets much more complicated, requiring faffing around in the service console.

  1. Shutdown the VM
  2. vmware-cmd -s unregister /vmfs/volumes/datastore/vm-old/vm-old.vmx
  3. mv /vmfs/volumes/datastore/vm-old /vmfs/volumes/datastore/vm-new
  4. cd /vmfs/volumes/datastore/vm-new
  5. vmkfstools -E vm-old.vmdk vm-new.vmdk
  6. find . -name '*.vmx*' -print -exec sed -i -e 's/vm-old/vm-new/g' {} \;
    • The -i makes sed edit the files in place; without it the command only prints the result
  7. Rename every file that hasn't yet been renamed (.vmx, .vmsd etc.), eg mv vm-old.vmx vm-new.vmx
  8. vmware-cmd -s register /vmfs/volumes/datastore/vm-new/vm-new.vmx

The above was taken from http://www.yellow-bricks.com/2008/02/10/howto-rename-a-vm/

VMware has since published an article on this, see VMware KB 1029513
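
Step 7 of the service console method is tedious by hand when several files still carry the old name. A minimal sketch of a loop that handles it, assuming the vm-old and vm-new names used in the steps above and that vmkfstools -E has already dealt with the .vmdk files; rename_vm_files is a hypothetical helper, not a VMware command.

```shell
# Rename every file in a directory whose name starts with the old VM name.
# .vmdk files are skipped, as vmkfstools -E is assumed to have handled them.
rename_vm_files() {
    dir="$1"; old="$2"; new="$3"
    for path in "$dir"/"$old"*; do
        [ -e "$path" ] || continue      # glob matched nothing
        file=${path##*/}
        case "$file" in
            *.vmdk) continue ;;         # leave disks to vmkfstools
        esac
        suffix=${file#"$old"}           # eg ".vmx", ".vmsd", ".nvram"
        mv "$path" "$dir/$new$suffix"
    done
}

# Example usage, run with the VM shut down and unregistered:
#   rename_vm_files /vmfs/volumes/datastore/vm-new vm-old vm-new
```

Files that don't start with the old name (vmware.log for example) are left alone.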

Clone a VM

This can be done as

  • Hot clone - Source VM is left running, its disks are quiesced, and cloned. Can cause problems as the new machine behaves as if it was ungracefully shut down when first started, but normally successful. Source machine needs to be relatively quiet.
  • Cold clone - Source VM is shut down first, preferable to a hot clone if possible.

Snapshots and Cloning

Snapshots are deleted during a clone, in that cloning a machine that has existing snapshots results in the post-snapshot changes being merged into the new machine.

In order to retain the snapshots, the virtual machine needs to be cloned manually (untested procedure!!)...

  1. Copy all of the VM's files into a new directory (using vmkfstools --nosparse option).
  2. Correct the .vmx file to match new paths, update VM name, and delete the UUID line (VMware will prompt to generate a new one when the VM is started).
  3. Register the new VM in vCenter and double check the VM is as expected.
  4. Power on (you'll get an IP conflict if it's on the same portgroup as the original)

Shutdown VM via Service Console

  • To determine the state of a Virtual Machine running from the local ESX
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx getstate
    • getstate() = on
  • Shutdown a Virtual Machine running from the local ESX gracefully
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop trysoft
    • stop(trysoft) = 1
  • Shutdown a Virtual Machine running from the local ESX forcefully
    • vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop hard
    • stop(hard) = 1

The vmware-cmd command isn't available in ESXi, though it is available via the RCLI, in the following format...

  • vmware-cmd.pl /path/to/My_VM.vmx start --server MyESX --username root --password "RootPassword"
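
The getstate and stop commands above can be combined so that a stop is only attempted when the VM is actually on. This is a rough sketch only: parse_state, stop_vm and the VMWARE_CMD variable are hypothetical names introduced for illustration, and the output format is assumed to be the "getstate() = on" form shown above.

```shell
# Command used to talk to the VM; override if vmware-cmd lives elsewhere.
VMWARE_CMD=${VMWARE_CMD:-vmware-cmd}

# Reduce output such as "getstate() = on" to just "on".
parse_state() {
    sed -n 's/^getstate() = //p'
}

# Attempt a graceful stop, but only if the VM reports itself as on.
stop_vm() {
    vmx="$1"
    state=$("$VMWARE_CMD" "$vmx" getstate | parse_state)
    if [ "$state" != "on" ]; then
        echo "$vmx is not powered on (state: ${state:-unknown})"
        return 0
    fi
    "$VMWARE_CMD" "$vmx" stop trysoft
}

# Example usage:
#   stop_vm /vmfs/volumes/SAN1/ServerA/ServerA.vmx
```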

Upgrade ESX3 to ESX4

Preparation

  • Clean up the VM
    1. Stop any snapshots, and ensure there's no remnant snapshot files (*.vmsd, *-0000x.vmdk, *-delta.vmdk)
    2. No CD/floppy file attached
  • Clean up the guest OS
    1. Delete unnecessary files
    2. Ensure VM Tools is up to date
    3. Perform a reboot (without any changes)
    4. Check logs to ensure machine started without any significant errors
  • Record IP settings (they will get lost!)
    1. ipconfig /all
    2. route print if there might be static/persistent routes
  • Ensure you know the machine's admin account (inc domain if on domain)
  • Shut the VM down

Procedure

Procedure assumes you're migrating machines from a VI3 infrastructure to a new VI4/vSphere infrastructure. Note that you can use VMware Converter to copy machines between vCenters if preferred.

  1. Export machine as a Virtual Appliance from VI3 infrastructure
  2. Import machine into new vSphere infrastructure
    • In the VI Client, select the VM and go to File | Deploy OVF Template..., and select the appropriate options in the resulting wizard
  3. Take a snapshot (if you make an irreversible mistake it's quicker to revert to snapshot than reimport)
  4. Check the VM's settings, particularly Guest OS (which sometimes gets set to Other)
  5. Start the virtual machine, update VM Tools then shutdown
  6. Upgrade the virtual hardware
    • Right-click and select Upgrade Virtual Hardware
  7. Upgrade the network adapter to VMXNET3
    • Remove existing network adapters (note the networks they're connected to!), then add the same quota of VMXNET3 adapters (connected to the same networks in the same order)
  8. Upgrade the SCSI controller - part 1 (if required)
    • Add a new temporary disk, on the next bus (eg SCSI node 1:x)
    • Then change the new SCSI controller type to VMware Paravirtual
  9. Restore network config
    • Restart VM, and re-apply recorded network config (answer Yes when asked whether to remove duplicate config on non-existent adapter)
  10. Upgrade the SCSI controller - part 2 (if required)
    • Shutdown the VM, and remove the temporary disk added, and change the original SCSI controller to VMware Paravirtual (the other controller will automatically get removed from the config)
    • Restart the machine.
  11. Delete/Commit the snapshot

Windows 2008 Install

Use the VMXNET3 network adapter. Only use the Paravirtual SCSI interface if you're running at least ESX v4.1. You need to boot with the drivers on a floppy, see http://www.virtualinsanity.com/index.php/2009/12/01/more-bang-for-your-buck-with-pvscsi-part-2/

Convert Hardware v7 to v4

You'll need to download and install VMware Converter Standalone if you haven't already got it installed (it's free). The local installation will suffice (client-server is not required). It's possible that you could get away with using the inbuilt vCenter version, as long as you're not trying to convert a v7 VM on ESX4 to a v4 VM on ESX3 (which you probably are!).

  1. Start up VMware Converter
  2. Hit the Convert machine button, top left
  3. On the resultant Source System page...
    • Change Select source type: to VMware Infrastructure virtual machine
    • Enter login details for the vCentre or ESX your v7 VM is on
  4. On the Source Machine page locate your v7 VM
  5. On the Destination System page, enter the login details of the vCentre or ESX where you want your v4 VM to be
  6. On the Destination Virtual Machine page, edit the VM name (if required) and select the destination folder
    • If you're migrating to a vSphere VC/ESX you must set the Virtual Machine Version on this page
  7. On the Destination Location page select an appropriate datastore
  8. On the Options page you can make any tweaks you might want to
  9. Finally confirm
  10. Once the machine is imported, boot it up to ensure all is OK
    • VM Tools will need to be explicitly uninstalled and then reinstalled
    • Especially if the VM is a template, the OS may want to adjust its drivers given that it'll be running on different (virtual) hardware

Troubleshooting

See also Virtual Centre Troubleshooting

If all else fails you can always raise a VMware Service Request

Can't Connect to VM Console

Error connecting: Cannot connect to host... or Can't connect to MKS...

  • This is caused by a TCP connection failure to the ESX server the VM is hosted on. Using telnet or a port test utility, confirm you can connect on both TCP 902 and 443 from your machine to the ESX server.
  • If the problem is affecting a single ESX that previously worked, restart the management services on that ESX
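
The port test in the first bullet can be scripted from a management machine. A minimal sketch, assuming a netcat (nc) that supports the -z (scan only) and -w (timeout) options; check_port is a hypothetical helper and MyESX is a placeholder host name.

```shell
# Report whether a TCP port on a host accepts connections.
# Returns non-zero if the connection fails or nc is unavailable.
check_port() {
    host="$1"; port="$2"
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
        echo "$host:$port open"
    else
        echo "$host:$port unreachable"
        return 1
    fi
}

# Example usage, testing both ports the console needs:
#   for port in 902 443; do check_port MyESX "$port"; done
```

If 443 is open but 902 is not, a firewall between you and the ESX is the most likely culprit.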

Can't Deploy VM

The VirtualCenter server is unable to decrypt passwords stored in the customization specification

  • Bizarrely, this is caused by the Virtual Centre running out of disk space. Free up some space and all will be well.

A general system error occurred: Failed to create journal file provider

  • Check ESX disks are not full

Customization of the guest operating system 'winLonghornGuest' is not supported in this configuration. Microsoft Vista (TM) and Linux guests with Logical Volume Manager are supported only for recent ESX host and VMware Tools versions.

  • Caused by you trying to deploy a guest customised Windows 2008 template, where the OS of the source template is set to Windows 2008(!). Essentially Win2008 is only barely supported in ESX3.5. Setting the source machine to Vista should resolve this issue.
  • With Windows 2008 R2 templates the above fix has been seen to not work, in which case
    1. Deploy a clone (with no guest customisation)
    2. Perform a Sysprep

Can't Start VM

HA Admission Control

  • Can't start VM as doing so wouldn't leave enough failover capacity in order to be able to restart failed VM's should an ESX fail. Options are to
    • Reduce resource usage of VM's that are already running
    • Increase cluster capacity
    • Reduce the cluster's failover capacity, or allow constraints violations
  • If no VMs have been recently added to the cluster, it's likely that the HA agent on one of the ESXs has stopped functioning, in which case one of the ESXs within the cluster will have a red warning/exclamation triangle. If so you can restart HA on that ESX:
    1. Highlight this ESX; on the Summary tab you should see a notice regarding HA problems
    2. Run the Reconfigure for HA command, this will re-install the HA agent on the ESX

Failed to relocate virtual machine

  • DRS is attempting to relocate a VM at power up, and this relocation is failing
    • Reattempt to power on machine
    • Manually migrate to a less loaded ESX and reattempt power on

Access to VMFS storage

  • ESX may have lost connectivity to VMFS partition on which VM resides

VMFS full

  • If VMFS is full, the ESX won't be able to write to the VM's logs when it starts it up, causing VM start-up to fail

ESX licensing

  • Either ESX isn't licensed, or has lost contact with the license server (VI3) for a long period of time

Waiting for question to be answered

  • Generally after changes (such as cold migrations or new deployments), a VM may need to have a question answered before it can continue to power on

Could not power on VM: No swap file. Failed to power on VM

  • The ESX you're starting the VM up on can't get proper access to the VM's files, either because
    • The VM is already powered up on another ESX
    • The VM is already powered up (but shows as down on the VI Client)
    • The VM's files have been corrupted / locked


  1. Is the VM actually powered off?
  2. Has an ESX recently failed?
    • If the ESX the VM is/was on has recently failed and HA's isolation response is set to leave powered on, then it's possible that only the ESX's network connections have failed, and the virtual machines are still running on the ESX but are isolated from the network.
      • To cause a full HA failover, pull the power cables out of the ESX to kill it completely
      • Alternatively, attempt to restore network connectivity to allow the VMs to be reachable again
    • If the ESX the VM is/was on has recently failed it's possible that the file lock times have not yet expired (or are being kept updated).
      • If you're able to get a console onto the failed ESX, ensure it has fully failed (powered off or PSOD). If not, power it off, so that it can't be left in a state where it has failed enough to stop the VMs running, but not enough to stop updating the file locks. HA will restart the VM if it's still a very recent failure, else restart the VM manually.

If there have been no ESX failures, then the VM's files may be corrupted. The VM can be re-registered by removing and re-adding it to the inventory, but the re-add may fail if the wrong files are corrupted. To investigate corruption further...

  • To test whether the ESX should be able to lock the VM's files use touch. Within the VM's directory, do touch *.vswp
    • If it succeeds, retry power on
    • If you get device or resource busy then the VM is probably owned by another ESX - find that ESX!
    • If you get Invalid argument then the file can't be accessed (eg corrupt or other storage problem)
  • It's also worth doing a touch on the following files; if they are accessible then the VM may be recoverable. To work around the .vswp issue, remove the reference to the file in the .vmx config file
    • touch *.vmx
    • touch *flat.vmdk
    • touch *delta.vmdk
    • touch vmware.log

For further info see - VMware KB10051 - Virtual machine does not power on because of missing or locked files
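
The touch tests above can be run in one pass. A minimal sketch, assuming it is run from the service console of the ESX that should own the VM; check_vm_files is a hypothetical helper and the example path is an assumption.

```shell
# Touch each of the VM's key files and report any that fail, including
# the error text (eg "Device or resource busy" or "Invalid argument").
check_vm_files() {
    dir="$1"
    failed=0
    for path in "$dir"/*.vswp "$dir"/*.vmx "$dir"/*flat.vmdk \
                "$dir"/*delta.vmdk "$dir"/vmware.log; do
        [ -e "$path" ] || continue      # pattern matched nothing
        if err=$(touch "$path" 2>&1); then
            echo "OK      $path"
        else
            echo "FAILED  $path: $err"
            failed=1
        fi
    done
    return "$failed"
}

# Example usage:
#   check_vm_files /vmfs/volumes/SAN1/ServerA
```

The function returns non-zero if any touch failed, so it can be used as a quick pre-check before retrying the power on.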

Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on...

Cannot open the disk '/vmfs/volumes/.../MyVM-000001.vmdk' or one of the snapshot disks it depends on. Reason: The parent virtual disk has been modified since the child was created

  • The ESX can't work out the chain of vmdk's that make up the VM's disks, most likely because
    • Snapshot CID chain is corrupted
  1. You need to establish the chain of files, start by looking at the vmx file to work out the top vmdk, then track back through them until you get to the base disk.
    • Any vmdk files not referenced in this chain are erroneous and can be deleted (or better, moved to a temporary sub-folder)
    • Any delta file <= 16MB is effectively empty and can be skipped
  2. Now display the CID's stored and then work out their correct order
    • grep CID My-VM.vmdk My-VM-00000[1-9].vmdk
  3. You then need to edit the vmdk files to correct the CID chain
  4. Start the VM and confirm it's working as expected
  5. Create a new temporary snapshot, then remove it to clear them up
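
Once the order of the files is known (step 1), checking the chain (step 2) can be automated. A minimal sketch, assuming each descriptor file contains CID= and parentCID= lines as displayed by the grep above, and that each child's parentCID should equal the CID of the disk below it; check_cid_chain is a hypothetical helper.

```shell
# Given descriptor files ordered base disk first, check that each file's
# parentCID matches the CID of the file before it in the chain.
check_cid_chain() {
    prev_cid=""
    status=0
    for f in "$@"; do
        cid=$(sed -n 's/^CID=//p' "$f")
        parent=$(sed -n 's/^parentCID=//p' "$f")
        if [ -n "$prev_cid" ] && [ "$parent" != "$prev_cid" ]; then
            echo "MISMATCH: $f has parentCID=$parent, expected $prev_cid"
            status=1
        fi
        prev_cid="$cid"
    done
    return "$status"
}

# Example usage:
#   check_cid_chain My-VM.vmdk My-VM-000001.vmdk My-VM-000002.vmdk
```

Any MISMATCH line identifies a descriptor whose parentCID needs correcting in step 3. Take copies of the descriptor files before editing them.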

Can't Stop / Power-Off a VM

This normally occurs because you've lost management (VI Client) access to the ESX, or because the ESX doesn't appear to be aware that it's running the VM when it is (so the VM appears Inaccessible via the VI Client). If you have access to the VM via the VI Client but can't power off, it's probably a permissions issue. There is no way to gracefully shut down a VM without access via the VI Client (or direct access to the VM via RDP, VNC, etc).

  1. SSH to the ESX you believe the VM is running on
  2. Find the path to the VM's config file
    • EG vmware-cmd -l | grep VM_Name
    • If the VM is not listed, the VM isn't registered to that ESX
  3. Instruct the ESX to power off the VM using the VMX path already found
    • EG vmware-cmd /path/to/VM_Name.vmx stop

If the above fails, you'll need to get a bit more forceful...

  1. Find the PID of the VM
    • EG ps -auxwww | grep VM_Name
  2. Kill the VM using the PID found (make sure you've got the right PID, you could kill the ESX by mistake!)
    • EG kill -9 1234


Can't VMotion a VM

VM network doesn't exist at destination

  • VM is using a particular port group which doesn’t exist on the destination ESX

ESX / network too busy

  • VMotion can't copy across the VM's memory contents/changes quickly enough. An alternative is to use a Low Priority VMotion, which is more likely to succeed, but may result in the VM experiencing temporary freezes (avoids full OS downtime, but not without impact to hosted applications)

ESXs can't communicate

  • ESXs need to be able to communicate via VMotion network. DNS problems and FQDN inaccuracies can also cause problems

VM is connected to CD-ROM/ISO

  • The VM's CD-ROM is connected to an ISO file via the host ESX, tying it to that ESX

Can't Increase a VM's Disk

A general system error occurred: Internal error

  • Can be caused by existing snapshots running on a VM
  • Check the ESX logs / available disk space etc

Can't Snapshot

Cannot create a quiesced snapshot because the create snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine

Can't Commit Snapshot

If snapshot files are large then patience is of the essence. If possible, shut the VM down first, or at the very least limit activity on the VM. To commit a snapshot in a running VM, first a new snapshot is started, then the original redo files are merged with the base disk(s), then the extra redo file is merged.

Operation timed-out

  • Not unusual for large (>10GB) redo files. The process continues and it's just vCenter reporting it as a time-out
    • Check the VM's files for any activity (changes in disk sizes/timestamps). Speed is dependent on redo size, storage speed, ESX load and VM activity (if possible shut the VM down before removing the snapshot)
    • Also see Snapshot Still Active?

No Snapshots Exist in Snapshot Manager (but still exist)

  • Can happen if a snapshot Delete (All) fails to complete properly (eg ESX pseudo-hangs and you restart the management agents)
    1. Backup and then delete the VM's VMSD file
    2. Start a new snapshot
    3. In snapshot manager use Delete All (not Delete!)
  • If this fails, check the ESX log to see what went wrong

Snapshot Still Active?

  1. Check Snapshot Manager; if there are snapshots listed then there are still active snapshots
  2. Open up Datastore Browser to the VM's folder and see if any snapshot files exist; if not then there are no active snapshots
  3. Check the VM's VMX file; the VMDK filename(s) will be either a snapshot or a normal base disk file
    • EG scsi0:0.fileName = "MyVM-000001.vmdk" ← Snapshot file (snapshot running)
    • EG scsi0:0.fileName = "MyVM-000001-delta.vmdk" ← Snapshot file (snapshot running)
    • EG scsi0:0.fileName = "MyVM.vmdk" ← Base disk file (no snapshot running)
    • EG scsi0:0.fileName = "MyVM-flat.vmdk" ← Base disk file (no snapshot running)
  4. If there are no snapshots running but snapshot files exist, then the files can be deleted (if you're sure!)
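
The VMX check in step 3 can be done from the service console without opening the file. A minimal sketch, assuming SCSI disks only and the -00000N.vmdk / -00000N-delta.vmdk naming shown in the examples above; snapshot_disks is a hypothetical helper.

```shell
# Print any disk files referenced by a .vmx that look like snapshot disks.
# Prints nothing (and returns non-zero) if only base disks are referenced.
snapshot_disks() {
    vmx_file="$1"
    sed -n 's/^scsi[0-9]*:[0-9]*\.fileName *= *"\([^"]*\)".*/\1/p' "$vmx_file" |
        grep -E -- '-[0-9]{6}(-delta)?\.vmdk$'
}

# Example usage:
#   snapshot_disks /vmfs/volumes/SAN1/ServerA/ServerA.vmx
```

Any output means the VM is still running on a snapshot, so the snapshot files must not be deleted by hand.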

Can't Customise

Windows setup could not configure Windows to run on this computer's hardware
Windows could not complete the installation. To install Windows on this computer, restart the installation.

  • The guest customisation is failing because either
    • The virtual hardware has changed (especially disk type) since the original machine was created
    • Sysprep can't customise the machine because it doesn't have administrator rights, this can occur where a DC's users have been offloaded to LDS