Virtual Machines: Difference between revisions
(→Can't Remove Snapshot: Added "No Snapshots Exist in Snaphot Manager (but still exist)") |
(→Troubleshooting: Added "Snapshot Still Active?" and renamed "Can't Remove Snapshot" to "Can't Commit Snapshot") |
Revision as of 14:07, 18 August 2010
Basic Virtual Machine Tasks
Start / Stop / Bounce a VM
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want to affect
- Either right-click or use the commands in the right-hand pane to Power off, Power on or Reset as required
This is the same as using the Power or Reset buttons on the front of a physical server. It's also possible to send Windows shut down commands to the VM: right-click over the VM and select the appropriate Shut Down Guest or Restart Guest command. This tells VMware Tools to attempt the required action, though open applications can prevent the OS from shutting down successfully.
Remote Console (KVM like) Access
If possible, it's preferable to use normal remote access software (eg RDP or VNC). This ensures that the load caused by remote access is contained within the VM, rather than falling on the ESX.
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want and either right-click and select Open Console, or use the Open Console command in the right-hand pane
CD-ROM Access
There are essentially two ways to present a CD-ROM image to a VM; using an ISO image is by far the most flexible. Even if you only have a physical CD and expect to use it once, it's still recommended that you create an ISO image from the CD and use that instead. The alternative is to put the physical media into the ESX hosting the VM (use Host Device when adding the CD to the VM).
To present an ISO image to a VM
- If it's not already there, copy the ISO image to an NFS share or other ESX accessible datastore
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want to attach the ISO image to
- Right-click and select Edit Settings...
- Highlight the CD/DVD Drive, and select the Datastore ISO file
- Hit Browse and go into the appropriate datastore
- Select the required ISO file
- Tick the Connected check box
- Hit OK, the ISO will be attached to the VM's CDROM drive as if you'd inserted a CD into a physical drive
- Once you've finished using the ISO, go back into the VM's settings and untick the Connected check box
- To boot a VM to a CDROM ISO, check the "Connected at power on" checkbox and restart the VM's OS
To create an ISO image
You'll need to download an ISO creator. There are many freeware utilities available, but one that's tried and tested is ISORecorder. Generally you can create ISO images from either a physical CD or the contents of a folder (if you have ISORecorder installed, right-click over the disk or folder and select "Create ISO image").
Change Network Connection
In similar fashion to being able to swap over a network cable for a physical server, the network connection of a virtual machine can be changed on the fly
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want to change the network connection on
- Right-click and select Edit Settings...
- Highlight the appropriate Network Adapter, and select the new Network Connection
- Change takes effect as soon as OK is hit
Add an Additional Network Connection
When adding additional network connections to any system you must consider network security, for example no system should ever be given access to both Private and Public networks.
- Shut down the Application and OS of the virtual machine
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want to add the network connection to
- Right-click and select Edit Settings...
- Hit the Add... button and select Ethernet Adapter, and hit Next
- Select the appropriate network connection and hit Next, and then Finish
- Power on the virtual machine
Change Physical Memory / CPU's Allocation
- Shut down the Application and OS of the virtual machine
- Log into the Virtual Infrastructure - Management Access
- Under the Inventory button, ensure Hosts and Clusters is ticked
- Highlight the VM you want to change the memory or CPU allocation on
- Right-click and select Edit Settings...
- Highlight the appropriate setting, Memory or CPUs, and edit as required.
- Apply the change by hitting OK
- Power on the virtual machine
Config Settings
Disable Shutdown Event Tracker
If the ESX servers are running as an HA cluster then the VMs MUST be able to start up fully and automatically after a reboot. The Windows Shutdown Event Tracker asks why you're shutting down or rebooting a system and, following an unexpected shutdown, halts the start-up of the system pending information from the user. This isn't a problem for servers where all applications run as a service, but where (GUI) applications need to start it stops systems being restarted fully, which prevents VMware HA from operating effectively.
To disable...
- Start Group Policy Object Editor (Start | Run | gpedit.msc)
- Go to Computer Configuration\Administrative Templates\System
- Set Display Shutdown Event Tracker to Disabled
Set Low Risk File Types
If mapped drives are being used, .bat and .exe files need to be declared as low risk file types to stop Open file - Security Warning prompts being displayed when trying to run them from mapped drives. This is particularly a problem if software is set to auto-start by placing shortcuts in the StartUp directory, as the software won't auto-start.
To set...
- Start Group Policy Object Editor (Start | Run | gpedit.msc)
- Go to User Configuration\Administrative Templates\Windows Components\Attachment Manager
- Set the "Default risk level for file types" to Enabled
- Specify the low risk extensions as
.bat;.exe
Increase Disk Size
Increasing the virtual disk size provided to a VM is straightforward (though be aware that any existing snapshots need to be deleted first)...
- Go into the VM's settings
- Increase the size of the disk and apply
- Within the VM's OS, rescan the disk, and the new space will be visible
The trick is to extend the logical partition within the OS. Depending on the original partition type and the OS, the options vary.
Increase Logical Partition
Generally boot or system disks cannot be extended whilst the OS is up, whereas normal data disks can be in later OSs, but this is still not ideal. It's generally most reliable to plan for system downtime, and use a utility to extend the partition whilst it's offline. Especially in a virtual environment, there is no excuse for not making a backup of the partition first.
For Windows 2008 machines this isn't a problem.
For Windows 2003 machines...
Partition | Type | Options |
---|---|---|
System | Either | Cannot be extended |
Data | Basic | Cannot be extended, can convert to Dynamic, but this will require a brief IO interruption. |
Data | Dynamic | Can be extended on the fly, but a new volume is tagged onto the end of the existing partition to create a larger one made up of two volumes |
Download a copy of the GParted Live CD - http://gparted.sourceforge.net/livecd.php - which the VM will need to boot to
- Note: there is a bug in some recent versions of GParted (v0.5.0-3 and v0.5.1-1 are known to have issues, v0.4.6-1 is known to work), whereby the boot fails with the following error
Unable to find a medium containing a live file system
- Increase the relevant VMDK size through the VM's options
- Start snapshotting (or take a full backup of the machine)
- Attach GParted ISO to VM and restart
- If VM doesn't boot to the ISO, force the VM to boot to BIOS (Options | Advanced | Boot Options in VM Settings) and change the VM's boot order
- Boot into GParted Live (accepting the default options, except setting language to English UK)
- Once in GParted, follow the interface to resize the partition, then apply the changes to action them
- Restart VM and verify all is good
- Turn off snapshotting
VM's With Lots Of Disks
It can be very difficult to identify the correct disk within VMware to increase when a VM has a large number of VMDK's.
- Disk numbering behaves differently, with Windows starting at Disk 0, and VMware starting at Disk 1
- SCSI IDs will match, but Windows SCSI bus numbers are normally 0, whereas VMware bus numbers will increment (so VM disk 35 (Win disk 34) could be 2:4 in VMware, but 0:4 within the OS)
- Disk size can be a useful method of validation (if differing disk sizes are used)
- Windows drive letters are useless, never assume D: is disk 2 for example
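The cross-checks above can be sketched as a small helper. This is only an illustration, assuming you have noted each virtual disk's number, SCSI address and size by hand from the VM's settings; the `VmwareDisk` type and `candidate_vmware_disks` function are hypothetical names, not part of any VMware tool, and the numbering rule is the rule of thumb described above rather than a guarantee.

```python
from typing import NamedTuple


class VmwareDisk(NamedTuple):
    """One virtual disk as listed in the VM's settings (values noted by hand)."""
    number: int    # "Hard disk N" number, VMware starts counting at 1
    scsi_bus: int  # first half of the SCSI address, e.g. 2 in 2:4
    scsi_id: int   # second half of the SCSI address, e.g. 4 in 2:4
    size_gb: int


def candidate_vmware_disks(win_disk, win_scsi_id, win_size_gb, vmware_disks):
    """Return the VMware disks consistent with what Windows reports.

    Assumes the rules of thumb above: VMware number = Windows number + 1,
    SCSI IDs match, and sizes agree. The SCSI bus is deliberately ignored
    because Windows normally reports bus 0 regardless.
    """
    return [
        disk for disk in vmware_disks
        if disk.number == win_disk + 1
        and disk.scsi_id == win_scsi_id
        and disk.size_gb == win_size_gb
    ]
```

If the function returns anything other than exactly one disk, treat that as a sign that one of the assumptions doesn't hold for this VM and check again before resizing anything.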
Rename a VM
Renaming a virtual machine just by right-clicking over the machine and renaming does not alter the underlying file and folder names. To ensure that these changes take place you must move the VM to another datastore, ie
- Shutdown the VM
- Rename the VM in vCenter
- Migrate the VM and move it to another Datastore
- Restart the VM
If you can't move the VM to another datastore then it gets much more complicated, requiring faffing around in the service console.
- Shutdown the VM
vmware-cmd -s unregister /vmfs/volumes/datastore/vm-old/vm-old.vmx
mv /vmfs/volumes/datastore/vm-old /vmfs/volumes/datastore/vm-new
cd /vmfs/volumes/datastore/vm-new
vmkfstools -E vm-old.vmdk vm-new.vmdk
find . -name '*.vmx*' -print -exec sed -i -e 's/vm-old/vm-new/g' {} \;
- For every file that hasn't been renamed (.vmsd etc.), rename it, eg
mv vm-old.vmx vm-new.vmx
vmware-cmd -s register /vmfs/volumes/datastore/vm-new/vm-new.vmx
The above was taken from http://www.yellow-bricks.com/2008/02/10/howto-rename-a-vm/
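Before letting sed loose on the .vmx files, it can be worth previewing which lines the substitution would touch. The following is a minimal sketch of that idea, assuming you have the file contents available as text; `rename_references` is a hypothetical helper written for this page, not a VMware utility.

```python
def rename_references(vmx_text, old_name, new_name):
    """Replace old_name with new_name on every line of vmx_text.

    Returns (new_text, changed), where changed is a list of
    (original_line, rewritten_line) pairs for review.
    """
    changed = []
    rewritten = []
    # keepends=True preserves the file's own line endings
    for line in vmx_text.splitlines(keepends=True):
        new_line = line.replace(old_name, new_name)
        if new_line != line:
            changed.append((line.rstrip("\r\n"), new_line.rstrip("\r\n")))
        rewritten.append(new_line)
    return "".join(rewritten), changed
```

Like the sed expression, this is a plain text substitution, so a short VM name that also appears inside unrelated values would be rewritten too; reviewing the `changed` list is the point of the exercise.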
Clone a VM
This can be done as
- Hot clone - Source VM is left running, its disks are quiesced, and cloned. Can cause problems as the new machine behaves as if it was ungracefully shut down when first started, but normally successful. Source machine needs to be relatively quiet.
- Cold clone - Source VM is shut down first, preferable to a hot clone if possible.
Snapshots and Cloning
Snapshots are deleted during a clone, in that cloning a machine that has existing snapshots results in the post-snapshot changes being merged into the new machine.
In order to retain the snapshots, the virtual machine needs to be cloned manually (untested procedure!!)...
- Copy all of the VMs files into a new directory (using vmkfstools --nosparse option).
- Correct the .vmx file to match new paths, update VM name, and delete the UUID line (VMware will prompt to generate a new one when the VM is started).
- Register the new VM in vCentre and double check the VM is as expected.
- Power on (you'll get an IP conflict if it's on the same portgroup as the original)
Shutdown VM via Service Console
- To determine the state of a Virtual Machine running from the local ESX
vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx getstate
getstate() = on
- Shutdown a Virtual Machine running from the local ESX forcefully
vmware-cmd /vmfs/volumes/SAN1/ServerA/ServerA.vmx stop hard
stop(hard) = 1
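If these commands are being driven from a script, the result lines shown above are easy to pick apart. A minimal sketch, assuming the output always has the `call = value` shape seen in the two examples; `parse_vmware_cmd_result` is a hypothetical helper name.

```python
def parse_vmware_cmd_result(output):
    """Split a result line such as 'getstate() = on' into ('getstate()', 'on')."""
    call, separator, value = output.strip().partition(" = ")
    if not separator:
        # Anything that doesn't look like 'call = value' is reported, not guessed at
        raise ValueError("unexpected vmware-cmd output: %r" % output)
    return call, value
```

For example, `parse_vmware_cmd_result("getstate() = on")` gives `("getstate()", "on")`, so a script can test the second element before deciding whether a hard stop is needed.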
Troubleshooting
See also Virtual Centre Troubleshooting
Can't Connect to VM Console
Error connecting: Cannot connect to host...
- This is caused by a TCP connection failure to the ESX server the VM is hosted on. Using telnet or a port test utility, confirm you can connect on both TCP 902 and 903 from your machine to the ESX server.
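If telnet or a port test utility isn't to hand, the same check can be sketched in a few lines of Python run from your own machine. This is an illustrative helper rather than a VMware tool; the function names and the 3 second timeout are assumptions.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


def check_console_ports(host, ports=(902, 903)):
    """Map each console port to True/False depending on whether it is reachable."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `check_console_ports("esx01.example.com")` (a made-up host name) returns a dictionary keyed by port; any `False` value points at the port that a firewall or routing problem is blocking.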
Can't Deploy VM
The VirtualCenter server is unable to decrypt passwords stored in the customization specification
- Bizarrely caused by the Virtual Centre running out of disk space, free up some space and all will be well.
A general system error occurred: Failed to create journal file provider
- Check ESX disks are not full
Can't Start VM
HA Admission Control
- Can't start VM as doing so wouldn't leave enough failover capacity in order to be able to restart failed VM's should an ESX fail. Options are to
- Reduce resource usage of VM's that are already running
- Increase cluster capacity
- Reduce the cluster's failover capacity, or allow constraints violations
- If no VMs have been recently added to the cluster, it's likely that the HA agent on one of the ESXs has stopped functioning, in which case one of the ESXs in the cluster will have a red warning/exclamation triangle. If so you can restart HA on that ESX;
- Highlight this ESX, on the Summary tab you should see a notice regarding HA problems
- Run the Reconfigure for HA command, this will re-install the HA agent on the ESX
Failed to relocate virtual machine
- DRS is attempting to relocate a VM at power up, and this relocation is failing
- Reattempt to power on machine
- Manually migrate to a less loaded ESX and reattempt power on
Access to VMFS storage
- ESX may have lost connectivity to VMFS partition on which VM resides
VMFS full
- If VMFS is full, the ESX won't be able to write to the VM's logs when it starts it up, causing VM start-up to fail
ESX licensing
- Either ESX isn't licensed, or has lost contact with the license server (VI3) for a long period of time
Waiting for question to be answered
- Generally after changes (such as cold migrations or new deployments), a VM may need to have a question answered before it can continue to power on
Could not power on VM: No swap file. Failed to power on VM
- The ESX you're starting the VM up on can't get proper access to the VM's files, either because
- The VM is already powered up on another ESX
- The VM's files have been corrupted
- If the ESX the virtual machine is/was on has failed then it's likely that only the ESX's network connections have failed; the virtual machines are still running on the ESX, but are isolated from the network.
- To cause a full HA failover, pull the power cables out of the ESX to kill it completely
- Alternatively, attempt to restore network connectivity to allow the VMs to be reachable again
- If there are no ESX failures, then the VM's files are probably corrupted. The VM needs to be re-registered by removing and re-adding it to the inventory.
Can't VMotion a VM
VM network doesn't exist at destination
- VM is using a particular port group which doesn’t exist on the destination ESX
ESX / network too busy
- VMotion can't copy across the VM's memory contents/changes quickly enough. An alternative is to use a Low Priority VMotion, which is more likely to succeed, but may result in the VM experiencing temporary freezes (avoids full OS downtime, but not without impact to hosted applications)
ESXs can't communicate
- ESXs need to be able to communicate via VMotion network. DNS problems and FQDN inaccuracies can also cause problems
VM is connected to CD-ROM/ISO
- The VM's CD-ROM is connected to an ISO file via the host ESX, tying it to that ESX
Can't Snapshot
Cannot create a quiesced snapshot because the create snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine
- Prevents hot-cloning or snapshot based backup of a machine, because of either of...
- VMware Tools aren't properly installed
- The machine has high transactional IO (eg Exchange, SQL, AD) and cannot pause disk access in order to create snapshot
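- See the following...
- VMware KB1009073 - Unable to take a quiesced VMware snapshot of a virtual machine (http://kb.vmware.com/kb/1009073)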
Can't Commit Snapshot
Operation timed-out
- Not unusual for large (>10 GB) redo files; the process continues and it's just vCentre reporting it as a time-out
- Check the VM's files for any activity (changes in disk sizes), very large redo files (eg 100 GB) could take 4hrs!
- Also see Snapshot Still Active?
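Checking the VM's files for activity amounts to comparing file sizes at two points in time. A minimal sketch of that comparison, assuming Python is available somewhere the VM's folder can be read; `file_sizes` and `size_changes` are hypothetical helper names.

```python
import os


def file_sizes(folder):
    """Return {filename: size_in_bytes} for every regular file in folder."""
    sizes = {}
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def size_changes(before, after):
    """Return {filename: (old_size, new_size)} for files that changed.

    A size of None means the file did not exist at that point in time.
    """
    changes = {}
    for name in set(before) | set(after):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes
```

Take one `file_sizes()` reading, wait a few minutes, take another and pass both to `size_changes()`. Any entry for the base or redo files suggests the commit is still progressing despite the reported time-out; an empty result over a longer interval suggests it has finished or stalled.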
No Snapshots Exist in Snapshot Manager (but still exist)
- Can happen if a snapshot Delete (All) fails to complete properly (eg ESX pseudo-hangs and you restart the management agents)
- Start a new snapshot
- In snapshot manager use Delete All (not Delete!)
Snapshot Still Active?
- Check Snapshot Manager; if there are snapshots listed then there are still active snapshots
- Open up Datastore Browser to the VM's folder and see if any snapshot files exist; if not then there are no active snapshots
- Check the VM's VMX file; the VMDK filename(s) will be either a snapshot or a normal flat base disk file
- EG scsi0:0.fileName = "MyVM-000001.vmdk" ← Snapshot file (snapshot running)
- EG scsi0:0.fileName = "MyVM-000001-delta.vmdk" ← Snapshot file (snapshot running)
- EG scsi0:0.fileName = "MyVM.vmdk" ← Base disk file (no snapshot running)
- EG scsi0:0.fileName = "MyVM-flat.vmdk" ← Base disk file (no snapshot running)
- If there are no snapshots running but snapshot files exist, then the files can be deleted (if you're sure!)
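The VMX check above can be sketched as a short script. It is only an illustration, assuming the naming pattern shown in the examples (a six digit sequence number, optionally followed by -delta, marks a snapshot disk); `snapshot_disks` is a hypothetical helper, not a VMware tool.

```python
import re

# Matches lines such as: scsi0:0.fileName = "MyVM-000001.vmdk"
DISK_LINE = re.compile(
    r'^\s*([a-z]+\d+:\d+)\.fileName\s*=\s*"([^"]+\.vmdk)"\s*$', re.IGNORECASE
)
# Snapshot disks carry a six digit sequence number, optionally followed by -delta
SNAPSHOT_NAME = re.compile(r'-\d{6}(-delta)?\.vmdk$', re.IGNORECASE)


def snapshot_disks(vmx_text):
    """Return {device: filename} for disks whose filename looks like a snapshot."""
    found = {}
    for line in vmx_text.splitlines():
        match = DISK_LINE.match(line)
        if match and SNAPSHOT_NAME.search(match.group(2)):
            found[match.group(1)] = match.group(2)
    return found
```

An empty result means every disk in the VMX points at a base disk file. Note that a VM whose own name happens to end in a dash and six digits would be reported as a false positive, so treat the output as a prompt to look closer rather than as proof.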
Can't Customise
Windows setup could not configure Windows to run on this computer's hardware
Windows could not complete the installation. To install Windows on this computer, restart the installation.
- The guest customisation is failing because either
- The virtual hardware has changed (especially disk type) since the original machine was created
- Sysprep can't customise the machine because it doesn't have administrator rights, this can occur where a DC's users have been offloaded to LDS