Troubleshooting (vCentre)

From vWiki

VMware VirtualCenter Server service won't start

Service-specific error 2 (0x2)

  • Caused by the SQL service being unavailable; investigate why SQL is not running or not reachable.

If SQL is running on the same server and the service failed after a reboot, it's likely that the service is starting too quickly, before SQL is ready. First make the service depend on the SQL and SQL Agent services; failing that, set the vCentre service to a delayed start.

To make the VirtualCenter Server service depend on the SQL service

  1. Find the name of the SQL service
    • Find the MSSQL and SQLAgent keys in the following hive
    • HKLM\System\CurrentControlSet\Services
    • Could be MSSQL, MSSQL$SQLEXPRESS, or if you've used a named instance, something like MSSQL$VIM
  2. Make the VC service dependent on it
    • Browse to HKLM\System\CurrentControlSet\Services\vpxd
    • Add the name of the SQL service to the DependOnService value (there must still be a blank line at the end of the list)
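
The registry edit above can be sketched in Python. This is a minimal illustration rather than a VMware-supplied procedure: the function names and the example service names are assumptions, and the winreg part only works on the Windows vCentre server itself, run as an administrator. winreg reads and writes a multi-string value such as DependOnService as a list of strings and handles the terminator itself, so the blank line that regedit shows is not added by hand here.

```python
def merged_dependencies(existing, wanted):
    """Return the existing dependencies plus any wanted ones not already present.

    Names are compared case-insensitively and the original order is kept.
    """
    result = [name for name in existing if name]  # ignore empty entries
    seen = {name.lower() for name in result}
    for name in wanted:
        if name.lower() not in seen:
            result.append(name)
            seen.add(name.lower())
    return result


def update_vpxd_dependencies(wanted):
    """Write the merged list to the vpxd service key (Windows only)."""
    import winreg  # imported here so merged_dependencies works on any OS

    key_path = r"System\CurrentControlSet\Services\vpxd"
    access = winreg.KEY_READ | winreg.KEY_SET_VALUE
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0, access) as key:
        try:
            existing, _value_type = winreg.QueryValueEx(key, "DependOnService")
        except FileNotFoundError:
            existing = []  # value does not exist yet
        winreg.SetValueEx(key, "DependOnService", 0, winreg.REG_MULTI_SZ,
                          merged_dependencies(existing, wanted))


if __name__ == "__main__":
    # Placeholder names only; use the key names found in step 1.
    print(merged_dependencies(["ExistingDependency"],
                              ["MSSQL$VIM", "SQLAgent$VIM"]))
```

Export the vpxd key before changing it so that the original value can be restored.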

Interaction between vCentre and SQL can be quite poor when it comes to its start-up behaviour...

  • SQL tends to report itself as started, despite the fact that it hasn't made its database instances available yet.
  • vCentre will try to connect to SQL, then fail if it can't get in straight away

...meaning that you end up relying on the vCentre service restarting before it can connect and start up, which is far from ideal for normal operation.

Virtual Machine won't export

Generic Workarounds

  • Download the VM files from the datastore; these can then be uploaded to the intended destination and imported.
    • Remove any unnecessary log files
  • Use the OVF Tool, which can succeed where vCentre fails, to either
    • Convert locally downloaded VM files into an OVF
    • Connect to storage via the vCentre and create an OVF from there
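
As a rough sketch of the two OVF Tool options, the Python below builds the command lines for a local conversion and for an export through vCentre. It assumes ovftool is installed and on the PATH, and that the basic "ovftool <source> <target>" form and vi:// source locators apply to your version. All hosts, paths, user names, and function names are hypothetical.

```python
import subprocess


def ovftool_command(source, target):
    """Build the argument list for: ovftool <source> <target>."""
    return ["ovftool", source, target]


def vi_locator(host, inventory_path, user=None):
    """Build a vi:// source locator for a VM reached through vCentre."""
    prefix = f"{user}@" if user else ""
    return f"vi://{prefix}{host}/{inventory_path.strip('/')}"


def run_export(source, target):
    """Run ovftool and raise CalledProcessError if it reports a failure."""
    subprocess.run(ovftool_command(source, target), check=True)


if __name__ == "__main__":
    # Option 1: convert locally downloaded VM files (illustrative paths).
    print(" ".join(ovftool_command("/tmp/My_VM/My_VM.vmx",
                                   "/tmp/export/My_VM.ovf")))
    # Option 2: read the VM through vCentre (illustrative inventory path).
    source = vi_locator("vc-server", "MyDatacenter/vm/My_VM", user="administrator")
    print(" ".join(ovftool_command(source, "/tmp/export/My_VM.ovf")))
```

The script only prints the commands; call run_export once the source and target have been checked.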

Failed to Export Virtual Appliance: An item with the same key has already been added

  • Caused by the VM being exported having an active snapshot.

To resolve...

  • Delete / Consolidate the VM's snapshot(s)

Failed to export virtual appliance: Unexpected meta section

  • Sometimes the exporter doesn't properly handle creating the local folder to export to
  • Discrepancy between VMX file and actual virtual hardware config
  • Snapshot files (especially VMSD) may remain despite there being no active snapshot

To resolve...

  • Re-attempt
  • Back up and then delete any leftover snapshot files (.vmsd and *-0000x.vmdk files), assuming you're sure there are no active snapshots - see Snapshot Still Active?
  • Check the VMX file for any discrepancies between it and the actual virtual hardware; back up and then correct the VMX file
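
To help spot leftover snapshot files before backing them up, a short Python sketch can filter a listing of the VM's folder. The patterns follow the .vmsd and *-0000x.vmdk names mentioned above. Also matching the accompanying -delta.vmdk extents is an assumption, as are the function name and the sample file names.

```python
import re

# Patterns for files that usually belong to snapshots:
#   My_VM-000001.vmdk / My_VM-000001-delta.vmdk  (the -delta form is an assumption)
#   My_VM.vmsd
_SNAPSHOT_PATTERNS = [
    re.compile(r"-\d{6}(-delta)?\.vmdk$", re.IGNORECASE),
    re.compile(r"\.vmsd$", re.IGNORECASE),
]


def leftover_snapshot_files(filenames):
    """Return the names that look like snapshot files, sorted."""
    return sorted(
        name for name in filenames
        if any(pattern.search(name) for pattern in _SNAPSHOT_PATTERNS)
    )


if __name__ == "__main__":
    # Hypothetical folder listing.
    listing = ["My_VM.vmx", "My_VM.vmdk", "My_VM-flat.vmdk",
               "My_VM-000001.vmdk", "My_VM.vmsd", "vmware.log"]
    for name in leftover_snapshot_files(listing):
        print(name)
```

The function only reports candidates; it deletes nothing, so the backup and the check for active snapshots remain manual steps.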

System Error

Call "PropertyCollector.RetrieveContents" for object "propertyCollector" on vCenter Server

  • Occurs when you try to perform a task on a VM (or possibly any vCentre object); it seems to be due to corruption or conflicts within the vCentre database

To resolve...

  • Re-add the VM to the inventory
    1. Ensure you know the following VM info: datastore, folder, resource pool, and anything else you'll need to restore
    2. Select the VM and Remove from Inventory
    3. Locate the .vmx file in the datastore and Add to Inventory

Can't Remove Inaccessible Virtual Machine

VM no longer exists, or is no longer required, but can't be removed from Virtual Centre as it's in an inaccessible state and powered on.

  1. Via SSH to the ESX that the VM belongs to, find its config file
    • vmware-cmd -l | grep My_VM
  2. Unregister the VM from the owning ESX
    • vmware-cmd -s unregister /path/to/My_VM.vmx
    • Should return something like unregister(/path/to/My_VM.vmx) = 1
  3. VM can now be unregistered from the VC
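
If the vmware-cmd output is captured to text, it can be checked in Python. This sketch assumes the output formats shown above: one .vmx path per line from vmware-cmd -l, and "unregister(<path>) = 1" on success. The function names are hypothetical.

```python
import re


def find_config_paths(listing, vm_name):
    """Return the lines of `vmware-cmd -l` output that contain vm_name."""
    return [line.strip() for line in listing.splitlines() if vm_name in line]


def unregister_succeeded(output):
    """True if the output reports `unregister(<path>) = 1`."""
    match = re.search(r"unregister\((.+)\)\s*=\s*(\d+)", output)
    return bool(match) and match.group(2) == "1"


if __name__ == "__main__":
    # Illustrative captured output.
    listing = "/vmfs/volumes/ds1/Other/Other.vmx\n/path/to/My_VM.vmx\n"
    print(find_config_paths(listing, "My_VM"))
    print(unregister_succeeded("unregister(/path/to/My_VM.vmx) = 1"))
```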

Orphaned VM

VMs appear orphaned if they're in the vCentre database but are no longer being reported to vCentre by the ESX. It's likely that the VM is still running (or has been restarted by HA).

  1. Locate the VM (if there's been an HA failover it's probably not where vCentre believes)
    • Connect a VI Client direct to each ESX, if the VM isn't shown, locate as follows
      1. SSH to an ESX, and cd to the VM's folder
      2. Run the following command
        • vmkfstools -D *.vmx
      3. The owning ESX's MAC address is shown at the end of the owner field (the last 12 hex digits) in the following part of the return data
        • gen 443, mode 1, owner 4d0f65e4-d55ab729-0ea0-d48564638518 mtime 8831379]
      4. Run the following command on each ESX to find the server with that MAC address
        • esxcfg-nics -l
  2. If the VM is shown on the ESX's VI Client, restart the ESX's management agent.
  3. If the VM is not shown on the ESX's VI Client, browse the datastore and register the VM
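
Pulling the MAC address out of the owner field is easy to get wrong by eye. The Python sketch below takes the lock line from vmkfstools -D and returns the last 12 hex digits of the owner field in colon-separated form, for comparison with the esxcfg-nics -l listing. The function name is hypothetical and the parsing assumes the line format shown above.

```python
import re


def owner_mac(lock_line):
    """Extract the owning host's MAC from a `vmkfstools -D` lock line."""
    match = re.search(r"owner\s+([0-9a-fA-F-]+)", lock_line)
    if not match:
        raise ValueError("no owner field found")
    tail = match.group(1).split("-")[-1]  # last dash-separated group
    if len(tail) != 12:
        raise ValueError("owner field does not end in 12 hex digits")
    return ":".join(tail[i:i + 2] for i in range(0, 12, 2)).lower()


if __name__ == "__main__":
    line = "gen 443, mode 1, owner 4d0f65e4-d55ab729-0ea0-d48564638518 mtime 8831379]"
    print(owner_mac(line))  # d4:85:64:63:85:18
```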

VI Client Slow on Windows 7

Screen redraws can sometimes get slow and irritating; disabling desktop composition (and so also the Aero/see-through desktop features) helps to alleviate this.

  1. Go to the Properties of the shortcut used to open the VI Client
  2. In the Compatibility tab, tick Disable desktop composition
  3. Restart any running VI Client instances

Note that the above has now been fixed in vSphere Client v4.1 Update 2 (build 491557). Even if you haven't upgraded your ESX or vCentre servers you can upgrade your client; find the Client download in the ESX update area - http://downloads.vmware.com/d/details/esx41u2/dHdlYnRoKmRidGRkKg==

VI Client Console White on Windows 7

When trying to open a VM console to a VM hosted on a v3 ESX, the screen is white; viewing ESX4 hosted VMs is fine. The problem appears to be due to an incompatibility with .NET, which is exposed if you install VI Clients in a particular order. To resolve...

  1. Uninstall all versions of VI Client
  2. Install the following in this order…
    1. VI Client v2.5 update 6 (build 227637)
    2. VI Client v4.0 update 2 (build 258672)
    3. VI Client v4.1 (build 258902)

If you don't have VCs with those build numbers available you'll need to source the VI Client installers, either from old vCentre install packages or direct from the VMware download site (you can download individual VI Client installers rather than the whole vCentre package).

com.vmware.converter Error

In vCentre v4.1, you can get a "Health status monitoring" alert for the vCentre server, which the vCentre Service Status page shows to be caused by an error with com.vmware.converter. Browsing to the server's page (eg https://vc-server/converter/health.xml) shows the Converter service status to be green. The fix is to copy the vmw-vc-SSLThumbprint value held for vpxd over the one held for Converter.

  1. Run ldp.exe (available in Win2k8, needs to be downloaded from Microsoft for Win2k3)
  2. Go to Connection | Connect... and enter the hostname of the local vCentre Server
  3. Go to Connection | Bind... and enter user/pass if required or just leave Bind as currently logged in user
  4. Go to View | Tree, leave the Base DN blank and click OK
  5. Double click through the following to get through to the vmw-vc-SSLThumbprint...
    1. DC=virtualcenter,DC=vmware,DC=int
    2. OU=Health,DC=virtualcenter,DC=vmware,DC=int
    3. OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
    4. CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
    5. CN=<GUID>.vpxd,CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
    6. ... copy thumbprint value from right-hand pane to notepad
  6. Double click through the following to get through to the vmw-vc-SSLThumbprint value stored by Converter health...
    1. DC=virtualcenter,DC=vmware,DC=int
    2. OU=Health,DC=virtualcenter,DC=vmware,DC=int
    3. OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
    4. CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
    5. CN=com.vmware.converter,CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int
  7. Back up the existing vmw-vc-SSLThumbprint entry to notepad
  8. Replace the existing entry...
    1. Right-click over CN=com.vmware.converter,CN=<GUID>,OU=ComponentSpecs,OU=Health,DC=virtualcenter,DC=vmware,DC=int and hit Modify
    2. In Attribute enter vmw-vc-SSLThumbprint
    3. In Values paste in the 1st value saved in notepad (strip the trailing ;)
    4. In Operation make sure Replace is selected
    5. Click Enter and then Run, then Close
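
Before pasting, it can help to confirm that the two saved thumbprints really differ and to strip the trailing ; cleanly. The Python sketch below does both. The function names and sample values are hypothetical placeholders, and the case-insensitive comparison is an assumption about how the values should be matched.

```python
def clean_thumbprint(raw):
    """Strip surrounding whitespace and any trailing ';' from a copied value."""
    return raw.strip().rstrip(";").strip()


def thumbprints_match(vpxd_value, converter_value):
    """Compare two copied thumbprints, ignoring case and trailing ';'."""
    return (clean_thumbprint(vpxd_value).upper()
            == clean_thumbprint(converter_value).upper())


if __name__ == "__main__":
    # Placeholder values, not real thumbprints.
    vpxd_value = "AA:BB:CC:DD;"
    converter_value = "11:22:33:44"
    if thumbprints_match(vpxd_value, converter_value):
        print("Thumbprints already match; the alert has another cause.")
    else:
        print("Value to paste:", clean_thumbprint(vpxd_value))
```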

It takes a while (5 - 10 mins) for the fix to take effect; reboot the vCentre server if you're feeling impatient.

Resource Pool Corruption

Can manifest itself in various ways: you might have trouble re-registering VMs, or be unable to completely delete a resource. The corruption actually occurs at the ESX, from which you might see an error message similar to Error during the configuration of the host. Can not delete non-empty group: pool<n>.

  1. Delete the resource pool from the vCentre interface (if not already done)
  2. SSH to the ESX and delete the /etc/vmware/hostd/pools.xml file
  3. Restart the management agent
    • services.sh restart
  4. Resource pool info gets refreshed to the ESX after a minute or so