Difference between revisions of "Installation (ESX)"

1,723 bytes added, 11:14, 26 October 2010
→‎High Availability: Added "Error Hints"
(→‎High Availability: Added "Command Line Interface etc" and beefed up other sections)
(→‎High Availability: Added "Error Hints")
Line 948: Line 948:
* Primary reconfigured for HA
* Primary reconfigured for HA


It's quite common for HA to go into an error state; the normal course of action is to use the '''Reconfigure for HA''' option for the ESX that's experiencing the problem.  This reinstalls the HA agent onto the ESX.  It's also common to have to do this a couple of times for it to be successful.
It's quite common for HA to go into an error state; the normal course of action is to use the '''Reconfigure for HA''' option for the ESX that's experiencing the problem.  This reinstalls the HA agent onto the ESX.  It's also common to have to do this a couple of times for it to be successful. Other things to try...
* Restart the HA process - see [[#High_Availability_Stop.2FStart|High Availability Stop/Start]]
* [[#Manually Deinstall|Deinstall HA and VPXA]] and reinstall


HA is very dependent on proper DNS; to check everything is in order, do the following from each ESX
HA is very dependent on proper DNS; to check everything is in order, do the following from each ESX. Some versions of ESX3 are case sensitive, so always use lower case: the FQDN of each ESX, the VC's FQDN and the domain suffix search should all be lower case
# Check that the hostname of the local ESX is as expected
# Check that the hostname/IP of the local ESX is as expected
#* <code> hostname </code>
#* <code> hostname </code>
#* <code> hostname -s </code>
#* <code> hostname -i </code>
#* If not check the following files
#** <code> /etc/hosts </code>
#** <code> /etc/sysconfig/network </code>
#** <code> /etc/vmware/esx.conf </code>
# Check that HA can properly resolve other ESX's in the cluster (note: only one IP address should be returned)
# Check that HA can properly resolve other ESX's in the cluster (note: only one IP address should be returned)
#* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_esx_name> </code>
#* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_esx_name> </code>
# Check that HA can properly resolve the vCentre
#* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_vc_name> </code>
# Check the vCentre server can properly resolve the ESX names
# Check the vCentre's FQDN and DNS suffix search are correct and lower case
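The lower-case check above can be scripted. The following is a minimal sketch, not a tested procedure: the function name <code>is_lower</code> and the example host names are made up for illustration, and it assumes <code>tr</code> and <code>printf</code> are available on the ESX console. It only checks the case of the names you pass in; it does not replace the <code>ft_gethostbyname</code> checks.

```shell
#!/bin/sh
# Sketch only: is_lower is a hypothetical helper, not part of ESX or HA.

# Succeeds (exit status 0) when the argument has no upper-case characters,
# i.e. when it is identical to its lower-cased form.
is_lower() {
    [ "$1" = "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" ]
}

# Check every name given on the command line and report the result.
for name in "$@"; do
    if is_lower "$name"; then
        echo "OK     $name"
    else
        echo "UPPER  $name (should be lower case)"
    fi
done
```

For example, saved as <code>check_case.sh</code> (a hypothetical file name) it could be run as <code>sh check_case.sh "$(hostname)" my_vc_name.example.com</code>, passing whichever ESX and VC names you want to check.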


Other things to try...
If you need to correct DNS names, don't be surprised if you need to reinstall HA and VPXA. It can be done without interrupting running VMs, but it's obviously a lot less stressful not to.
* Restart the HA process - see [[#High_Availability_Stop.2FStart|High Availability Stop/Start]]
* Deinstall HA - see below


=== Manually Deinstall ===
=== Manually Deinstall ===
Line 997: Line 1,007:
* <code> ./ft_startup </code>
* <code> ./ft_startup </code>
* <code> ./ft_shutdown </code>
* <code> ./ft_shutdown </code>
=== Error Hints ===
'''Host in HA Cluster must have userworld swap enabled'''
* ESXi servers need to have scratch space enabled
# In vCentre, go to the '''Advanced Settings''' of the ESX
# Go to '''ScratchConfig''' and locate <code>ScratchConfig.ConfiguredScratchLocation </code>
# Set to a directory with sufficient space (1GB). It can be on local storage or shared storage; the folder must exist and be dedicated to the ESX (delete its contents if you've rebuilt the ESX)
#* Format <code> /vmfs/volumes/<DatastoreName> </code>
#* EG <code> /vmfs/volumes/SCRATCH-DISK/my_esx </code>
#* Locate <code> ScratchConfig.ConfiguredSwapState </code> and set
# Bounce the ESX
'''Unable to contact primary host in cluster'''
* The ESX is unable to contact a primary ESX in the cluster, which points to some kind of networking issue
** If there are no existing HA'ed ESX's, start by looking at VC networking


== Snapshots ==
== Snapshots ==
