(→High Availability: Added "Command Line Interface etc" and beefed up other sections) |
(→High Availability: Added "Error Hints") |
||
Line 948: | Line 948: | ||
* Primary reconfigured for HA | * Primary reconfigured for HA | ||
It's quite common for HA to go into an error state; the normal course of action is to use the '''Reconfigure for HA''' option for the ESX that's experiencing the problem. This reinstalls the HA agent onto the ESX. It's also common to have to do this a couple of times for it to be successful. | It's quite common for HA to go into an error state; the normal course of action is to use the '''Reconfigure for HA''' option for the ESX that's experiencing the problem. This reinstalls the HA agent onto the ESX. It's also common to have to do this a couple of times for it to be successful. Other things to try... | ||
* Restart the HA process - see [[#High_Availability_Stop.2FStart|High Availability Stop/Start]] | |||
* [[#Manually Deinstall|Deinstall HA and VPXA]] and reinstall | |||
HA is very dependent on proper DNS; to check everything is in order, do the following from each ESX | HA is very dependent on proper DNS; to check everything is in order, do the following from each ESX. Some versions of ESX3 are case sensitive, so always use lower case: the FQDNs of the ESX's, the VC's FQDN and the domain suffix search should all be lower case | ||
# Check that the hostname of the local ESX is as expected | # Check that the hostname/IP of the local ESX is as expected | ||
#* <code> hostname </code> | #* <code> hostname </code> | ||
#* <code> hostname -s </code> | |||
#* <code> hostname -i </code> | |||
#* If not check the following files | |||
#** <code> /etc/hosts </code> | |||
#** <code> /etc/sysconfig/network </code> | |||
#** <code> /etc/vmware/esx.conf </code> | |||
# Check that HA can properly resolve other ESX's in the cluster (note: only one IP address should be returned) | # Check that HA can properly resolve other ESX's in the cluster (note: only one IP address should be returned) | ||
#* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_esx_name> </code> | #* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_esx_name> </code> | ||
# Check that HA can properly resolve the vCentre | |||
#* <code> /opt/vmware/aam/bin/ft_gethostbyname <my_vc_name> </code> | |||
# Check the vCentre server can properly resolve the ESX names | |||
# Check the vCentre's FQDN and DNS suffix search are correct and lower case | |||
If you need to correct DNS names, don't be surprised if you need to reinstall HA and VPXA. It can be done without interrupting running VM's, but it's obviously a lot less stressful not to. | |||
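The name checks above can be partly scripted. The following is a minimal sketch only, not a VMware-supplied tool: the function names <code>is_lower_case</code>, <code>count_addresses</code> and <code>check_name</code> are made up for illustration, and it assumes <code>getent</code> is available on the machine you run it from. It uses the system resolver rather than <code>ft_gethostbyname</code>, so treat it as a complement to the HA checks, not a replacement.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of ESX or HA.

# Succeeds if the given name contains no upper-case characters.
is_lower_case() {
    case "$1" in
        *[[:upper:]]*) return 1 ;;
        *)             return 0 ;;
    esac
}

# Counts the distinct addresses in resolver output read from stdin
# (first field of each non-empty line, as printed by "getent hosts").
count_addresses() {
    awk 'NF { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Checks one name: it must be lower case and resolve to exactly one address.
# Assumes getent exists on this machine (an assumption, not from the notes above).
check_name() {
    name="$1"
    if ! is_lower_case "$name"; then
        echo "WARN: $name is not lower case"
        return 1
    fi
    n=$(getent hosts "$name" | count_addresses)
    if [ "$n" -ne 1 ]; then
        echo "WARN: $name resolved to $n addresses, expected 1"
        return 1
    fi
    echo "OK: $name"
}
```

To use it, source the file and run <code>check_name "$(hostname)"</code>, then repeat for each other ESX in the cluster and for the vCentre name.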
=== Manually Deinstall === | === Manually Deinstall === | ||
Line 997: | Line 1,007: | ||
* <code> ./ft_startup </code> | * <code> ./ft_startup </code> | ||
* <code> ./ft_shutdown </code> | * <code> ./ft_shutdown </code> | ||
=== Error Hints === | |||
'''Host in HA Cluster must have userworld swap enabled''' | |||
* ESXi servers need to have scratch space enabled | |||
# In vCentre, go to the '''Advanced Settings''' of the ESX | |||
# Go to '''ScratchConfig''' and locate <code>ScratchConfig.ConfiguredScratchLocation </code> | |||
# Set it to a directory with sufficient space (1GB). It can be on local storage or shared storage; the folder must exist and be dedicated to this ESX (delete its contents if you've rebuilt the ESX) | |||
#* Format <code> /vmfs/volumes/<DatastoreName> </code> | |||
#* EG <code> /vmfs/volumes/SCRATCH-DISK/my_esx </code> | |||
#* Locate <code> ScratchConfig.ConfiguredSwapState </code> and set | |||
# Bounce the ESX | |||
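Before pointing the ESX at a scratch folder it can help to sanity-check the folder first. This is a rough sketch under stated assumptions: <code>check_scratch_dir</code> is a hypothetical helper (not a VMware command), the 1GB threshold is taken from the step above, and it assumes a shell with <code>df -P</code>, <code>ls -A</code> and <code>awk</code> available. It only inspects the folder; it does not change any ESX setting.

```shell
#!/bin/sh
# Sketch only: check_scratch_dir is a hypothetical helper, not a VMware tool.

# Usage: check_scratch_dir <dir> [min_free_kb]
# Returns 0 if the folder exists and has enough free space, 1 otherwise.
check_scratch_dir() {
    dir="$1"
    min_kb="${2:-1048576}"   # default 1GB, expressed in KB

    if [ ! -d "$dir" ]; then
        echo "FAIL: $dir does not exist"
        return 1
    fi

    # Leftover files are only a warning: they matter if the ESX was rebuilt.
    if [ -n "$(ls -A "$dir")" ]; then
        echo "WARN: $dir is not empty (delete contents if the ESX was rebuilt)"
    fi

    # With -P, df prints one line per filesystem; field 4 is available KB.
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    if [ "$free_kb" -lt "$min_kb" ]; then
        echo "FAIL: $dir has ${free_kb}KB free, need ${min_kb}KB"
        return 1
    fi

    echo "OK: $dir"
}
```

For example, <code>check_scratch_dir /vmfs/volumes/SCRATCH-DISK/my_esx</code> checks the example folder from the steps above against the default 1GB threshold.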
'''Unable to contact primary host in cluster''' | |||
* The ESX is unable to contact a primary ESX in the cluster, which points to some kind of networking issue | |||
** If there are no existing HA-enabled ESX's in the cluster, start by looking at VC networking | |||
== Snapshots == | == Snapshots == |