Procedures (ESX)

Links to VMware KB docs...

Quick commands

vmware -v                 ESX3 software version and build
vmware -l                 ESX4 software version and build
vm-support -x             List running VM's
vmware-cmd -l             List config files of VM's registered to ESX
esxcfg-rescan vmhba0      Perform LUN rescan on vmhba0
esxcfg-vmhbadevs          List HBA LUN mappings
esxcfg-scsidevs --hbas    List HBA devices
esxcfg-mpath -l           List all LUNs and their paths

ESX Shutdown / Reboot

ESX

  • Shutdown a host ready for power off
    • shutdown -h now
  • Restart a host
    • shutdown -r now

ESXi

  • Shutdown a host ready for power off
    • /bin/host_shutdown.sh
  • Restart a host, either of
    • /bin/host_reboot.sh
    • reboot

High Availability Stop/Start

  • Stop HA...
    • /etc/init.d/VMWAREAAM51_vmware stop
  • Start HA...
    • /etc/init.d/VMWAREAAM51_vmware start

VMware Management Agent Restart

ESX

service mgmt-vmware restart
Stopping VMware ESX Server Management services:
  VMware ESX Server Host Agent Services                   [  OK  ]
  VMware ESX Server Host Agent Watchdog                   [  OK  ]
  VMware ESX Server Host Agent                            [  OK  ]
Starting VMware ESX Server Management services:
  VMware ESX Server Host Agent (background)               [  OK  ]
  Availability report startup (background)                [  OK  ]

If this fails to stop the service, you can try to manually kill the processes.

  1. Determine the PID's of the processes
    • ps -auxwww | grep vmware-hostd
    • which should give output like the following, in which case the PID's are 2807 and 2825...
    • root 2807 0.0 0.3 4244 884 ? S Mar10 0:00 /bin/sh /usr/bin/vmware-watchdog -s hostd -u 60 -q 5 -c /usr/sbin/vmware-hostd-support /usr/sbin/vmware-hostd -u
    • root 2825 0.1 12.0 72304 32328 ? S Mar10 1:14 /usr/lib/vmware/hostd/vmware-hostd /etc/vmware/hostd/config.xml -u
    • root 13848 0.0 0.2 3696 556 pts/0 R 08:43 0:00 grep vmware-hostd
  2. Kill the PID's using kill -9 pid
    • So, for example, kill -9 2807 and kill -9 2825
  3. Then reattempt the service restart
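
Steps 1 and 2 can be combined into a small filter that pulls the PID's out of the ps listing. This is a minimal sketch, assuming the ps column layout shown above (PID in the second field); hostd_pids is a hypothetical helper name, not a VMware tool.

```shell
# Hypothetical helper: print the PIDs of vmware-hostd processes found in
# ps output read from stdin, skipping the grep process itself.
hostd_pids() {
    awk '/vmware-hostd/ && !/grep/ { print $2 }'
}

# Demo using the sample listing from the steps above.
sample='root 2807 0.0 0.3 4244 884 ? S Mar10 0:00 /bin/sh /usr/bin/vmware-watchdog -s hostd -u 60 -q 5 -c /usr/sbin/vmware-hostd-support /usr/sbin/vmware-hostd -u
root 2825 0.1 12.0 72304 32328 ? S Mar10 1:14 /usr/lib/vmware/hostd/vmware-hostd /etc/vmware/hostd/config.xml -u
root 13848 0.0 0.2 3696 556 pts/0 R 08:43 0:00 grep vmware-hostd'

printf '%s\n' "$sample" | hostd_pids
```

On a live host the same filter could be fed from ps -auxwww and its output passed to kill -9, but only after the normal service stop has failed.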

To also restart the VirtualCenter agent (vpxa), use

service vmware-vpxa restart

ESXi
services.sh restart

VMware Web Access Restart

service vmware-webAccess restart
Stopping VMware ESX Server webAccess:
   VMware ESX Server webAccess                             [FAILED]
Starting VMware ESX Server webAccess:
   VMware ESX Server webAccess                             [  OK  ]

VM Start

On the ESX that currently owns the VM...

  1. Get the VM's config file path
    • vmware-cmd -l | grep VM_Name
  2. Start the VM using the path found
    • vmware-cmd /vm_path/VM_Name.vmx start
  3. Wait for start-up to complete. If start-up fails, check the VM's log
    • less /vm_path/vmware.log
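
The lookup in step 1 can be scripted so the path does not have to be copied by hand. This is a sketch only: vmx_path_for is a hypothetical helper, the datastore and VM names are placeholders, and it assumes the .vmx file is named after the VM, which is not always the case.

```shell
# Hypothetical helper: given `vmware-cmd -l` output on stdin, print the
# first config path whose file name is <VM_Name>.vmx.
vmx_path_for() {
    grep -F "/$1.vmx" | head -n 1
}

# Demo with a made-up listing (datastore and VM names are placeholders).
listing='/vmfs/volumes/datastore1/web01/web01.vmx
/vmfs/volumes/datastore1/db01/db01.vmx'

printf '%s\n' "$listing" | vmx_path_for db01
```

On the ESX itself the listing would come from vmware-cmd -l, and the resulting path would be passed to vmware-cmd with the start argument.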

Maintenance Mode

To put the ESX into maintenance mode when there is no access from the Infrastructure Client (VCP), use the following commands. Use with caution.

Put the ESX into maintenance mode:

vimsh -n -e /hostsvc/maintenance_mode_enter

Check the ESX is in maintenance mode:

vimsh -n -e /hostsvc/runtimeinfo | grep inMaintenanceMode | awk '{print $3}'

Exit maintenance mode:

vimsh -n -e /hostsvc/maintenance_mode_exit


TCPDump Network Sniffer

Basic network sniffer available in Service Console

TCPDump instruction manual

EG To sniff all traffic on the Service Console interface, vswif0, going to/from 159.104.224.70

tcpdump -i vswif0 host 159.104.224.70


Security

Password Complexity Override

To change a user (or root) password to one that fails password complexity checking:

  1. Disable PAM module
    • esxcfg-auth --usepamqc -1 -1 -1 -1 -1 -1
  2. Disable complexity checker
    • esxcfg-auth --usecrack -1 -1 -1 -1 -1 -1
  3. Change password
  4. Re-enable PAM module
    • esxcfg-auth --usepamqc=-1 -1 -1 -1 8 8

Regenerate Certificate

You might need to regenerate certificates if you

  • Change ESX host name
  • Accidentally delete the certificates

To generate new Certificates for the ESX Server host...

  1. Change directories to /etc/vmware/ssl.
  2. Create backups of any existing certificates:
    • mv rui.crt orig.rui.crt
    • mv rui.key orig.rui.key
  3. Restart the vmware-hostd process:
    • service mgmt-vmware restart
  4. Confirm that the ESX Server host generated new certificates by running the following command and comparing the time stamps of the new certificate files with those of orig.rui.crt and orig.rui.key
    • ls -la
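
The time stamp comparison in step 4 can also be done with the shell's -nt (newer than) test instead of reading the ls output by eye. This is a minimal sketch; certs_regenerated is a hypothetical helper, and the demo uses a scratch directory in place of /etc/vmware/ssl.

```shell
# Hypothetical helper: succeed if rui.crt and rui.key in the given directory
# are newer than their orig.* backups. Note that in bash -nt is also true
# when the backup file is missing, so this assumes step 2 was carried out.
certs_regenerated() {
    dir=$1
    [ "$dir/rui.crt" -nt "$dir/orig.rui.crt" ] &&
    [ "$dir/rui.key" -nt "$dir/orig.rui.key" ]
}

# Demo in a scratch directory standing in for /etc/vmware/ssl.
demo=$(mktemp -d)
touch -t 200801010000 "$demo/orig.rui.crt" "$demo/orig.rui.key"
touch "$demo/rui.crt" "$demo/rui.key"

if certs_regenerated "$demo"; then
    echo "new certificates present"
else
    echo "certificates not regenerated"
fi
rm -rf "$demo"
```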


NIC Operations

Get NIC Firmware/Driver versions

  • ESX4
    • ethtool -i vmnic<no>
    • Where <no> is your NIC no, eg ethtool -i vmnic0
  • ESX3i / ESX4i
    • vsish -e get net/pNics/vmnic<no>/properties
    • Where <no> is your NIC no, eg vsish -e get net/pNics/vmnic1/properties
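
When checking several NICs it can help to reduce the ethtool -i output to one line per NIC. This is a sketch that assumes the usual "key: value" layout printed by ethtool -i; nic_versions is a hypothetical helper and the sample values are made up.

```shell
# Hypothetical helper: reduce `ethtool -i` output (stdin) to a single line.
nic_versions() {
    awk -F': ' '
        $1 == "driver"           { drv = $2 }
        $1 == "version"          { ver = $2 }
        $1 == "firmware-version" { fw  = $2 }
        END { printf "driver=%s version=%s firmware=%s\n", drv, ver, fw }
    '
}

# Demo with made-up values in the usual ethtool -i layout.
sample='driver: e1000
version: 7.3.15
firmware-version: N/A
bus-info: 0000:02:01.0'

printf '%s\n' "$sample" | nic_versions
```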

Display ARP Cache

  • ESX
    • arp -a
  • ESXi
    • esxcli network neighbor list

HBA and SAN Operations

VMFS / LUN Addition

The new LUN needs to be carved up and presented to all ESX's that should see it (normally all ESX's from a particular cluster). Once completed, follow the procedure below to add to the ESX's...

  1. Pick ESX in cluster with lowest load
  2. Go to Storage Adapters, hit Rescan... and untick the Scan for New VMFS Volumes
  3. Once the scan has completed, go to Storage, and hit Add Storage...
  4. Click Next > to select Disk/LUN storage
  5. Select the appropriate device and click Next >
  6. Check the current disk layout (ie it's blank if it's meant to be) and click Next >
  7. Give the datastore an appropriate name, and click Next >
  8. Select an appropriate block size (this limits maximum VMDK size), and click Next >
  9. Review config and click Finish
  10. On the remaining ESX's, go to Storage Adapters, hit Rescan... (leave both boxes checked)
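
The block size choice in step 8 can be looked up from the largest VMDK the datastore needs to hold. This is a minimal sketch, assuming a VMFS-3 datastore, where the commonly documented limits are just under 256 GB, 512 GB, 1 TB and 2 TB for 1, 2, 4 and 8 MB blocks; vmfs3_block_size_mb is a hypothetical helper, and the figures should be checked against the documentation for your ESX version.

```shell
# Hypothetical helper: print the smallest VMFS-3 block size (in MB) that can
# hold a VMDK of the given size in GB. Limits are assumed, see the lead-in.
vmfs3_block_size_mb() {
    need_gb=$1
    if   [ "$need_gb" -lt 256 ];  then echo 1
    elif [ "$need_gb" -lt 512 ];  then echo 2
    elif [ "$need_gb" -lt 1024 ]; then echo 4
    elif [ "$need_gb" -lt 2048 ]; then echo 8
    else
        echo "no VMFS-3 block size supports ${need_gb} GB" >&2
        return 1
    fi
}

# Example: block size needed for a 600 GB VMDK.
vmfs3_block_size_mb 600
```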

SAN LUN ID

The SAN LUN ID is used by SAN admin's to identify LUN's. It's not readily available from the GUI and has to be extracted from the vml file...

So from the following...

  • /vmfs/devices/disks/vml.020006000060060160c6931100cc319eea7adddd11524149442035

you need to extract the middle 32 characters of the vml name (skip the first 10 characters after vml. and take the next 32), shown here in brackets...

  • /vmfs/devices/disks/vml.0200060000[60060160c6931100cc319eea7adddd11]524149442035

So the SAN LUN ID is 60060160c6931100cc319eea7adddd11
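
The extraction can be done with sed and cut rather than by eye. This is a sketch based on the single example above: the offsets (characters 11 to 42 after "vml.") are inferred from that example and may not hold for every device type, and san_lun_id is a hypothetical helper name.

```shell
# Hypothetical helper: strip everything up to "vml.", then keep characters
# 11-42, the 32 hex digits that make up the SAN LUN ID in the example above.
san_lun_id() {
    printf '%s\n' "$1" | sed 's/^.*vml\.//' | cut -c11-42
}

san_lun_id /vmfs/devices/disks/vml.020006000060060160c6931100cc319eea7adddd11524149442035
```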

Emulex

Find Emulex HBA Driver and Firmware Version, and WWPN

Doesn't require Emulex HBA utility to be installed

  1. cd /proc/scsi/lpfc
  2. more 1 for HBA 1
  3. more 2 for HBA 2

The Portname is the WWPN used by the SAN to identify the HBA's.

[root@uklonesxp2 lpfc]# more 1
Emulex LightPulse FC SCSI 7.1.14_vmw1
Emulex LightPulse LP1050 2 Gigabit PCI Fibre Channel Adapter on PCI bus 0f device 20 irq 121
SerialNum: BG70569148
Firmware Version: 1.91A1 (M2F1.91A1)
Hdw: 1001206d
VendorId: 0xf0a510df
Portname: 10:00:00:00:c9:61:73:de   Nodename: 20:00:00:00:c9:61:73:de

Link Up - Ready:
   PortID 0x645213
   Fabric
   Current speed 2G
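
To collect just the WWPN, the Portname line can be picked out with awk. This is a minimal sketch that assumes the /proc/scsi/lpfc layout shown above; lpfc_wwpn is a hypothetical helper name.

```shell
# Hypothetical helper: print the WWPN (the Portname value) from the contents
# of a /proc/scsi/lpfc/<n> file read on stdin.
lpfc_wwpn() {
    awk '$1 == "Portname:" { print $2 }'
}

# Demo using the Portname line from the sample output above.
sample='Portname: 10:00:00:00:c9:61:73:de   Nodename: 20:00:00:00:c9:61:73:de'

printf '%s\n' "$sample" | lpfc_wwpn
```

On the ESX the input would be the file itself, for example /proc/scsi/lpfc/1.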

Install Emulex HBA Utility

Can be found at Emulex Lputil.

To install lputil (example uses lpfcutil-7.1.14):

  1. Put the downloaded tgz file on the ESX server
    • EG mkdir /var/updates/Emulex-lpfcutil-7.1.14
  2. Go into folder and extract;
    • cd /var/updates/Emulex-lpfcutil-7.1.14/
    • tar -xvzf Emulex-lpfcutil-7.1.14.tgz
  3. Install;
    • ./Install.sh
[root@esx2 Emulex-lpfcutil-7.1.14]# ./Install.sh
Installing Emulex HBAAPI libraries and applications...
Installation of Emulex HBAAPI libraries and utilities is completed.
  4. Start the utility (on startup it should detect one or more HBA's);
    • /usr/sbin/lpfc/lputil
LightPulse Common Utility for Linux. Version 1.6a10 (10/7/2004).
Copyright (c) 2004, Emulex Network Systems, Inc.

Emulex Fibre Channel Host Adapters Detected: 1
Host Adapter 0 (lpfc0) is an LP1050 (Ready Mode)

HBAnyware Installation

  1. Download the Driver and Application kit for VMware from Emulex's website.
    • At time of writing the current version of package was elxvmwarecorekit-esx35-4.0a45-1.i386.rpm
  2. Copy the package to the server
    • EG pscp -pw [password] elxvmwarecorekit-esx35-4.0a45-1.i386.rpm platadmn@dtcp-esxsvce01a:/home/platadmn
  3. Install the package
    • EG rpm -ivh elxvmwarecorekit-esx35-4.0a45-1.i386.rpm

Check Emulex HBA Firmware Version

Requires the HBA Utility to be installed 1st (see above)

  1. Start the utility (on startup it should detect one or more HBA's);
    • /usr/sbin/lpfc/lputil
  2. From the Main menu, enter 2, Adapter Revision Levels
    • Example shows version 1.91a5
                   BIU: 1001206D
      Sequence Manager: 00000000
                 Endec: 00000000
  Operational Firmware: SLI-2 Overlay
                Kernel: 1.40a3
      Initial Firmware: Initial Load 1.91a5 (MS1.91A5 )
                 SLI-1: SLI-1 Overlay 1.91a5 (M1F1.91A5 )
                 SLI-2: SLI-2 Overlay 1.91a5 (M2F1.91A5 )
 Highest FC-PH Version: 4.3
  Lowest FC-PH Version: 4.3


Update Emulex HBA Firmware

  • Using HBA Utility (must be installed 1st - see above). See the Emulex website for the latest version, eg Emulex LP1050Ex

To update the firmware (example uses LP1050Ex-mf191a5)

  1. Download the zip file, and unzip to a folder (eg EmulexLP1050Ex-mf191a5)
  2. Create folder in /var/updates;
    • mkdir /var/updates/EmulexLP1050Ex-mf191a5
  3. Copy the firmware update onto the ESX
    • cp EmulexLP1050Ex-mf191a5/mf191a5.all /var/updates/EmulexLP1050Ex-mf191a5/
  4. Start the utility (on startup it should detect one or more HBA's);
    • /usr/sbin/lpfc/lputil
  5. From the Main menu, enter 3, Firmware Maintenance.
  6. If prompted, choose the HBA that is being updated.
  7. Enter 1, Load Firmware Image.
  8. Enter the full path to the firmware file, upgrade will then complete, eg
Enter Image Filename => /var/updates/EmulexLP1050Ex-mf191a5/mf191a5.all
Opening File...
End Of File
Checksum OK!!!
Reading AIF Header #1...
Validating Checksum...
Erasing Flash ROM Sectors...
100% complete
Loading Image...
First Download
100% complete
Image Successfully Downloaded...
Reading AIF Header #2...
Validating Checksum...
Erasing Flash ROM Sectors...
100% complete
Loading Image...
First Download
100% complete
Updating Wakeup Parameters...
Image Successfully Downloaded...
Reading AIF Header #3...
End Of File
Resetting Host Adapter...
Image Successfully Downloaded...


  • Using HBAnyware (must be installed 1st - see above)
  1. Download the correct firmware version from Emulex's website
  2. Extract, and copy file to server
  3. Find adapter's WWPN's
    • EG /usr/sbin/hbanyware/hbacmd ListHBAs
  4. Download new firmware version to each HBA
    • EG /usr/sbin/hbanyware/hbacmd download 10:00:00:00:c9:82:97:9e zf280a4.all

EMCgrab Collection

  1. Download the correct version from EMC's website
  2. Copy to server
    • EG pscp emcgrab_ESX_v1.1.tar platadmn@dtcp-esxsvce02a:/home/platadmn
  3. Uncompress the file
    • EG tar -xvf emcgrab_ESX_v1.1.tar
  4. Run grab (can take a few minutes, best done out of hours)
    • EG ./emcgrab.sh
  5. Results can be found in the emcgrab/outputs folder

QLogic

Find QLogic HBA Driver and Firmware Version

  1. cd /proc/scsi/qla2300
  2. more 1 for HBA 1
[root@esx1 qla2300]# more 1
QLogic PCI to Fibre Channel Host Adapter for QLA2340 :
        Firmware version:  3.03.19, Driver version 7.07.04
Entry address = 0x7dc314
HBA: QLA2312 , Serial# E79916
Request Queue = 0x3f403000, Response Queue = 0x3f414000
...
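
The firmware and driver versions can be pulled out of that file with sed. This is a minimal sketch that assumes the "Firmware version: x, Driver version y" line format shown above; qla_versions is a hypothetical helper name.

```shell
# Hypothetical helper: print firmware and driver versions from the contents
# of a /proc/scsi/qla2300/<n> file read on stdin.
qla_versions() {
    sed -n 's/.*Firmware version: *\([^,]*\), Driver version *\(.*\)$/firmware=\1 driver=\2/p'
}

# Demo using the version line from the sample output above.
sample='        Firmware version:  3.03.19, Driver version 7.07.04'

printf '%s\n' "$sample" | qla_versions
```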


Install QLogic HBA Utility

Installation instructions for the SANsurfer utility

  1. Copy the downloaded gz file (eg scli-1.7.0-12.i386.rpm.gz) to folder /var/updates (create if it doesn't exist)
    • cp scli-1.7.0-12.i386.rpm.gz /var/updates
  2. Move to the folder and uncompress the file with the following commands;
    • cd /var/updates
    • gunzip scli-1.7.0-12.i386.rpm.gz
  3. Enter the following commands to install the package, and then check it's installed;
    • rpm -iv scli-1.7.0-12.i386.rpm
    • rpm -q scli
[root@uklonesxp1 updates]# rpm -iv scli-1.7.0-12.i386.rpm
Preparing packages for installation...
scli-1.7.0-12
[root@uklonesxp1 updates]# rpm -q scli
scli-1.7.0-12


Update QLogic HBA Firmware

See the QLogic website for the latest version. You must ensure the firmware version is compatible with the currently running driver version. Requires SANsurfer to be installed 1st (see above)

  1. Put the downloaded zip file on an NFS share, eg q231x_234x_bios147.zip, and unzip to a folder
  2. Create a new folder for the update;
    • mkdir /var/updates/q231x_234x_bios147
  3. Copy the firmware onto the ESX server;
    • cp q231x_234x_bios147/QL23ROM.BIN /var/updates/q231x_234x_bios147/
  4. Move to the folder containing the update;
    • cd /var/updates/q231x_234x_bios147/
  5. Start the SANsurfer utility
    • scli
  6. Go into the HBA Utilities option
  7. Select the Save Flash option
  8. Follow the prompts to save the flash to a backup file, eg BackupROM.bin
  9. Select the Update Flash option
  10. Follow the prompts to update the flash, using the file copied to the ESX, eg QL23ROM.BIN
Enter a file name or Hit <RETURN> to abort: QL23ROM.BIN
Updating flash on HBA 0 - QLA2340 . Please wait...
Option ROM update complete. Changes have been saved to the HBA 0.
Please reboot the system for the changes to take effect.
Updating flash on HBA 1 - QLA2340 . Please wait...
Option ROM update complete. Changes have been saved to the HBA 1.
Please reboot the system for the changes to take effect.


SAN Downtime

ESX's don't like to lose the SAN, to the extent that during scheduled SAN downtime the following is recommended...

  1. Shutdown ESX's (and hosted VM's) connected to affected storage
  2. Perform SAN maintenance
  3. Restart ESX's (and hosted VM's)

If the above is not possible then it's recommended that you...

  1. Migrate away/shutdown VM's that are hosted on affected storage
  2. Un-present LUN's
  3. Rescan LUN's from ESX and confirm they disappear (any VM's on extinct storage will become greyed-out)
  4. Perform SAN maintenance
  5. Re-present LUN's
  6. Rescan LUN's from ESX and confirm that they re-appear (greyed-out VM's should reconnect)
  7. Restart / migrate VM's

Netflow

Netflow is available on ESX v3 only, and is an experimental feature. Netflow v5 is sent.

  • To start Netflow
    1. Load the module
      • vmkload_mod netflow
    2. Configure monitoring of appropriate vSwitch's to Netflow collector IP and port
      • /usr/lib/vmware/bin/vmkload_app -S -i vmktcp /usr/lib/vmware/bin/net-netflow -e vSwitch0,vSwitch1 10.20.255.31:2055
    • To reconfigure the Netflow module you must stop and restart the module
  • To confirm running
    1. Check the module is running...
      • [root@esx1 root]# vmkload_mod -l | grep netflow
      • netflow 0x9b4000 0x3000 0x298b640 0x1000 16 Yes
    2. Check the correct config is running...
      • [root@esx1 root]# ps -ef | grep netflow
      • root 2413 1 0 Feb05 ? 00:00:00 /usr/lib/vmware/bin/vmkload_app -S -i vmktcp /usr/lib/vmware/bin/net-netflow -e vSwitch0,vSwitch1 10.20.255.31:2055
  • To stop Netflow
    1. ps -ef | grep netflow
    2. kill <pid>
    3. vmkload_mod -u netflow

Change Service Console IP Information

Logged in as root, use the esxcfg-vswif command:

esxcfg-vswif <options> [vswif]

Description: Creates and updates service console network settings. This command is used if you cannot manage the ESX Server host through the VI Client because of network configuration issues.

Note that the -l option displays the name(s) of the vswif interfaces, which must be specified on the other commands, so the trailing [vswif] is not optional on most commands.

Options:

-a                    Add vswif, requires IP parameters. Automatically enables interface.
-d                    Delete vswif.
-l                    List configured vswifs.
-e                    Enable this vswif interface.
-s                    Disable this vswif interface.
-p                    Set the portgroup name of the vswif.
-i <x.x.x.x> or DHCP  The IP address for this vswif or specify DHCP to use DHCP for this address.
-n <x.x.x.x>          The IP netmask for this vswif.
-b <x.x.x.x>          The IP broadcast address for this vswif. (not required if netmask and ip are set)
-c                    Check to see if a virtual NIC exists. Program outputs a 1 if the given vswif exists, 0 otherwise.
-D                    Disable all vswif interfaces. (WARNING: This may result in a loss of network connectivity to the Service Console)
-E                    Enable all vswif interfaces and bring them up.
-r                    Restore all vswifs from the configuration file. (Internal use only)
-h                    Displays command help.

Note: You set the Service Console default gateway by editing the /etc/sysconfig/network file or through the VI Client under Configuration, DNS & Routing.

Note: You set the Service Console VLAN (to 1234) using a similar command to: esxcfg-vswitch -v1234 -p"Service Console" vSwitch0
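
Because a mistyped address can cut off the Service Console, it can be worth printing the command for review before running it. This is a sketch of a dry-run helper that combines the -i and -n options listed above; vswif_ip_cmd is a hypothetical name and the address values are placeholders.

```shell
# Hypothetical dry-run helper: print the esxcfg-vswif command that would set
# a new IP address and netmask on a vswif, without running it.
# Arguments: <ip> <netmask> <vswif>
vswif_ip_cmd() {
    printf 'esxcfg-vswif -i %s -n %s %s\n' "$1" "$2" "$3"
}

# Placeholder address values; substitute your own.
vswif_ip_cmd 192.168.1.10 255.255.255.0 vswif0
```

Running the printed command from the physical console rather than over SSH avoids losing the session when the address changes.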

Change Timezone

  1. Log into the ESX Server service console as root.
  2. Find the desired time zone under the directory /usr/share/zoneinfo
  3. Edit /etc/sysconfig/clock to show the relative path to the file representing the new time zone, and ensure that UTC and ARC are set as shown:
    • ZONE="Etc/GMT"
    • UTC=true
    • ARC=false
  4. Copy the desired time zone file to /etc/localtime
    • cp /usr/share/zoneinfo/GMT /etc/localtime
  5. Confirm that /etc/localtime has been updated with the correct zoneinfo data by comparing it to the zoneinfo file chosen in step 2. If the files are identical, your prompt will return without any output.
    • diff /etc/localtime /usr/share/zoneinfo/GMT
  6. Confirm the system and hardware clocks are correct. Use the Linux date command to check the time and, if necessary, set the system clock to the local date and time;
    • date MMDDhhmmYYYY
  7. Update the hardware clock with the current time of the system clock;
    • /sbin/hwclock --systohc
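
The file comparison can be wrapped in a check that returns a status instead of relying on empty diff output. This is a minimal sketch using cmp -s; localtime_matches is a hypothetical helper, and the demo uses scratch files rather than the real /etc/localtime.

```shell
# Hypothetical helper: succeed if the zone file ($1) is byte-identical to
# the localtime file ($2, defaulting to /etc/localtime).
localtime_matches() {
    cmp -s "$1" "${2:-/etc/localtime}"
}

# Demo with scratch files standing in for the zoneinfo file and /etc/localtime.
demo=$(mktemp -d)
printf 'zone-data' > "$demo/GMT"
cp "$demo/GMT" "$demo/localtime"

if localtime_matches "$demo/GMT" "$demo/localtime"; then
    echo "time zone file matches"
else
    echo "time zone file differs"
fi
rm -rf "$demo"
```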