Troubleshooting (Ubuntu): Difference between revisions


Latest revision as of 13:34, 26 September 2016

For performance problems related to load, see High System Load

Network

No NIC

Especially after hardware changes, it's possible that the networking config no longer refers to the right interface.

  1. Use ifconfig to confirm the current network config
  2. Use dmesg | grep -i eth to ascertain what's been detected at boot time
  3. If it shows that, say, eth0 has been renamed to eth1, update the /etc/network/interfaces file to match
  4. Alternatively, force the new NIC to be eth0 by editing the /etc/udev/rules.d/70-persistent-net.rules file
    • You'll need to reboot the server for changes to take effect
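
The comparison in steps 1 to 3 can be sketched as a small shell function. This is only an illustration, not part of the original procedure: the function name missing_ifaces is hypothetical, and it assumes the configured names appear in "iface" stanzas and that the kernel lists detected interfaces under /sys/class/net.

```shell
#!/bin/sh
# Hypothetical helper: print interfaces named in "iface" stanzas that the
# kernel does not currently expose. Both arguments are optional and default
# to the usual Ubuntu locations (an assumption, adjust as needed).
missing_ifaces() {
    conf="${1:-/etc/network/interfaces}"
    sysnet="${2:-/sys/class/net}"
    # Stanzas look like "iface eth0 inet dhcp"; field 2 is the interface name.
    awk '$1 == "iface" { print $2 }' "$conf" | sort -u | while read -r name; do
        [ -e "$sysnet/$name" ] || echo "$name"
    done
}

# Example: missing_ifaces      (checks the live system)
```

Any name it prints is configured but was not detected, which is the usual sign that the NIC has been renamed.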

File System

Unable to Mount CD-ROM

Mounting the drive with the following command fails

  • mount /dev/cdrom /media/cdrom/

If /media/cdrom/ doesn't exist

  1. Create the directory with mkdir /media/cdrom

If the /dev/cdrom special device doesn't exist

  1. Check for existing mappings and devices
    • ls -l /dev/ | grep cdrom
  2. If a mapping exists but for a different drive number (e.g. cdrom2 -> sr0)
    • Then try mounting with that number
    • e.g. mount /dev/cdrom2 /media/cdrom/
  3. If no mapping exists
    • Then try creating one for one of the listed devices
    • e.g. ln -sf /dev/sr0 /dev/cdrom (use a block device such as sr0; the sg* nodes are SCSI generic character devices and can't be mounted)
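
The mapping check can also be sketched as a function that prints each cdrom symlink with its target. The name cdrom_links is hypothetical, and the sketch assumes the mappings are symlinks named cdrom* in /dev.

```shell
#!/bin/sh
# Hypothetical helper: list cdrom* symlinks and the device each points to.
# The directory argument is optional and defaults to /dev.
cdrom_links() {
    devdir="${1:-/dev}"
    for link in "$devdir"/cdrom*; do
        # Skip the unexpanded pattern (no matches) and anything not a symlink.
        if [ -L "$link" ]; then
            printf '%s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
    return 0
}

# Example: cdrom_links
```

No output means no mapping exists, which is the case covered by step 3.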

Replacing a Software RAID 1 Disk

This procedure was written from the following starting point...

  • A machine originally with two disks in RAID1 has failed, one disk has been replaced, and machine started again

...and adapted from this post http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array

  1. Back up whatever you can before proceeding; one mistake or system error could destroy your machine
  2. Confirm which disk is new, and which is old (if the new disk is blank this is easy as there will be no partition info!)
    • fdisk -l
  3. Partition the new disk the same as the original
    • sfdisk -d /dev/sda | sfdisk /dev/sdb
  4. Confirm that the layout of both disks is now the same
    • fdisk -l
  5. Add the newly created partitions to the RAID arrays
    • mdadm --manage /dev/md0 --add /dev/sdb1
    • You may have more sd partitions than md arrays; the array size returned by mdadm -D /dev/md* should roughly match the number of blocks reported by fdisk -l
  6. The arrays should now be syncing; check progress by monitoring /proc/mdstat
    • more /proc/mdstat
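
To pick out arrays that are still degraded or resyncing, the member-status field in /proc/mdstat can be filtered. This is a minimal sketch, assuming the usual status layout in which each member shows as U (up) or _ (missing or rebuilding), for example [U_]. The function name degraded_arrays is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array whose member-status
# field contains "_" (a missing or still-rebuilding member).
# The file argument is optional and defaults to /proc/mdstat.
degraded_arrays() {
    mdstat="${1:-/proc/mdstat}"
    awk '
        /^md/ { name = $1 }                  # e.g. "md1 : active raid1 sda2[0]"
        /\[[U_]*_[U_]*\]/ { print name }     # e.g. "... blocks [2/1] [U_]"
    ' "$mdstat"
}

# Example: degraded_arrays
```

Once the sync has finished, every array should show all members as U and the function should print nothing.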

Recover Deleted Files

Ideally you should recover files to a separate disk partition from the one you are attempting to recover from. This procedure should help to recover lost or corrupted files from a filesystem using Scalpel, a data recovery utility built on the foundation of Foremost.

  1. Install Scalpel
    • apt-get install scalpel
  2. Update the config file to search for the lost files (uncomment/add as necessary)
    • /etc/scalpel/scalpel.conf
    • For PHP files (not embedded in HTML) use php n 50000 <?php  ?>
  3. Create a folder for the recovered files to go to
    • mkdir /tmp/recov
  4. Launch Scalpel to trawl the disk (this will take ages, and the source disk will be under high load)
    • scalpel /dev/mapper/svr-root -o /tmp/recov/
  5. Search through recovered files to find the data of interest
    • grep -R "string you want to find" /tmp/recov/*

SSH

Server Hostname Change

If the hostname (or IP) of the server you are SSH'ing to changes, the old entry needs to be removed from your SSH known_hosts file

  • ssh-keygen -R <name or IP>

Packages

Errors etc received from apt-get

  • Error 400 Bad Request
    • Somewhat misleadingly, the problem is normally caused by being unable to contact the update server. Consider adding proxy server config to your machine
  • The following packages have been kept back
    • Package manager can hold back updates because they will cause conflicts, or sometimes because they're major kernel updates. Running aptitude safe-upgrade normally seems to force kernel updates through.

Add EOL Repository

Once a version of Ubuntu has gone End Of Life (EOL), you can't install software packages using the normal repository. On trying you'll get an error similar to

  • Failed to fetch http://gb.archive.ubuntu.com/ubuntu/pool/main/s/<package> 404 Not Found

The repository is still available, but via a different URL - http://old-releases.ubuntu.com

Edit /etc/apt/sources.list and add the following (replace hardy with your release of Ubuntu). Remove the existing Ubuntu repositories (they'll just cause errors as they're inaccessible)

# Hardy EOL
# Required
deb http://old-releases.ubuntu.com/ubuntu/ hardy main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ hardy-updates main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ hardy-security main restricted universe multiverse

# Optional
#deb http://old-releases.ubuntu.com/ubuntu/ hardy-backports main restricted universe multiverse
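
As an alternative to retyping the entries by hand, the existing lines can be rewritten to point at the old-releases host. This is a rough sketch of that variation rather than the procedure described above: the function name to_old_releases is hypothetical, and it assumes the existing entries use archive.ubuntu.com (optionally with a country prefix such as gb.) or security.ubuntu.com. It only prints the result, so review it before replacing the real file.

```shell
#!/bin/sh
# Hypothetical helper: print a sources.list with the standard Ubuntu mirror
# URLs rewritten to old-releases.ubuntu.com. The input file is not modified.
to_old_releases() {
    sed -E 's#https?://([a-z]+\.)?(archive|security)\.ubuntu\.com/ubuntu/?#http://old-releases.ubuntu.com/ubuntu/#' "$1"
}

# Example: to_old_releases /etc/apt/sources.list > /tmp/sources.list.new
```

Lines that point at other hosts (third-party repositories, for instance) pass through unchanged.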

Reboot Required?

If a package update/installation requires a reboot to complete, the following file will exist...

/var/run/reboot-required 

To see which packages caused this to be set, inspect the contents of...

/var/run/reboot-required.pkgs
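
Both checks can be combined into one function, for example for use in a login script. This is a minimal sketch: the function name reboot_status and its messages are hypothetical, and only the two file paths come from the text above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a reboot is pending and, if so, which
# packages asked for it. The flag path is optional and defaults to the
# location named above; the package list is expected alongside it as <flag>.pkgs.
reboot_status() {
    flag="${1:-/var/run/reboot-required}"
    if [ -f "$flag" ]; then
        echo "Reboot required"
        # The .pkgs file can list the same package more than once.
        if [ -f "$flag.pkgs" ]; then
            sort -u "$flag.pkgs"
        fi
    else
        echo "No reboot required"
    fi
}

# Example: reboot_status
```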

Firewall

ERROR: problem running ufw-init

If on starting or reloading ufw you receive this error, it's likely that you have a configuration problem. This is especially likely if you've needed to edit ufw's config files directly.

  1. Ensure that ufw is running
    • ufw enable
  2. Force the config to be reloaded
    • /lib/ufw/ufw-init force-reload
  3. Or if ufw failed to start use
    • /lib/ufw/ufw-init start

Doing the above should trigger the error and present a better description of what the problem is.

See http://ubuntuforums.org/showthread.php?t=1660916 for further info