Red Hat Cluster Manager: The Red Hat Cluster Manager Installation and Administration Guide
Chapter 2. Hardware Installation and Operating System Configuration
After the setup of basic system hardware, proceed with installation of Red Hat Linux on both cluster systems and ensure that they recognize the connected devices. Follow these steps:
Install the Red Hat Linux distribution on both cluster systems. If customizing the kernel, be sure to follow the kernel requirements and guidelines described in the Section called Kernel Requirements.
Reboot the cluster systems.
When using a terminal server, configure Linux to send console messages to the console port.
Edit the /etc/hosts file on each cluster system and include the IP addresses used in the cluster. See the Section called Editing the /etc/hosts File for more information about performing this task.
Decrease the alternate kernel boot timeout limit to reduce cluster system boot time. See the Section called Decreasing the Kernel Boot Timeout Limit for more information about performing this task.
Ensure that no login (or getty) programs are associated with the serial ports that are being used for the serial heartbeat channel or the remote power switch connection (if applicable). To perform this task, edit the /etc/inittab file and use a pound symbol (#) to comment out the entries that correspond to the serial ports used for the serial channel and the remote power switch. Then, invoke the init q command.
Verify that both systems detect all the installed hardware:
Use the dmesg command to display the console startup messages. See the Section called Displaying Console Startup Messages for more information about performing this task.
Use the cat /proc/devices command to display the devices configured in the kernel. See the Section called Displaying Devices Configured in the Kernel for more information about performing this task.
Verify that the cluster systems can communicate over all the network interfaces by using the ping command to send test packets from one system to the other.
If intending to configure Samba services, verify that the Samba related RPM packages are installed on your system.
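The /etc/inittab edit described in the steps above can be sketched in Python as follows. This is only an illustration: the function name, the sample entries, and the port names are assumptions, not taken from a real cluster system. The sketch works on a string rather than on the live file, so the result can be reviewed before it is written back and init q is invoked.

```python
import re


def comment_out_serial_gettys(inittab_text, ports):
    """Return inittab text with active entries that mention any of `ports` commented out."""
    # Match the port name as a whole word so that ttyS1 does not match ttyS10.
    patterns = [re.compile(r"\b" + re.escape(port) + r"\b") for port in ports]
    result = []
    for line in inittab_text.splitlines():
        active = bool(line.strip()) and not line.lstrip().startswith("#")
        if active and any(p.search(line) for p in patterns):
            line = "#" + line
        result.append(line)
    return "\n".join(result) + "\n"


# Hypothetical inittab excerpt; real entries differ from system to system.
SAMPLE = """\
id:3:initdefault:
1:2345:respawn:/sbin/mingetty tty1
S0:2345:respawn:/sbin/agetty ttyS0 9600 vt100
S1:2345:respawn:/sbin/agetty ttyS1 9600 vt100
"""

print(comment_out_serial_gettys(SAMPLE, ["ttyS1"]), end="")
```

Entries that are already commented out are left untouched, so running the function twice produces the same text.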
When manually configuring the kernel, adhere to the following kernel requirements:
Enable IP Aliasing support in the kernel by setting the CONFIG_IP_ALIAS kernel option to y. When specifying kernel options, under Networking Options, select IP aliasing support.
Enable support for the /proc file system by setting the CONFIG_PROC_FS kernel option to y. When specifying kernel options, under Filesystems, select /proc filesystem support.
Ensure that the SCSI driver is started before the cluster software. For example, edit the startup scripts so that the driver is started before the cluster script. It is also possible to statically build the SCSI driver into the kernel, instead of including it as a loadable module, by modifying the /etc/modules.conf file.
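As a rough way to confirm the required options before building a custom kernel, the following Python sketch scans the text of a kernel configuration file and reports which required options are not set to y. The function name and the sample configuration are assumptions made for illustration. The location of the configuration file depends on where the kernel source is installed and is not assumed here.

```python
REQUIRED_OPTIONS = ("CONFIG_IP_ALIAS", "CONFIG_PROC_FS")


def missing_kernel_options(config_text, required=REQUIRED_OPTIONS):
    """Return the options from `required` that are not set to y in `config_text`."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [option for option in required if option not in enabled]


# Hypothetical excerpt of a kernel configuration file.
SAMPLE_CONFIG = """\
CONFIG_PROC_FS=y
# CONFIG_IP_ALIAS is not set
CONFIG_SYSCTL=y
"""

print(missing_kernel_options(SAMPLE_CONFIG))  # prints ['CONFIG_IP_ALIAS']
```

The `required` argument can be extended with any other option names that need to be checked.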
In addition, when installing the Linux distribution, it is strongly recommended to do the following:
Gather the IP addresses for the cluster systems and for the point-to-point Ethernet heartbeat interfaces before installing a Linux distribution. Note that the IP addresses for the point-to-point Ethernet interfaces can be private IP addresses (for example, 10.x.x.x).
Optionally, reserve an IP address to be used as the "cluster alias". This address is typically used to facilitate remote monitoring.
Enable the following Linux kernel options to provide detailed information about the system configuration and events and help you diagnose problems:
Enable SCSI logging support by setting the CONFIG_SCSI_LOGGING kernel option to y. When specifying kernel options, under SCSI Support, select SCSI logging facility.
Enable support for sysctl by setting the CONFIG_SYSCTL kernel option to y. When specifying kernel options, under General Setup, select Sysctl support.
Do not place local file systems, such as /, /etc, /tmp, and /var, on shared disks or on the same SCSI bus as shared disks. This helps prevent the other cluster member from accidentally mounting these file systems, and also reserves the limited number of SCSI identification numbers on a bus for cluster disks.
Place /tmp and /var on different file systems. This may improve system performance.
When a cluster system boots, be sure that the system detects the disk devices in the same order in which they were detected during the Linux installation. If the devices are not detected in the same order, the system may not boot.
When using RAID storage configured with Logical Unit Numbers (LUNs) greater than zero, it is necessary to enable LUN support by adding the following to /etc/modules.conf:
options scsi_mod max_scsi_luns=255
After modifying modules.conf, it is necessary to rebuild the initial ram disk using mkinitrd. Refer to the Official Red Hat Linux Customization Guide for more information about creating ramdisks using mkinitrd.
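To check whether the LUN option is already present before editing the file and rebuilding the ram disk, a small Python sketch such as the following could be used. The function name and the sample lines are assumptions. The sketch only reads text and does not modify /etc/modules.conf or invoke mkinitrd.

```python
def max_scsi_luns(modules_conf_text):
    """Return the max_scsi_luns value set for scsi_mod, or None if it is absent."""
    for line in modules_conf_text.splitlines():
        # Drop trailing comments, then split the line into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if fields[:2] == ["options", "scsi_mod"]:
            for field in fields[2:]:
                key, _, value = field.partition("=")
                if key == "max_scsi_luns" and value.isdigit():
                    return int(value)
    return None


# Hypothetical modules.conf contents; the alias line is illustrative only.
SAMPLE = """\
alias eth0 tulip
options scsi_mod max_scsi_luns=255
"""

print(max_scsi_luns(SAMPLE))  # prints 255
```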
The /etc/hosts file contains the IP address-to-hostname translation table. The /etc/hosts file on each cluster system must contain entries for the following:
IP addresses and associated host names for both cluster systems
IP addresses and associated host names for the point-to-point Ethernet heartbeat connections (these can be private IP addresses)
As an alternative to the /etc/hosts file, naming services such as DNS or NIS can be used to define the host names used by a cluster. However, to limit the number of dependencies and optimize availability, it is strongly recommended to use the /etc/hosts file to define IP addresses for cluster network interfaces.
The following is an example of an /etc/hosts file on a cluster system:
127.0.0.1       localhost.localdomain         localhost
193.186.1.81    cluster2.yourdomain.com       cluster2
10.0.0.1        ecluster2.yourdomain.com      ecluster2
193.186.1.82    cluster3.yourdomain.com       cluster3
10.0.0.2        ecluster3.yourdomain.com      ecluster3
193.186.1.83    clusteralias.yourdomain.com   clusteralias
The previous example shows the IP addresses and host names for two cluster systems (cluster2 and cluster3), and the private IP addresses and host names for the Ethernet interface used for the point-to-point heartbeat connection on each cluster system (ecluster2 and ecluster3) as well as the IP alias clusteralias used for remote cluster monitoring.
Verify correct formatting of the local host entry in the /etc/hosts file to ensure that it does not include non-local systems in the entry for the local host. An example of an incorrect local host entry that includes a non-local system (server1) is shown next:
127.0.0.1 localhost.localdomain localhost server1
A heartbeat channel may not operate properly if the format is not correct. For example, the channel will erroneously appear to be offline. Check the /etc/hosts file and correct the file format by removing non-local systems from the local host entry, if necessary.
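The local host check can be automated with a short Python sketch like the one below. It assumes that only localhost and localhost.localdomain count as local names. Both that set and the function name are assumptions chosen for illustration, so adjust them to match the site's conventions.

```python
# Names treated as legitimately local (an assumption for this sketch).
LOCAL_NAMES = {"localhost", "localhost.localdomain"}


def non_local_names_on_loopback(hosts_text):
    """Return host names on the 127.0.0.1 entry that are not in LOCAL_NAMES."""
    extras = []
    for line in hosts_text.splitlines():
        # Ignore comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == "127.0.0.1":
            extras.extend(name for name in fields[1:] if name not in LOCAL_NAMES)
    return extras


BAD = "127.0.0.1 localhost.localdomain localhost server1\n"
GOOD = "127.0.0.1 localhost.localdomain localhost\n"

print(non_local_names_on_loopback(BAD))   # prints ['server1']
print(non_local_names_on_loopback(GOOD))  # prints []
```

An empty result means the local host entry is formatted as recommended.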
Note that each network adapter must be configured with the appropriate IP address and netmask.
The following is an example of a portion of the output from the /sbin/ifconfig command on a cluster system:
# ifconfig
eth0 Link encap:Ethernet HWaddr 00:00:BC:11:76:93
inet addr:192.186.1.81 Bcast:192.186.1.245 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:65508254 errors:225 dropped:0 overruns:2 frame:0
TX packets:40364135 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:19 Base address:0xfce0
eth1 Link encap:Ethernet HWaddr 00:00:BC:11:76:92
inet addr:10.0.0.1 Bcast:10.0.0.245 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:18 Base address:0xfcc0
The previous example shows two network interfaces on a cluster system: the eth0 network interface for the cluster system and the eth1 network interface for the point-to-point heartbeat connection.
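To review the address and netmask of each adapter without reading the full listing, the output can be summarized with a Python sketch such as the following. It assumes the older ifconfig output format shown in the example (Link encap and inet addr fields). Newer versions of the tool print a different layout that this sketch does not handle. The function name is an assumption.

```python
import ipaddress
import re

IFACE_RE = re.compile(r"^(\S+)\s+Link encap:")
INET_RE = re.compile(r"inet addr:(\S+)\s+Bcast:\S+\s+Mask:(\S+)")


def interface_addresses(ifconfig_output):
    """Map each interface name to its address and netmask as an IPv4Interface."""
    result = {}
    current = None
    for line in ifconfig_output.splitlines():
        iface = IFACE_RE.match(line)
        if iface:
            current = iface.group(1)
        inet = INET_RE.search(line)
        if inet and current:
            addr, mask = inet.groups()
            result[current] = ipaddress.ip_interface(f"{addr}/{mask}")
    return result


# Lines taken from the example output above.
SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 00:00:BC:11:76:93
inet addr:192.186.1.81 Bcast:192.186.1.245 Mask:255.255.255.0
eth1 Link encap:Ethernet HWaddr 00:00:BC:11:76:92
inet addr:10.0.0.1 Bcast:10.0.0.245 Mask:255.255.255.0
"""

for name, iface in interface_addresses(SAMPLE).items():
    print(name, iface.ip, iface.network)
```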
It is possible to reduce the boot time for a cluster system by decreasing the kernel boot timeout limit. During the Linux boot sequence, the bootloader allows for specifying an alternate kernel to boot. The default timeout limit for specifying a kernel is ten seconds.
To modify the kernel boot timeout limit for a cluster system, edit the /etc/lilo.conf file and specify the desired value (in tenths of a second) for the timeout parameter. The following example sets the timeout limit to three seconds:
timeout = 30
To apply any changes made to the /etc/lilo.conf file, invoke the /sbin/lilo command.
Similarly, when using the GRUB boot loader, modify the timeout parameter in /boot/grub/grub.conf to specify the number of seconds before timing out. To set this interval to 3 seconds, set the parameter as follows:
timeout = 3
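Because the two boot loaders use different units (tenths of a second in /etc/lilo.conf, seconds in /boot/grub/grub.conf), it is easy to mix the values up. The following Python sketch shows the arithmetic. The function name is an assumption, and the sketch only formats the lines in the style used in the examples above without editing either file.

```python
def timeout_lines(seconds):
    """Return the (lilo.conf, grub.conf) timeout lines for a whole number of seconds."""
    if seconds < 0 or int(seconds) != seconds:
        raise ValueError("seconds must be a non-negative whole number")
    seconds = int(seconds)
    lilo_value = seconds * 10  # lilo.conf counts in tenths of a second
    grub_value = seconds       # grub.conf counts in seconds
    return f"timeout = {lilo_value}", f"timeout = {grub_value}"


lilo_line, grub_line = timeout_lines(3)
print(lilo_line)  # prints timeout = 30
print(grub_line)  # prints timeout = 3
```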
Use the dmesg command to display the console startup messages. See the dmesg(8) manual page for more information.
The following example of the dmesg command output shows that a serial expansion card was recognized during startup:
May 22 14:02:10 storage3 kernel: Cyclades driver 2.3.2.5 2000/01/19 14:35:33
May 22 14:02:10 storage3 kernel: built May 8 2000 12:40:12
May 22 14:02:10 storage3 kernel: Cyclom-Y/PCI #1: 0xd0002000-0xd0005fff, IRQ9, 4 channels starting from port 0.
The following example of the dmesg command output shows that two external SCSI buses and nine disks were detected on the system (note that lines ending with a backslash are continued on the next line here, but will be printed as one line on most screens):
May 22 14:02:10 storage3 kernel: scsi0 : Adaptec AHA274x/284x/294x \
(EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
May 22 14:02:10 storage3 kernel:
May 22 14:02:10 storage3 kernel: scsi1 : Adaptec AHA274x/284x/294x \
(EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
May 22 14:02:10 storage3 kernel:
May 22 14:02:10 storage3 kernel: scsi : 2 hosts.
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST39236LW Rev: 0004
May 22 14:02:11 storage3 kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdc at scsi1, channel 0, id 1, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdd at scsi1, channel 0, id 2, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sde at scsi1, channel 0, id 3, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdf at scsi1, channel 0, id 8, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdg at scsi1, channel 0, id 9, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdh at scsi1, channel 0, id 10, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdi at scsi1, channel 0, id 11, lun 0
May 22 14:02:11 storage3 kernel: Vendor: Dell Model: 8 BAY U2W CU Rev: 0205
May 22 14:02:11 storage3 kernel: Type: Processor \
ANSI SCSI revision: 03
May 22 14:02:11 storage3 kernel: scsi1 : channel 0 target 15 lun 1 request sense \
failed, performing reset.
May 22 14:02:11 storage3 kernel: SCSI bus is being reset for host 1 channel 0.
May 22 14:02:11 storage3 kernel: scsi : detected 9 SCSI disks total.
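When comparing the two cluster systems, or checking that disks are detected in the same order after a reboot, it can help to reduce the startup messages to a list of disks. The Python sketch below assumes the message wording shown in the example ("Detected scsi disk ... at scsi..."). Other kernel versions may word these messages differently. The function name is an assumption.

```python
import re

DISK_RE = re.compile(
    r"Detected scsi disk (\w+) at scsi(\d+), channel (\d+), id (\d+), lun (\d+)"
)


def detected_disks(dmesg_text):
    """Map each detected disk name to its (host, channel, id, lun) tuple."""
    disks = {}
    for match in DISK_RE.finditer(dmesg_text):
        name, host, channel, scsi_id, lun = match.groups()
        disks[name] = (int(host), int(channel), int(scsi_id), int(lun))
    return disks


# Two lines taken from the example output above.
SAMPLE = """\
May 22 14:02:11 storage3 kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
May 22 14:02:11 storage3 kernel: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0
"""

for name, location in detected_disks(SAMPLE).items():
    print(name, location)
```

Running the same function on the output saved from each boot and comparing the results shows whether a device name has moved to a different bus or ID.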
The following example of the dmesg command output shows that a quad Ethernet card was detected on the system:
May 22 14:02:11 storage3 kernel: 3c59x.c:v0.99H 11/17/98 Donald Becker
May 22 14:02:11 storage3 kernel: tulip.c:v0.91g-ppc 7/16/99 becker@cesdis.gsfc.nasa.gov
May 22 14:02:11 storage3 kernel: eth0: Digital DS21140 Tulip rev 34 at 0x9800, \
00:00:BC:11:76:93, IRQ 5.
May 22 14:02:12 storage3 kernel: eth1: Digital DS21140 Tulip rev 34 at 0x9400, \
00:00:BC:11:76:92, IRQ 9.
May 22 14:02:12 storage3 kernel: eth2: Digital DS21140 Tulip rev 34 at 0x9000, \
00:00:BC:11:76:91, IRQ 11.
May 22 14:02:12 storage3 kernel: eth3: Digital DS21140 Tulip rev 34 at 0x8800, \
00:00:BC:11:76:90, IRQ 10.
To be sure that the installed devices, including serial and network interfaces, are configured in the kernel, use the cat /proc/devices command on each cluster system. Use this command to also determine if there is raw device support installed on the system. For example:
# cat /proc/devices
Character devices:
1 mem
2 pty
3 ttyp
4 ttyS
5 cua
7 vcs
10 misc
19 ttyC
20 cub
128 ptm
136 pts
162 raw
Block devices:
2 fd
3 ide0
8 sd
65 sd
#
The previous example shows:
Onboard serial ports (ttyS)
Serial expansion card (ttyC)
Raw devices (raw)
SCSI devices (sd)
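The same check can be scripted. The Python sketch below parses text in the /proc/devices format into character and block device tables and then looks for the entries listed above. The function name is an assumption, and the sample is an excerpt of the example output rather than a reading from a live system.

```python
def parse_proc_devices(text):
    """Return {'Character': {name: [majors]}, 'Block': {name: [majors]}}."""
    devices = {"Character": {}, "Block": {}}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.endswith("devices:"):
            section = line.split()[0]  # "Character" or "Block"
        elif line and section:
            major, name = line.split(None, 1)
            devices[section].setdefault(name, []).append(int(major))
    return devices


# Excerpt of the example output above.
SAMPLE = """\
Character devices:
4 ttyS
19 ttyC
162 raw
Block devices:
8 sd
65 sd
"""

devices = parse_proc_devices(SAMPLE)
print("raw device support:", "raw" in devices["Character"])
print("sd major numbers:", devices["Block"]["sd"])
```

On a running system, the text can be obtained with open("/proc/devices").read() instead of the sample string.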