Rocks v5.2 Frontend Install And Configuration

First, install your Frontend by following the directions in the Rocks users guide; make sure you include the service pack roll, as it's required and fixes some issues. Once that's done you'll have a basic frontend installed, but a few custom things have to be configured before our Rocks build will work on it.

(As of this writing, the current Frontend, umrocks, was installed with SL5.3 and only the basic rolls, plus the service packs. JCWR 2009/8/13)

NFS share /home/install/

Firstly, we want /home/install to be the location of our Rocks dist, because that's how we've written everything since Rocks v4.3. With the upgrade to Rocks v5.2, however, some things changed, including where the Rocks build resides: what used to be "/export/home/install/" is now "/export/rocks/install". That would break almost all of our custom scripts, but no matter, it can be fixed.

Rocks v5.2 has an NFS directory that's shared with all the nodes, /share/apps (where you're supposed to put your custom scripts as of 5.2); we're going to change what's being shared. This is basically done by following the directions on custom configuration in the Rocks user guide, but instead of adding lines we're just changing the ones for /share/apps to these new ones.

Change /etc/exports to look like:
/export 10.10.0.0/255.255.254.0(rw,async) \
        10.1.0.0/255.255.254.0(rw,async)

Then restart nfs:
/etc/rc.d/init.d/nfs restart

Then edit /etc/auto.home so that it has the line:
install [frontend private ip]:/export/rocks/&

Then inform 411 of the change:
make -C /var/411

Now whenever you change directory to /home/install it will be automounted, even on the compute nodes.
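If you want a quick sanity check that the automount is working, something like this should show the NFS mount appearing on demand:

cd /home/install
df -h .                 # should show [frontend private ip]:/export/rocks/install
mount | grep install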

DNS server

Secondly, we want to get our DNS server working correctly, with Rocks generating config files that also include the machines that aren't Rocks nodes. The setup is almost entirely the same as it was for Rocks 4.3 via the RocksDNS post.

First, copy over all of your old files into the /home/install/ directory; we'll especially need the extras/ directory.
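For example, assuming the old frontend is still reachable and its build lived under /export/home/install/ (the hostname below is just a placeholder):

scp -rp root@old-frontend:/export/home/install/extras /export/rocks/install/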

Here's the working config:

The /etc/resolv.conf should look like (at U of M):
domain aglt2.org
nameserver 10.10.1.3
nameserver 10.10.2.15
nameserver 192.84.86.88
nameserver 198.108.1.42
nameserver 141.211.125.17
search local aglt2.org grid.umich.edu physics.lsa.umich.edu ultralight.org
nameserver 127.0.0.1

Change the file /etc/sysconfig/named by adding the line:
OPTIONS="-c /etc/named.conf.aglt2"

Add the marked line to /etc/rc.d/init.d/named (extra lines for context):
# Source networking configuration.
[ -r /etc/sysconfig/network ] && . /etc/sysconfig/network

[ -r /etc/sysconfig/named ] && . /etc/sysconfig/named

export KRB5_KTNAME=${KEYTAB_FILE:-/etc/named.keytab}


# Source aglt2.org customization
. /export/rocks/install/extras/make-named-conf.sh       <----- ADD THIS LINE!

# Don't kill named during clean-up
NAMED_SHUTDOWN_TIMEOUT=${NAMED_SHUTDOWN_TIMEOUT:-100}

if [ -n "$ROOTDIR" ]; then
   ROOTDIR=`echo $ROOTDIR | sed 's#//*#/#g;s#/$##'`;
   rdl=`/usr/bin/readlink $ROOTDIR`;
   if [ -n "$rdl" ]; then
      ROOTDIR="$rdl";
   fi;
fi

Then just make sure you brought over our extras folder and placed it in the /home/install/ directory, because we're going to rely on these files:
  • /home/install/extras/named.conf-append
  • /home/install/extras/make-named-conf.sh
  • /home/install/extras/pull-msu-dns.sh

Incidentally, if you've misplaced or lost said files, here's what they look like:

/home/install/extras/named.conf-append:
zone "1.1.10.in-addr.arpa" {
        type master;
        notify no;
        file "reverse.rocks.domain.1.1.10.local";
};

zone "2.10.10.in-addr.arpa" {
        type slave;
        file "slaves/reverse.rocks.domain.2.10.10";
        masters { 10.10.2.15; };
};

zone "3.10.10.in-addr.arpa" {
        type slave;
        file "slaves/reverse.rocks.domain.3.10.10";
        masters { 10.10.2.15; };
};

/home/install/extras/make-named-conf.sh:
#!/bin/bash
# Rebuild the aglt2 named config from the Rocks-generated one,
# then refresh the local zone data pulled from MSU
cp -f /etc/named.conf /etc/named.conf.aglt2
cat /export/rocks/install/extras/named.conf-append >> /etc/named.conf.aglt2
/export/rocks/install/extras/pull-msu-dns.sh

/home/install/extras/pull-msu-dns.sh:
#!/bin/bash

# Pull non-Rocks host records from the MSU DNS server and build a
# local zone include for named.

OUTPUT="/var/named/rocks.domain.local"
APPEND="/var/named/notrocks.domain.local"
TEMP="/var/named/msu-zone-transfer.tmp"

echo -e ";;entries by /export/rocks/install/extras/pull-msu-dns.sh \n" \
> $OUTPUT

# Zone transfer of the "local" domain from the MSU nameserver
dig @10.10.2.15 local AXFR > $TEMP

# dig reports failures in its output rather than its exit status,
# so grep for them
grep -q failed $TEMP
FAILED=$?
grep -q 'connection timed out' $TEMP
TIMEOUT=$?

if [ $FAILED -eq 0 ]
then
        echo -e "DNS transfer failed \n"
fi

if [ $TIMEOUT -eq 0 ]
then
        echo -e "Connection to DNS server timed out \n"
fi

# Keep only the A records on the 10.10.2.x and 10.10.3.x subnets
# (dots escaped so they aren't treated as regex wildcards)
awk '($4 == "A") && ($5 ~ /10\.10\.[23]\./) {print}' $TEMP \
>> $OUTPUT

# Tack on any hand-maintained non-Rocks entries
if [ -e $APPEND ]
then
        cat $APPEND >> $OUTPUT
fi

Now just run rocks sync config to have it regenerate the dhcpd.conf file as well as the domain files.
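For example (the zone file name below is the one our pull-msu-dns.sh writes; adjust if yours differs):

rocks sync config
ls -l /etc/dhcpd.conf /var/named/rocks.domain.local   # timestamps should be fresh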

Routing to Public Network

By default all the nodes want to route to the public net (aglt2) through umrocks. We don't want this, even though it is one of the main tenets of Rocks doctrine. I believe when I originally set up umrocks the reason this didn't work was that some names had changed in the Rocks node XML tree, but since you're already on 5.2 this should hopefully work by default. If it doesn't, change the default route on the Frontend to point at gw.aglt2.org, run rocks sync network, and hopefully that should be it.
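If the route does need fixing by hand, this is the sort of thing to do (use the real address of gw.aglt2.org in place of the placeholder):

netstat -rn                      # check the current default route
route del default
route add default gw <ip-of-gw.aglt2.org>
rocks sync network
# To make the route survive a reboot, set GATEWAY= in /etc/sysconfig/network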

User Login to Nodes

By default the nodes won't allow users to ssh in; not because of AFS issues, though that's a possibility, but most certainly because the users don't exist there. To solve this, just add all the requisite users to the Frontend and run rocks sync users; Rocks will then distribute the users to the nodes and it should be possible to ssh into them.
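A minimal sketch, with jsmith standing in for a real account:

useradd jsmith
passwd jsmith
rocks sync users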

Updating initrd

There is a bug in the initrd that ships with the stock kernel, so corrected rpms must be installed. These come from the 'sl-fastbugs' repo. Six rpms were pulled in during a "yum update" on umrocks, and three were added to the Rocks build distro as well (a sample yum invocation follows the list). These were:

mkinitrd-devel-5.1.19.6-44.1.i386    (not added)
mkinitrd-devel-5.1.19.6-44.1.x86_64  (not added)
libbdevid-python-5.1.19.6-44.1.x86_64 (not added)
mkinitrd-5.1.19.6-44.1.i386.rpm
mkinitrd-5.1.19.6-44.1.x86_64.rpm
nash-5.1.19.6-44.1.x86_64.rpm
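Something along these lines should pull them in, assuming the sl-fastbugs repo is already configured on the frontend:

yum --enablerepo=sl-fastbugs update mkinitrd nash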

Other Changes

The Xen network interface appears to be starting up by default. Turn off the service:
chkconfig libvirtd off

The interface shows up looking like this:
virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:468 (468.0 b)

We found repeating messages in the /var/log/messages file; following up on the Rocks list turned up the solution to this issue. The message was:
error: got a metric with a strange type (22)

> I presume greceptor is still logging verbosely and we're seeing it 
> because we're saving all logs. I was hoping the patch in 5.2.1 modified 
> greceptor configuration, not just ignoring the syslog warnings.

The problem seems to be addressed by commenting out line 205

   self.app.warning(errmsg)

in /opt/rocks/lib/python2.4/site-packages/gmon/reporter.py as described here:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2009-September/042759.html
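If you'd rather script that edit than do it by hand, a sketch (it keeps a backup, and the pass keeps the enclosing block valid in case the warning is its only statement):

cp -p /opt/rocks/lib/python2.4/site-packages/gmon/reporter.py{,.orig}
sed -i '205s/self\.app\.warning(errmsg)/pass  # self.app.warning(errmsg)/' \
    /opt/rocks/lib/python2.4/site-packages/gmon/reporter.py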

Thanks!
Lew

-- JamesWright - 13 Aug 2009