Installation of OSG 0.6.0 on gate01.aglt2.org

The installation procedure for OSG 0.6.0 on gate01.aglt2.org is below. It was installed on April 2nd, 2007. Please refer to the OSG CE Installation Twiki for details about the procedure.

Setup and Preparation

First the gate01.aglt2.org node was updated via yum and the network setup was converted to use aglt2.org instead of the original grid.umich.edu.

A set of DOE Grids certificates were obtained for:
  • Host: gate01.aglt2.org in /etc/grid-security
  • LDAP: ldap/gate01.aglt2.org in /etc/grid-security/ldap
  • HTTP: http/gate01.aglt2.org in /etc/grid-security/http

The script that was used to install VDT160 was updated on gate01.aglt2.org. It is in /root/install_osg.sh (original script was called install_vdt.sh):

#!/bin/bash
#
# Make sure we setup/install OK in AFS space for 32 bit
#
#

export VDT_PRETEND_32=1
# To try to preserve some settings from a prior install set this...
#export OLD_VDT_LOCATION=/afs/atlas.umich.edu/OSG

# Use existing CONDOR install
export VDTSETUP_CONDOR_LOCATION=/opt/condor
export VDTSETUP_CONDOR_CONFIG=$VDTSETUP_CONDOR_LOCATION/etc/condor_config

# Setup Pacman
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.19.tar.gz
tar --no-same-owner -xzvf pacman-3.19.tar.gz

cd /opt/pacman-3.19

source setup.sh

# Make sure we have AFS admin tokens
token=`tokens | grep "AFS ID" | awk '{print $4}' | awk -F\) '{print $1}'`

if [ "${token:-0}" -ne 1 ]
then
    kinit admin
    aklog
fi

# Create volume for install
vos create linat08.grid.umich.edu /vicepg OSG060 10000000 -verbose
fs mkmount /afs/.atlas.umich.edu/OSG060 OSG060 -rw
vos release root.cell
vos release root.afs
fs checkvolumes


# Goto install directory
cd /afs/atlas.umich.edu/OSG060


# Set AFS ACLs to allow full access to everything during install
fs setacl /afs/atlas.umich.edu/OSG060 system:anyuser rliwd

# set umask
umask 0022

# Do installation and answer questions
pacman -get OSG:ce

echo " Finished OSG:ce install "
echo " "
echo " "

# Done?  Reset ACLs
echo "Must reset ACLs on AFS install area..."
find /afs/atlas.umich.edu/OSG060 -type d -exec /usr/bin/fs setacl {} system:anyuser rl \;

# Setup environment
cd /afs/atlas.umich.edu/OSG060
source setup.sh

# Get Condor-Setup
echo " "
echo " Installing OSG:Globus-Condor-Setup "
echo " "
pacman -get OSG:Globus-Condor-Setup

# Setup managed fork
echo " "
echo " Installing OSG:ManagedFork "
echo " "
pacman -get OSG:ManagedFork
echo " "
echo " Done with Pacman installs "
echo " "

# Configure default jobmanager to be managed fork
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y

# Protect certificates so services can get them...
chown -R daemon.daemon /etc/grid-security/ldap
chown -R daemon.daemon /etc/grid-security/http

Script was started around 1:35 PM. Answered with defaults.

It finished successfully (no errors) around 2:38 PM.
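
The AFS token check in the script depends on the exact column layout of the `tokens` output. A slightly more defensive variant is sketched below; it isolates the parsing in a function and reports ID 0 when no token is held. The function name `afs_id_from_tokens` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch: extract the numeric AFS ID from `tokens` output read on stdin.
# Prints 0 when no "(AFS ID n)" text is present, i.e. no token is held.
afs_id_from_tokens() {
    local id
    id=$(sed -n 's/.*(AFS ID \([0-9][0-9]*\)).*/\1/p' | head -n 1)
    echo "${id:-0}"
}

# Intended use on a host with the AFS client tools installed:
#   if [ "$(tokens | afs_id_from_tokens)" -ne 1 ]; then
#       kinit admin && aklog
#   fi
```

Keeping the parsing separate from the `kinit`/`aklog` calls means the comparison always sees a number, even when `tokens` prints nothing useful.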

Post-Installation Work

There are a number of configuration and verification steps needed after the installation.

We would like to run the new managed fork capability so this will need to be installed and configured (see below).

All needed certificates (host, ldap and http) were already obtained (see above). One further setup step was noted during the installation:
Before you can request user, host or service certificates with:
  /afs/atlas.umich.edu/OSG060/globus/bin/grid-cert-request

you must first configure your GSI settings with 
  /afs/atlas.umich.edu/OSG060/vdt/setup/setup-cert-request

Running this gives:
[gate01:grid-security]# source /afs/atlas.umich.edu/OSG060/setup.sh
[gate01:grid-security]# /afs/atlas.umich.edu/OSG060/vdt/setup/setup-cert-request 
Reading from /afs/atlas.umich.edu/OSG060/globus/TRUSTED_CA
Using hash: 1c3f2ca8
Setting up grid-cert-request
Running grid-security-config...

Before you use the Grid Security Infrastructure, you should first
define the DN (distinguished name) that should be used for your
organization's X509 certificates.  If you do not define a DN,
a default DN will be assigned to you.

For some questions, a default response is given in [].
Pressing RETURN in response to such a question will enable the default.
This script will overwrite the file --

     /afs/atlas.umich.edu/OSG060/globus/etc/grid-security.conf


========================================================================

(1) Base DN for user certificates
         [ OU=People,DC=doegrids,DC=org ] 
(2) Base DN for host certificates
         [ OU=Services,DC=doegrids,DC=org ] 

========================================================================
(q) save, configure the GSI and Quit
(c) Cancel (exit without saving or configuring)
(h) Help
========================================================================

q
Successfully created cert request configuration files in:
/afs/atlas.umich.edu/OSG060/globus/etc

I then set up the ManagedFork limits by editing /opt/condor/etc/condor_config.local and adding:
# ManagedFork limit
START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 20 || GridMonitorJob =?= TRUE
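
If this edit is ever scripted (for example when preparing another gatekeeper), an idempotent append avoids duplicating the setting on re-runs. The following is only a sketch: `ensure_line` is a hypothetical helper, and the demo writes to a scratch file rather than to /opt/condor/etc/condor_config.local.

```shell
#!/bin/bash
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    local file=$1 line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file standing in for condor_config.local.
cfg=$(mktemp)
setting='START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 20 || GridMonitorJob =?= TRUE'
ensure_line "$cfg" '# ManagedFork limit'
ensure_line "$cfg" "$setting"
ensure_line "$cfg" "$setting"   # second call is a no-op
cat "$cfg"
```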

Then I noticed that /etc/grid-security/certificates was still pointing to the old VDT160 install location. I reset it to point to the OSG060 one:
ln -s /afs/atlas.umich.edu/OSG060/globus/share/certificates /etc/grid-security/certificates
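
One caveat when re-pointing a link like this: if the name already exists as a soft-link to a directory, a plain `ln -s` creates the new link inside the old target instead of replacing it. The sketch below shows the replace-in-place form in a scratch directory, assuming an `ln` that supports `-n` (GNU and BSD both do); the paths under `$work` are stand-ins for the real locations.

```shell
#!/bin/bash
# Demo in a scratch directory: re-point an existing soft-link at a new target.
work=$(mktemp -d)
mkdir -p "$work/VDT160/certificates" "$work/OSG060/certificates"
ln -s "$work/VDT160/certificates" "$work/certificates"   # the old link

# -f replaces the existing link; -n stops ln from following the old link
# into the directory it points to and creating the new link inside it.
ln -sfn "$work/OSG060/certificates" "$work/certificates"

readlink "$work/certificates"
```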

The instructions say to check the /etc/xinetd.d files (which were old). But it seems you need to run vdt-control first:

[gate01:OSG060]# which vdt-control
/afs/atlas.umich.edu/OSG060/vdt/sbin/vdt-control
[gate01:OSG060]# vdt-control --on
enabling cron service fetch-crl... ok
enabling cron service vdt-rotate-logs... ok
skipping init service 'gris' -- marked as disabled
enabling inetd service globus-gatekeeper... FAILED! (see vdt-install.log)
    found conflicting, non-VDT entry for service globus-gatekeeper in /etc/services;
    use the --force option to remove the entry
enabling inetd service gsiftp... FAILED! (see vdt-install.log)
    found conflicting, non-VDT entry for service gsiftp in /etc/services;
    use the --force option to remove the entry
enabling init service mysql... FAILED! (see vdt-install.log)
    conflicting, non-VDT file: /etc/rc.d/init.d/mysql
    use the --force option to backup and overwrite
enabling init service globus-ws... FAILED! (see vdt-install.log)
    conflicting, non-VDT file: /etc/rc.d/init.d/globus-ws
    use the --force option to backup and overwrite
skipping cron service 'edg-mkgridmap' -- marked as disabled
enabling cron service gums-host-cron... ok
skipping init service 'MLD' -- marked as disabled
enabling init service apache... FAILED! (see vdt-install.log)
    conflicting, non-VDT file: /etc/rc.d/init.d/apache
    use the --force option to backup and overwrite
enabling init service tomcat-5... FAILED! (see vdt-install.log)
    conflicting, non-VDT file: /etc/rc.d/init.d/tomcat-5
    use the --force option to backup and overwrite
enabling cron service gratia-condor... ok

So it seems there are some issues to resolve. I first edited my 'root' crontab to remove the old VDT160 entries. I ended up redoing the command as vdt-control --on --force which backs up and overwrites all services.
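
To get a quick list of the services that need attention, the vdt-control output can be saved to a file and filtered. This is a sketch that assumes the output format shown above; `failed_services` is a hypothetical helper name.

```shell
#!/bin/bash
# Print the name of every service whose "enabling ..." line ended in FAILED.
# Reads captured `vdt-control --on` output on stdin.
failed_services() {
    sed -n 's/^enabling [a-z]* service \(.*\)\.\.\. FAILED.*/\1/p'
}

# Demo with a few lines copied from the output above.
failed_services <<'EOF'
enabling cron service fetch-crl... ok
enabling inetd service globus-gatekeeper... FAILED! (see vdt-install.log)
    found conflicting, non-VDT entry for service globus-gatekeeper in /etc/services;
enabling init service mysql... FAILED! (see vdt-install.log)
skipping init service 'MLD' -- marked as disabled
EOF
```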

I turned MLD back on via: vdt-register-service --name MLD --enable

The errors from vdt-control are because of AFS issues. See the section below on how we resolved this.

Installation on AFS Issues

Since we are installing in AFS we have to fix some issues. The rationale for using AFS is that other clients can utilize this installation. However, because the files are in AFS, we are prevented (without getting appropriate tokens) from writing files, including log files. Additionally, if multiple clients were to use this installation, they would each require their own config/setup information.

To make this setup functional requires us to locate all files (or even whole directories) which have "log"-type files or configuration information and soft-link them to a local (writeable) filesystem. I chose /opt/OSG060 as the base of the soft-link area. We need to identify every file/directory which must be written OR contains configuration information and:
  • Make a copy of this file or directory into /opt/OSG060/...
  • Rename the original in AFS to <filename>.orig
  • Create a soft-link in AFS to the new location /opt/OSG060/...
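
The three steps can be condensed into a single function, sketched below. This is an illustration and not the script actually used (that one is listed in "The scripts" section); `relocate_path` and the scratch directories are hypothetical, and the `.orig` suffix follows the scripts on this page.

```shell
#!/bin/bash
# relocate_path AFS_ROOT LOCAL_ROOT REL_PATH
# Copy REL_PATH from AFS_ROOT to LOCAL_ROOT, keep the original as .orig,
# and leave a soft-link in its place. Works for files and directories.
relocate_path() {
    local afs_root=$1 local_root=$2 rel=$3
    local src="$afs_root/$rel" dest="$local_root/$rel"

    [ -L "$src" ] && return 0            # already relocated
    [ -e "$src" ] || return 1            # nothing to relocate

    mkdir -p "$(dirname "$dest")"
    cp -a "$src" "$dest"                 # 1) copy to the local area
    mv "$src" "$src.orig"                # 2) rename the original
    ln -s "$dest" "$src"                 # 3) link back to the local copy
}

# Demo in scratch directories standing in for AFS and /opt/OSG060.
afs=$(mktemp -d); loc=$(mktemp -d)
mkdir -p "$afs/globus/etc"
echo "demo" > "$afs/globus/etc/gridftp.conf"
relocate_path "$afs" "$loc" globus/etc/gridftp.conf
ls -l "$afs/globus/etc"
```

The early return on an existing soft-link makes the function safe to call twice on the same path.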

Note that this procedure must be done for the FIRST installation. Succeeding installations need to copy (and edit) the files in <filename>.orig to the correct name in their local /opt/OSG060 area so the existing soft-links in AFS point to the correct location. Any files that hold host-specific configuration must of course also be edited for the new host.
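
For a succeeding installation, the copy step might be sketched as below. This is an assumption-laden illustration, not a tested procedure: `populate_local` is a hypothetical name, the suffix is taken to be `.orig` as in the scripts on this page, and only names that are already soft-links in AFS are handled, because the install also ships a few unrelated `.orig` files.

```shell
#!/bin/bash
# populate_local AFS_ROOT LOCAL_ROOT
# For every <name>.orig under AFS_ROOT whose <name> is a soft-link, copy the
# .orig to LOCAL_ROOT/<name> unless a local copy already exists.
populate_local() {
    local afs_root=$1 local_root=$2 orig rel dest
    # -prune keeps find from descending into relocated *.orig directories.
    find "$afs_root" -name '*.orig' -prune -print | while IFS= read -r orig; do
        [ -L "${orig%.orig}" ] || continue      # not a relocated path
        rel=${orig#"$afs_root"/}
        dest="$local_root/${rel%.orig}"
        [ -e "$dest" ] && continue              # keep existing local copy
        mkdir -p "$(dirname "$dest")"
        cp -a "$orig" "$dest"
        echo "restored $dest (review host-specific settings)"
    done
}

# Demo in scratch directories standing in for AFS and /opt/OSG060.
afs=$(mktemp -d); loc=$(mktemp -d)
mkdir -p "$afs/globus/etc"
echo "demo" > "$afs/globus/etc/gridftp.conf.orig"
ln -s "$loc/globus/etc/gridftp.conf" "$afs/globus/etc/gridftp.conf"
populate_local "$afs" "$loc"
```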

First attempt to locate all needed files for relocation:
[gate01:globus]# pwd
/afs/atlas.umich.edu/OSG060/globus
[gate01:globus]# cd ..
[gate01:OSG060]# ffind .log
./o..pacman..o/logs/pacman.log
./o..pacman..o/logs/wget.log
./o..pacman..o/logs/shellout.log
./vdt-install.log
./vdt/etc/vdt.logrotate
./vdt/backup/vdt/vdt/etc/vdt.logrotate_001_20070402-214336
./vdt/backup/vdt/vdt/etc/vdt.logrotate_002_20070402-214525
./vdt/backup/vdt/vdt/etc/vdt.logrotate_003_20070402-214550
./vdt/backup/vdt/vdt/etc/vdt.logrotate_004_20070402-221532
./vdt/backup/vdt/vdt/etc/vdt.logrotate_005_20070402-221758
./vdt/backup/vdt/vdt/etc/vdt.logrotate_006_20070402-223331
./vdt/backup/vdt/vdt/etc/vdt.logrotate_007_20070405-073139
./globus/var/globus-fork.log
./globus/var/log/gridftp.log
./globus/var/log/gridftp-auth.log
./globus/var/gridftp.log
./globus/var/globus-condor.log
./globus/var/globus-gatekeeper.log
./globus/var/accounting.log
./globus/setup/globus/config.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testdata/gridftp.log.SAVE
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-SGE.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-FBS.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-LSF.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-CONDOR.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-PBS.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VOgsiftpIO.log
./apache/logs/mod_jk.log
./gratia/var/logs/gratia-probe-condor.log
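
`ffind` is not a standard command and its definition is not shown on this page. Judging from the matches above (for example vdt.logrotate and gridftp.log.SAVE), it behaves like a substring match on file names. Under that assumption, a rough equivalent using plain find would be:

```shell
#!/bin/bash
# Assumed equivalent of the local `ffind` helper: list entries below the
# current directory whose name contains the given string.
ffind() {
    find . -name "*$1*" -print
}

# Demo in a scratch directory with made-up file names.
demo=$(mktemp -d)
mkdir -p "$demo/vdt/etc" "$demo/globus/var"
touch "$demo/vdt/etc/vdt.logrotate" "$demo/globus/var/accounting.log" "$demo/globus/var/README"
(cd "$demo" && ffind .log | sort)
```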

We also need to find all configuration files:

[gate01:OSG060]# ffind ".conf"
./vdt/etc/package_data/Fetch-CRL.configdiff
./vdt/etc/package_data/Globus-Base-Info-Server.configdiff
./vdt/etc/package_data/Globus-Base-RM-Server.configdiff
./vdt/etc/package_data/Globus-Base-Data-Server.configdiff
./vdt/etc/package_data/Globus-Base-WS-Essentials.configdiff
./vdt/etc/package_data/Globus-Base-RFT-Server.configdiff
./vdt/etc/package_data/Globus-Base-WSGRAM-Server.configdiff
./vdt/etc/package_data/GUMS-Client.configdiff
./vdt/etc/package_data/Job-Environment.configdiff
./vdt/etc/package_data/MonaLisa.configdiff
./vdt/etc/package_data/Apache.configdiff
./vdt/etc/package_data/Tomcat-5.configdiff
./vdt/etc/package_data/CEMon.configdiff
./vdt/etc/package_data/Gratia.configdiff
./vdt/backup/vdt/globus/etc/globus-job-manager.conf_001_20070402-214525
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_001_20070402-214525
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_002_20070402-223331
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_003_20070405-073138
./vdt/backup/vdt/apache/conf/httpd.conf_001_20070402-221531
./vdt/backup/vdt/apache/conf/extra/httpd-ssl.conf_001_20070402-221531
./vdt/backup/vdt/apache/conf/httpd.conf_002_20070402-221758
./vdt/backup/vdt/apache/conf/httpd.conf_003_20070402-221826
./monitoring/osg-attributes.conf
./monitoring/grid3-info.conf
./globus/etc/grid3-info.conf
./globus/etc/osg-attributes.conf
./globus/etc/openldap/ldap.conf
./globus/etc/openldap/ldap.conf.default
./globus/etc/openldap/ldapfilter.conf
./globus/etc/openldap/ldapfilter.conf.default
./globus/etc/openldap/ldaptemplates.conf
./globus/etc/openldap/ldaptemplates.conf.default
./globus/etc/openldap/ldapsearchprefs.conf
./globus/etc/openldap/ldapsearchprefs.conf.default
./globus/etc/openldap/slapd.conf.default
./globus/etc/openldap/slapd.conf
./globus/etc/grid-info.conf
./globus/etc/grid-info-resource-ldif.conf
./globus/etc/grid-info-resource-register.conf
./globus/etc/gridftp-resource.conf
./globus/etc/grid-info-slapd.conf
./globus/etc/grid-info-site-giis.conf
./globus/etc/grid-info-site-policy.conf
./globus/etc/grid-info-server-env.conf
./globus/etc/grid-info-deployment-comments.conf
./globus/etc/globus-gatekeeper.conf
./globus/etc/globus-fork.conf
./globus/etc/globus-job-manager.conf
./globus/etc/gridftp.conf
./globus/etc/globus_wsrf_test_unit/local-config-authz-test.conf
./globus/etc/globus_gram_local_proxy_tool.conf
./globus/etc/globus-condor.conf
./globus/etc/grid-security.conf
./globus/etc/globus-user-ssl.conf
./globus/etc/globus-host-ssl.conf
./globus/share/certificates/doegrids/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids/grid-security.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/grid-security.conf.1c3f2ca8
./globus/share/certificates/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/grid-security.conf.1c3f2ca8
./globus/share/myproxy/myproxy-server.config
./globus/share/myproxy/etc.inetd.conf.modifications
./globus/man/man5/ldap.conf.5
./globus/man/man5/ldapfilter.conf.5
./globus/man/man5/ldapsearchprefs.conf.5
./globus/man/man5/ldaptemplates.conf.5
./globus/man/man5/slapd.conf.5
./globus/man/man5/ud.conf.5
./globus/man/man5/myproxy-server.config.5
./globus/setup/globus/globus_gaa.conf
./globus/setup/globus/globus_gaa_custom.conf
./globus/setup/globus/gsi-gaa.conf.tmpl
./globus/setup/globus/grid-info.conf.in
./globus/setup/globus/grid-info.conf
./globus/setup/globus/grid-info-resource-ldif.conf.in
./globus/setup/globus/grid-info-resource-register.conf.in
./globus/setup/globus/grid-info-slapd.conf.in
./globus/setup/globus/grid-info-site-giis.conf.in
./globus/setup/globus/grid-info-site-policy.conf.in
./globus/setup/globus/grid-info-server-env.conf.in
./globus/setup/globus/gridftp-resource.conf.in
./globus/setup/globus/grid-info-deployment-comments.conf
./globus/setup/globus/grid-info-resource-ldif.conf
./globus/setup/globus/grid-info-resource-register.conf
./globus/setup/globus/grid-info-slapd.conf
./globus/setup/globus/grid-info-server-env.conf
./globus/setup/globus/gridftp-resource.conf
./globus/setup/globus/grid-info-site-giis.conf
./globus/setup/globus/grid-info-site-policy.conf
./post-install/gsi-authz.conf
./post-install/prima-authz.conf
./gpt/etc/gpt/globus_flavor_labels.conf
./lcg/etc/add-attributes.conf.example
./lcg/etc/alter-attributes.conf.example
./edg/etc/edg-mkgridmap.conf
./edg/etc/edg-mkgridmap.conf.orig
./edg/share/doc/edg-mkgridmap-conf-2.8.1/html/edg-mkgridmap.conf.html
./edg/share/man/man5/edg-mkgridmap.conf.5.gz
./glite/etc/glite-ce-ce-plugin/lcg-info-generic.conf.example.lsf
./glite/etc/glite-ce-ce-plugin/lcg-info-generic.conf.example.pbs
./MonaLisa/Service/usr_code/XDRUDP/XDRUDP.conf
./MonaLisa/Service/usr_code/NetFlowModule/NetFlow.config
./MonaLisa/Service/VDTFarm/vdtFarm.conf
./MonaLisa/Service/VDTFarm/db.conf.embedded
./python/lib/python2.3/config/Setup.config
./apache/etc/pear.conf
./apache/conf/original/extra/httpd-userdir.conf
./apache/conf/original/extra/httpd-mpm.conf
./apache/conf/original/extra/httpd-multilang-errordoc.conf
./apache/conf/original/extra/httpd-manual.conf
./apache/conf/original/extra/httpd-ssl.conf
./apache/conf/original/extra/httpd-autoindex.conf
./apache/conf/original/extra/httpd-info.conf
./apache/conf/original/extra/httpd-dav.conf
./apache/conf/original/extra/httpd-vhosts.conf
./apache/conf/original/extra/httpd-languages.conf
./apache/conf/original/extra/httpd-default.conf
./apache/conf/original/httpd.conf
./apache/conf/httpd.conf.bak
./apache/conf/extra/httpd-userdir.conf
./apache/conf/extra/httpd-mpm.conf
./apache/conf/extra/httpd-multilang-errordoc.conf
./apache/conf/extra/httpd-manual.conf
./apache/conf/extra/httpd-ssl.conf
./apache/conf/extra/httpd-autoindex.conf
./apache/conf/extra/httpd-info.conf
./apache/conf/extra/httpd-dav.conf
./apache/conf/extra/httpd-vhosts.conf
./apache/conf/extra/httpd-languages.conf
./apache/conf/extra/httpd-default.conf
./apache/conf/httpd.conf

I created a shell script (bash) for the first-time relocations needed for all such files. It is meant to be run from $VDT_LOCATION only once after you have done the initial install into an AFS location. You need to specify the "redirect" local directory which will host the writeable files and be soft-linked to.

The list of files/directories for OSG 0.6.0 (VDT 1.6.1) is:
[gate01:opt]# ls -R OSG060/
OSG060/:
apache/  gpt/       monitoring/    relocate_osg_logs.log
edg/     gratia/    post-install/  vdt-install.log
globus/  MonaLisa/  relocate

OSG060/apache:
conf/  etc/  logs/

OSG060/apache/conf:
extra/  httpd.conf  original/

OSG060/apache/conf/extra:
httpd-autoindex.conf  httpd-languages.conf           httpd-ssl.conf
httpd-dav.conf        httpd-manual.conf              httpd-userdir.conf
httpd-default.conf    httpd-mpm.conf                 httpd-vhosts.conf
httpd-info.conf       httpd-multilang-errordoc.conf

OSG060/apache/conf/original:
extra/  httpd.conf

OSG060/apache/conf/original/extra:
httpd-autoindex.conf  httpd-languages.conf           httpd-ssl.conf
httpd-dav.conf        httpd-manual.conf              httpd-userdir.conf
httpd-default.conf    httpd-mpm.conf                 httpd-vhosts.conf
httpd-info.conf       httpd-multilang-errordoc.conf

OSG060/apache/etc:
pear.conf

OSG060/apache/logs:
mod_jk.log

OSG060/edg:
etc/

OSG060/edg/etc:
edg-mkgridmap.conf

OSG060/globus:
etc/  setup/  var/

OSG060/globus/etc:
globus-condor.conf                 grid-info-deployment-comments.conf
globus-fork.conf                   grid-info-resource-ldif.conf
globus-gatekeeper.conf             grid-info-resource-register.conf
globus_gram_local_proxy_tool.conf  grid-info-server-env.conf
globus-job-manager.conf            grid-info-site-giis.conf
globus_wsrf_test_unit/             grid-info-site-policy.conf
gridftp.conf                       grid-info-slapd.conf
gridftp-resource.conf              openldap/
grid-info.conf

OSG060/globus/etc/globus_wsrf_test_unit:
local-config-authz-test.conf

OSG060/globus/etc/openldap:
ldap.conf        ldapsearchprefs.conf  slapd.conf
ldapfilter.conf  ldaptemplates.conf

OSG060/globus/setup:
globus/

OSG060/globus/setup/globus:
config.log                          grid-info-resource-ldif.conf
globus_gaa.conf                     grid-info-resource-register.conf
globus_gaa_custom.conf              grid-info-server-env.conf
gridftp-resource.conf               grid-info-site-giis.conf
grid-info.conf                      grid-info-site-policy.conf
grid-info-deployment-comments.conf  grid-info-slapd.conf

OSG060/globus/var:
accounting.log  globus-condor.log  globus-fork.log  globus-gatekeeper.log  log/

OSG060/globus/var/log:
gridftp-auth.log  gridftp.log

OSG060/gpt:
etc/

OSG060/gpt/etc:
gpt/

OSG060/gpt/etc/gpt:
globus_flavor_labels.conf

OSG060/gratia:
var/

OSG060/gratia/var:
logs/

OSG060/gratia/var/logs:
gratia-probe-condor.log

OSG060/MonaLisa:
Service/

OSG060/MonaLisa/Service:
usr_code/  VDTFarm/

OSG060/MonaLisa/Service/usr_code:
VoModules-v0.36/  XDRUDP/

OSG060/MonaLisa/Service/usr_code/VoModules-v0.36:
testlogs/

OSG060/MonaLisa/Service/usr_code/VoModules-v0.36/testlogs:
VOgsiftpIO.log     VoJobs-FBS.log  VoJobs-PBS.log
VoJobs-CONDOR.log  VoJobs-LSF.log  VoJobs-SGE.log

OSG060/MonaLisa/Service/usr_code/XDRUDP:
XDRUDP.conf

OSG060/MonaLisa/Service/VDTFarm:
vdtFarm.conf

OSG060/monitoring:
osg-attributes.conf

OSG060/post-install:
gsi-authz.conf  prima-authz.conf

This was after running the relocate_OSG.sh script. Now I need to try to begin running services and doing the needed post-install configuration. I will likely identify additional files requiring relocation.

OK...the following files should also be relocated: *.pid, *.err, *.lock, *.properties. At this point I am going to go service by service:

MySQL

For the mysql service we need to move the whole var directory into opt:

  • mkdir /opt/OSG060/mysql
  • cp -arv /afs/atlas.umich.edu/OSG060/mysql/var /opt/OSG060/mysql/
  • mv /afs/atlas.umich.edu/OSG060/mysql/var /afs/atlas.umich.edu/OSG060/mysql/var.orig
  • ln -s /opt/OSG060/mysql/var /afs/atlas.umich.edu/OSG060/mysql/var

Then retry startup. Still fails. Set ownership of /opt/OSG060/mysql to mysql: chown -R mysql.osg /opt/OSG060/mysql. Still fails.

The problem is the vdt-app-data directory. We need to relocate it as well:

  • cp -arv /afs/atlas.umich.edu/OSG060/vdt-app-data /opt/OSG060/
  • mv /afs/atlas.umich.edu/OSG060/vdt-app-data /afs/atlas.umich.edu/OSG060/vdt-app-data.orig
  • ln -s /opt/OSG060/vdt-app-data /afs/atlas.umich.edu/OSG060/vdt-app-data
  • chown -R mysql.osg /opt/OSG060/vdt-app-data/mysql

Now MySQL starts OK.

MonALISA

Startup failed because the $VDT_LOCATION/MonaLisa/Service/VDTFarm/ML.log file was missing. The VDTFarm directory should be relocated. We need to "undo" the soft-links for files in this directory and move the whole directory. Once this was done MLD started OK.

We still needed to relocate two more MLD config files in $VDT_LOCATION/MonaLisa/Service/VDTFarm/CMD: site_env and ml_env

Globus (globus-ws)

We need to redirect the whole $VDT_LOCATION/globus/var directory. Then we try vdt-control --off and then --on. The globus-ws still fails to start but all other services seem to be working. The error in ../var/container.log is:
Failed to start container: Container failed to initialize [Caused by: Address already in use]

The scripts

The scripts I used to relocate the initial AFS install are below:

relocate_OSG.sh
#!/bin/bash
#
# This script is meant to be run ONCE after installing OSG into an AFS area
# This script will 
#    1) make sure the user has 'admin' tokens in the AFS cell
#    2) source the OSG setup.sh file
#    3) locate any .log or .conf files and:
#        a) copy them to an equivalent location under a local dir
#        b) rename the original to <name>.orig
#        c) create a new soft-link from the <name> to its local location
#
#  Shawn McKee <smckee@umich.edu>, April 5, 2007
#######################################################

export LOCAL="/opt/OSG060"
export OSG="/afs/atlas.umich.edu/OSG060"

echo " This script is meant to be run ONCE after installing OSG into an"
echo " AFS location."

# Make sure we have AFS admin tokens
token=`tokens | grep "AFS ID" | awk '{print $4}' | awk -F\) '{print $1}'`

if [ "${token:-0}" -ne 1 ]
then
    kinit admin
    aklog
fi

# Goto install directory
cd $OSG

# Make sure "base" local directory exists
mkdir -p $LOCAL

# First lets relocate some whole directories:
#    $VDT_LOCATION/vdt-app-data
#    $VDT_LOCATION/MonaLisa/Service/VDTFarm
#    $VDT_LOCATION/globus/var
#

echo " Finding/relocating .conf files..."
find $VDT_LOCATION  -name "*.conf" -not -type l -not -path "*o..pacman*" -not -path "*.orig*"  -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .log files..."
find $VDT_LOCATION  -name "*.log" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .properties files..."
find $VDT_LOCATION  -name "*.properties" -not -type l -not -path "*o..pacman*" -not -path "*.orig*"  -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .err files..."
find $VDT_LOCATION  -name "*.err" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .lock files..."
find $VDT_LOCATION  -name "*.lock" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .pid files..."
find $VDT_LOCATION  -name "*.pid" -not -type l -not -path "*o..pacman*" -not -path "*.orig*"  -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;

# Find some SPECIFIC MonaLisa files which don't match the pattern
echo " Finding/relocating ml_env files..."
find $VDT_LOCATION  -name "ml_env" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating site_env files..."
find $VDT_LOCATION  -name "site_env" -not -type l -not -path "*o..pacman*" -not -path "*.orig*"  -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;

relocate_file.sh
#!/bin/bash
#
# This script is meant to be run ONCE after installing OSG into an AFS area
# This script will take inputs for
#   1) File to be relocated
#   2) Basedir of OSG install ($VDT_LOCATION)
#   3) Local redirect directory
# and:
#        a) copy file to an equivalent location under a local dir
#        b) rename the original to <name>.orig
#        c) create a new soft-link from the <name> to its local location
#
#  Shawn McKee <smckee@umich.edu>, April 5, 2007
#######################################################

# Inputs
file=$1
basedir=$2
local=$3

# Some manipulations to get correct locations
# (paths are quoted so names containing spaces are handled correctly)
dest="$local${file/$basedir/}"
destdir=$(dirname "$dest")

echo " "
echo " File $file"

#echo "    Mkdir: mkdir -p $destdir"
mkdir -p "$destdir"

#echo "    Copy: cp -a $file $destdir"
cp -a "$file" "$destdir"

newfile="$file.orig"
#echo "    Rename: mv $file $newfile"
mv "$file" "$newfile"

destfile="$destdir/$(basename "$file")"
#echo "    Soft-link: ln -s $destfile $file"
ln -s "$destfile" "$file"
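
The per-extension find calls in relocate_OSG.sh could also be folded into a single pass over the tree. The sketch below only lists candidates and relocates nothing; `list_relocation_candidates` is a hypothetical name and the demo file names are made up.

```shell
#!/bin/bash
# List regular files (not soft-links) matching any of the extensions that
# needed relocation, skipping Pacman's own area and *.orig copies.
list_relocation_candidates() {
    find "$1" -type f \
        \( -name '*.conf' -o -name '*.log' -o -name '*.properties' \
           -o -name '*.err' -o -name '*.lock' -o -name '*.pid' \) \
        -not -path '*o..pacman*' -not -path '*.orig*' -print
}

# Demo in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/tomcat/v5" "$demo/o..pacman..o/logs"
touch "$demo/tomcat/v5/tomcat.pid" "$demo/tomcat/v5/catalina.sh" \
      "$demo/o..pacman..o/logs/pacman.log" "$demo/tomcat/v5/server.conf.orig"
list_relocation_candidates "$demo"
```

Reviewing the list before acting on it is a cheap way to catch files that should stay shared in AFS.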

To enable future installs I made a complete tar-ball of /opt/OSG060/* for deployment on a future host.
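
A sketch of how such a tar-ball can be created and unpacked is shown below, using scratch directories in place of /opt; the archive name OSG060-local.tar.gz is a hypothetical choice.

```shell
#!/bin/bash
# Demo in scratch directories; on the real host the tree would be /opt/OSG060.
src=$(mktemp -d); dst=$(mktemp -d); out=$(mktemp -d)
mkdir -p "$src/OSG060/globus/etc"
echo "demo" > "$src/OSG060/globus/etc/gridftp.conf"

# -p keeps permissions; -C makes the archive paths start at OSG060/.
tar -czpf "$out/OSG060-local.tar.gz" -C "$src" OSG060

# On the new host: unpack under the same parent directory.
tar -xzpf "$out/OSG060-local.tar.gz" -C "$dst"
ls -R "$dst/OSG060"
```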

Post-Install Configuration

After making sure that vdt-control would start all needed services, we also ran gums-host-cron manually: /afs/atlas.umich.edu/OSG060/gums/bin/gums-host-cron

Then we were ready to run the configure-osg.sh script. We answered all questions and it completed successfully.

Next we tried to get authentication working using GUMS and Full Privilege mode. We needed to edit the gums-client.properties file to make sure we used linat02.grid.umich.edu. Next we had to edit gums.config to allow *.aglt2.org to use this GUMS server (and then linat03 and linat04 as well). Then we could successfully generate a gridmapfile.

We then set up gsi-authz.conf to use umfs02. When we tried to run globusrun -a -r gate01.aglt2.org as "Shawn McKee" mapped to usatlas3, it failed with:

TIME: Thu Apr  5 13:32:05 2007
 PID: 11178 -- Notice: 0: GATEKEEPER_ACCT_FD=4 (/afs/atlas.umich.edu/OSG060/globus/var/accounting.log)
TIME: Thu Apr  5 13:32:05 2007
 PID: 11178 -- Notice: 6: Got connection 141.211.43.122 at Thu Apr  5 13:32:05 2007

GSS authentication failure 
GSS Major Status: General failure
GSS Minor Status Error Chain:
accept_sec_context.c:gss_accept_sec_context:396:
Error during delegation: Delegation protocol violation
Failure: GSS failed Major:000d0000 Minor:00000001 Token:00000000

I traced the problem to not having reverse lookups working:

[gate01:bin]# nslookup gate01.aglt2.org
Server:         198.108.1.42
Address:        198.108.1.42#53

Non-authoritative answer:
Name:   gate01.aglt2.org
Address: 192.41.230.11 

That worked but:

[gate01:bin]# nslookup 192.41.230.11
Server:         198.108.1.42
Address:        198.108.1.42#53

** server can't find 11.230.41.192.in-addr.arpa: NXDOMAIN

This didn't.
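
The name in the NXDOMAIN message is the address with its octets reversed under in-addr.arpa, which is the record a PTR lookup queries. The small sketch below builds that name; `ptr_name` is a hypothetical helper.

```shell
#!/bin/bash
# Build the in-addr.arpa name that a reverse (PTR) lookup queries
# for a dotted-quad IPv4 address.
ptr_name() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo "$d.$c.$b.$a.in-addr.arpa"
}

ptr_name 192.41.230.11

# On a host with DNS tools installed, forward and reverse can be compared:
#   host gate01.aglt2.org
#   host 192.41.230.11
```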

After the PTR records were added at Merit, things now work:
[gate02] /afs/atlas.umich.edu/home/smckee > globusrun -a -r gate01.aglt2.org

GRAM Authentication test successful

Then testing as smckee failed in the next step:
[gate02] /afs/atlas.umich.edu/home/smckee > globus-job-run gate01.aglt2.org/jobmanager /usr/bin/id

WARNING: Invalid log file: "/afs/atlas.umich.edu/OSG060/globus/tmp/gram_job_state/gram_condor_log.23951.1175807310" (Permission denied)
GRAM Job failed because the job failed when the job manager attempted to run it (error code 17)

NOTE: The problem is likely that the $GLOBUS_LOCATION/tmp directory needs to also be redirected. Doing that now.

Retesting and it now works:

[gate02] /afs/atlas.umich.edu/home/smckee > globus-job-run gate01.aglt2.org/jobmanager /usr/bin/id
uid=789090(usatlas3) gid=55670(usatlas) groups=55670(usatlas)

I also checked jobmanager-fork and jobmanager-condor and both worked.

The tomcat-5 system is not starting because of another AFS issue: the $VDT_LOCATION/tomcat/v5/logs and ../temp directories must be redirected. Also the /etc/init.d/tomcat-5 script needs to be modified to put its .lock and .pid files in v5/temp rather than v5. Next we had to also redirect the conf and work directories. Next we found that $VDT_LOCATION/lcg/var also needed to be redirected.

Seems to start now.

Site Verify

I ran site-verify as my DN and things seem OK.

===============================================================================
Info: Site verification initiated at Thu Apr  5 22:28:18 2007 GMT.
===============================================================================
-------------------------------------------------------------------------------
------------ Begin gate01.aglt2.org at Thu Apr  5 22:28:18 2007 GMT -----------
-------------------------------------------------------------------------------
Checking prerequisites needed for testing: PASS
Checking for a valid proxy for smckee@gate01.aglt2.org: PASS
Checking if remote host is reachable: PASS
Checking for a running gatekeeper: YES; port 2119
Checking authentication: PASS
Checking 'Hello, World' application: PASS
Checking remote host uptime: PASS
   18:28:24 up 20 days,  5:15,  2 users,  load average: 0.00, 0.04, 0.04
Checking remote Internet network services list: PASS
Checking remote Internet servers database configuration: PASS
Checking for GLOBUS_LOCATION: /afs/atlas.umich.edu/OSG060/globus
Checking expiration date of remote host certificate: Apr  1 16:09:18 2008 GMT
Checking for gatekeeper configuration file: YES
  /afs/atlas.umich.edu/OSG060/globus/etc/globus-gatekeeper.conf
Checking users in grid-mapfile, if none must be using Prima: compbiogrid,des,dosar,engage,fermilab,fmri,gadu,geant4,glow,gpn,grase,gridex,grow,gugrid,ivdgl,ligo,mariachi,mis,nanohub,nwicg,ops,osg,osgedu,sam,samgrid,sdss,star,usatlas3,usatlas4
Checking for remote globus-sh-tools-vars.sh: YES
Checking configured grid services: PASS
  jobmanager,jobmanager-condor,jobmanager-fork,jobmanager-managedfork
Checking for OSG osg-attributes.conf: YES
Checking scheduler types associated with remote jobmanagers: PASS
  jobmanager is of type managedfork
  jobmanager-condor is of type condor
  jobmanager-fork is of type managedfork
  jobmanager-managedfork is of type managedfork
Checking for paths to binaries of remote schedulers: PASS
  Path to condor binaries is /opt/condor/bin
  Path to managedfork binaries is $env/gratia/var/data
Checking remote scheduler status: PASS
  condor : 1 jobs running, 0 jobs idle/pending
Checking if Globus is deployed from the VDT: YES; version 1.6.1d
Checking for OSG version: YES; version 0.6.0
Checking for OSG grid3-user-vo-map.txt: YES
  ivdgl users: ivdgl
  i2u2 users: i2u2
  geant4 users: geant4
  grow users: grow
  osgedu users: osgedu
  nanohub users: nanohub
  gridex users: gridex
  fmri users: fmri
  DOSAR users: dosar
  osg users: osg
  usatlas users: usatlas1,usatlas2,usatlas3,usatlas4
  LIGO users: ligo
  star users: star
  uscms users: uscms02,uscms01
  grase users: grase
  glow users: glow
  fermilab users: fermilab
  dzero users: sam,samgrid
  mis users: mis
  des users: des
  sdss users: sdss
  gadu users: gadu
Checking for OSG site name: AGLT2
Checking for OSG $GRID3 definition: /afs/atlas.umich.edu/OSG060
Checking for OSG $OSG_GRID definition: /afs/atlas.umich.edu/OSG060
Checking for OSG $APP definition: /atlas/data08/OSG/APP
Checking for OSG $DATA definition: /atlas/data08/OSG/DATA
Checking for OSG $TMP definition: /atlas/data08/OSG/DATA
Checking for OSG $WNTMP definition: /tmp
Checking for OSG $OSG_GRID existence: PASS
Checking for OSG $APP existence: PASS
Checking for OSG $DATA existence: PASS
Checking for OSG $TMP existence: PASS
Checking for OSG $APP writability: PASS
Checking for OSG $DATA writability: PASS
Checking for OSG $TMP writability: PASS
Checking for OSG $APP available space: 227.954 GB
Checking for OSG $DATA available space: 227.954 GB
Checking for OSG $TMP available space: 227.954 GB
Checking for OSG additional site-specific variable definitions: YES
  <No Location List Name>
    ATLAS_APP prod /atlas/data08/OSG/APP/atlas_app
    ATLAS_DATA prod /atlas/data08/OSG/DATA/atlas_data
    ATLAS_DQ2Cli prod /atlas/data08/OSG/DATA/atlas_app/dq2_cli/DQ2Cli
    ATLAS_LOC_1103 11.0.3 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.3
    ATLAS_LOC_11042 11.0.42 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.42
    ATLAS_LOC_1105 11.0.5 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.5
    ATLAS_LOC_1201 12.0.1 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.1
    ATLAS_LOC_1202 12.0.2 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.2
    ATLAS_LOC_1203 12.0.3 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.3
    ATLAS_LOC_1204 12.0.4 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.4
    ATLAS_LOC_1205 12.0.5 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.5
    ATLAS_LOC_1206 12.0.6 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.6
    ATLAS_LOC_1230 12.3.0 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.3.0
    ATLAS_LOC_GCC 3.2 /atlas/data08/OSG/APP/atlas_app/gcc32
    ATLAS_LOC_GCE prod /atlas/data08/OSG/APP/atlas_app/GCE-Server/gce-server
    ATLAS_LOC_KitVal prod /atlas/data08/OSG/APP/atlas_app/atlas_rel/kitval/KitValidation
    ATLAS_LOC_Trfs prod /atlas/data08/OSG/APP/atlas_app/Atlas-Trfs/atlas-trfs
    ATLAS_PYTHONHOME prod /atlas/data08/OSG/DATA/atlas_app/python
    ATLAS_STAGE prod /atlas/data08/OSG/DATA/atlas_data
Checking for OSG execution jobmanager(s): gate01.aglt2.org/jobmanager-condor
Checking for OSG utility jobmanager(s): gate01.aglt2.org/jobmanager
Checking for OSG sponsoring VO: usatlas:80 local:20
Checking for OSG policy expression: NONE
Checking for OSG setup.sh: YES
Checking for OSG $Monalisa_HOME definition: /afs/atlas.umich.edu/OSG060/MonaLisa
Checking for MonALISA configuration: PASS
  key ml_env vars:
    FARM_NAME = AGLT2
    FARM_HOME = /afs/atlas.umich.edu/OSG060/MonaLisa/Service/VDTFarm
    FARM_CONF_FILE = /afs/atlas.umich.edu/OSG060/MonaLisa/Service/VDTFarm/vdtFarm.conf
    SHOULD_UPDATE = false
    URL_LIST_UPDATE = http://monalisa.cacr.caltech.edu/FARM_ML,http://monalisa.cern.ch/MONALISA/FARM_ML
  key ml_properties vars:
    lia.Monitor.group = OSG
    lia.Monitor.useIPaddress = undef
    MonaLisa.ContactEmail = smckee@umich.edu
Checking for a running MonALISA: PASS
  MonALISA is ALIVE (pid 3843)
  MonALISA_Version = 1.6.8-200611241031
  MonALISA_VDate = 2006-11-24
  VoModulesDir = VoModules-v0.36
  tcpServer_Port = 9002
  storeType = epgsqldb
Checking for a running GANGLIA gmond daemon: PASS (pid 32215 ...)
  /opt/ganglia/sbin/gmond
  name "UMOR"
  owner "UM ATLAS Physics"
  url "http://umopt1.grid.umich.edu/"
Checking for a running GANGLIA gmetad daemon: PASS (pid 3324 ...)
  /usr/sbin/gmetad
  trusted_hosts 127.0.0.1 141.211.43.112 10.10.1.1
Checking for a running gsiftp server: YES; port 2811
Checking gsiftp (local client, local host -> remote host): PASS
Checking gsiftp (local client, remote host -> local host): PASS
Checking that no differences exist between gsiftp'd files: PASS
Checking for VDS existence: PASS
Checking for VDS kickstart existence: PASS
Checking for VDS k.2 (OPTIONAL) existence: PASS
Checking for VDS dirmanager existence: PASS
Checking for VDS invoke existence: PASS
Checking for VDS transfer existence: PASS
Checking for VDS T2 (OPTIONAL) existence: PASS
Checking for VDS seqexec existence: PASS
Checking for VDS mpiexec (OPTIONAL) existence: FAIL
-------------------------------------------------------------------------------
------------- End gate01.aglt2.org at Thu Apr  5 22:33:55 2007 GMT ------------
-------------------------------------------------------------------------------
===============================================================================
Info: Site verification completed at Thu Apr  5 22:33:55 2007 GMT.
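
The only FAIL in the output above is the optional VDS mpiexec check. As a rough sketch (not part of the original procedure), assuming the verification output was saved to a file, the failing checks can be separated from the optional ones with grep. The file name site_verify.log and both function names are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Sketch only: helper functions for scanning saved site-verify output.
# The log file name used in the usage note is an assumption, not
# something the verification script creates by itself.

# list_failures FILE: print every "Checking ..." line that ended in FAIL
list_failures() {
    grep -E '^Checking .*: FAIL' "$1"
}

# list_required_failures FILE: same as above, but skip checks that the
# verification output itself marks as (OPTIONAL)
list_required_failures() {
    list_failures "$1" | grep -v '(OPTIONAL)'
}
```

Running `list_required_failures site_verify.log` on the output shown above should print nothing, since the mpiexec check is marked OPTIONAL.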

Next I registered with OSG.

-- ShawnMcKee - 02 Apr 2007
Topic revision: r28 - 16 Oct 2009, TomRockwell