Installation of OSG 0.6.0 on gate01.aglt2.org
The installation procedure for OSG 0.6.0 on gate01.aglt2.org is below. It was installed on April 2nd, 2007.
Please refer to the OSG CE Installation Twiki for details about the procedure.
Setup and Preparation
First the gate01.aglt2.org node was updated via yum and the network setup was converted to use aglt2.org instead of the original grid.umich.edu.
A set of DOE Grids certificates was obtained for:
- Host: gate01.aglt2.org in /etc/grid-security
- LDAP: ldap/gate01.aglt2.org in /etc/grid-security/ldap
- HTTP: http/gate01.aglt2.org in /etc/grid-security/http
The script that was used to install VDT160 was updated on gate01.aglt2.org. It is in /root/install_osg.sh (original script was called install_vdt.sh):
#!/bin/bash
#
# Make sure we setup/install OK in AFS space for 32 bit
#
#
export VDT_PRETEND_32=1
# To try to preserve some settings from a prior install set this...
#export OLD_VDT_LOCATION=/afs/atlas.umich.edu/OSG
# Use existing CONDOR install
export VDTSETUP_CONDOR_LOCATION=/opt/condor
export VDTSETUP_CONDOR_CONFIG=$VDTSETUP_CONDOR_LOCATION/etc/condor_config
# Setup Pacman
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.19.tar.gz
tar --no-same-owner -xzvf pacman-3.19.tar.gz
cd /opt/pacman-3.19
source setup.sh
# Make sure we have AFS admin tokens
token=`tokens | grep "AFS ID" | awk '{print $4}' | awk -F\) '{print $1}'`
# Default to 0 so the test still works when no AFS token is held
if [ "${token:-0}" -ne 1 ]
then
kinit admin
aklog
fi
# Create volume for install
vos create linat08.grid.umich.edu /vicepg OSG060 10000000 -verbose
fs mkmount /afs/.atlas.umich.edu/OSG060 OSG060 -rw
vos release root.cell
vos release root.afs
fs checkvolumes
# Goto install directory
cd /afs/atlas.umich.edu/OSG060
# Set AFS ACLs to allow full access to everything during install
fs setacl /afs/atlas.umich.edu/OSG060 system:anyuser rliwd
# set umask
umask 0022
# Do installation and answer questions
pacman -get OSG:ce
echo " Finished OSG:ce install "
echo " "
echo " "
# Done? Reset ACLs
echo "Must reset ACLs on AFS install area..."
find /afs/atlas.umich.edu/OSG060 -type d -exec /usr/bin/fs setacl {} system:anyuser rl \;
# Setup environment
cd /afs/atlas.umich.edu/OSG060
source setup.sh
# Get Condor-Setup
echo " "
echo " Installing OSG:Globus-Condor-Setup "
echo " "
pacman -get OSG:Globus-Condor-Setup
# Setup managed fork
echo " "
echo " Installing OSG:ManagedFork "
echo " "
pacman -get OSG:ManagedFork
echo " "
echo " Done with Pacman installs "
echo " "
# Configure default jobmanager to be managed fork
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y
# Protect certificates so services can get them...
chown -R daemon.daemon /etc/grid-security/ldap
chown -R daemon.daemon /etc/grid-security/http
The script was started around 1:35 PM and all prompts were answered with the defaults.
It finished successfully (no errors) around 2:38 PM.
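The token check in the script depends on the layout of the `tokens` output. Below is a minimal sketch of a more defensive version. It assumes the usual OpenAFS line format `User's (AFS ID 1) tokens for ...`, and the helper names `afs_id_from_tokens` and `ensure_admin_tokens` are hypothetical; they are not part of the script that was actually run.

```shell
#!/bin/bash
# Hypothetical helper: read `tokens` output on stdin and print the first
# AFS ID found. Assumes lines of the form:
#   User's (AFS ID 1) tokens for afs@cell [Expires ...]
afs_id_from_tokens() {
    sed -n 's/.*(AFS ID \([0-9][0-9]*\)).*/\1/p' | head -n 1
}

# Hypothetical wrapper: obtain admin tokens unless AFS ID 1 is already held.
ensure_admin_tokens() {
    local id
    id=$(tokens 2>/dev/null | afs_id_from_tokens)
    # ${id:-0} keeps the numeric test valid when no token is held at all
    if [ "${id:-0}" -ne 1 ]; then
        kinit admin && aklog
    fi
}
```

Splitting the parsing from the `kinit`/`aklog` step makes it possible to check the parsing against captured `tokens` output without touching any credentials.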
Post-Installation Work
There are a number of configuration and verification steps needed after the installation.
We would like to run the new managed fork capability so this will need to be installed and configured (see below).
All needed certificates (host, ldap and http) were already obtained (see above). Some additional setup was noted during the install:
Before you can request user, host or service certificates with:
/afs/atlas.umich.edu/OSG060/globus/bin/grid-cert-request
you must first configure your GSI settings with
/afs/atlas.umich.edu/OSG060/vdt/setup/setup-cert-request
Running this gives:
[gate01:grid-security]# source /afs/atlas.umich.edu/OSG060/setup.sh
[gate01:grid-security]# /afs/atlas.umich.edu/OSG060/vdt/setup/setup-cert-request
Reading from /afs/atlas.umich.edu/OSG060/globus/TRUSTED_CA
Using hash: 1c3f2ca8
Setting up grid-cert-request
Running grid-security-config...
Before you use the Grid Security Infrastructure, you should first
define the DN (distinguished name) that should be used for your
organization's X509 certificates. If you do not define a DN,
a default DN will be assigned to you.
For some questions, a default response is given in [].
Pressing RETURN in response to such a question will enable the default.
This script will overwrite the file --
/afs/atlas.umich.edu/OSG060/globus/etc/grid-security.conf
========================================================================
(1) Base DN for user certificates
[ OU=People,DC=doegrids,DC=org ]
(2) Base DN for host certificates
[ OU=Services,DC=doegrids,DC=org ]
========================================================================
(q) save, configure the GSI and Quit
(c) Cancel (exit without saving or configuring)
(h) Help
========================================================================
q
Successfully created cert request configuration files in:
/afs/atlas.umich.edu/OSG060/globus/etc
I then set up the ManagedFork limits by editing /opt/condor/etc/condor_config.local and adding:
# ManagedFork limit
START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 20 || GridMonitorJob =?= TRUE
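The edit above was made by hand. A minimal sketch of doing it repeatably is below; the function name `add_managedfork_limit` is hypothetical, and the default of 20 is simply the value used above.

```shell
#!/bin/bash
# Hypothetical helper: append the ManagedFork limit to a Condor local config
# file unless a START_LOCAL_UNIVERSE line is already present.
#   $1 = config file, $2 = job limit (defaults to 20, the value used here)
add_managedfork_limit() {
    local cfg limit
    cfg=$1
    limit=${2:-20}
    if grep -q '^[[:space:]]*START_LOCAL_UNIVERSE[[:space:]]*=' "$cfg" 2>/dev/null; then
        echo "START_LOCAL_UNIVERSE already set in $cfg; leaving it alone"
        return 0
    fi
    {
        echo "# ManagedFork limit"
        echo "START_LOCAL_UNIVERSE = TotalLocalJobsRunning < $limit || GridMonitorJob =?= TRUE"
    } >> "$cfg"
}
```

On this host it would be called as `add_managedfork_limit /opt/condor/etc/condor_config.local`. Running it a second time leaves the file unchanged, so it is safe to include in a reinstall script. Condor still has to re-read its configuration afterwards.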
Then I noticed that /etc/grid-security/certificates was still pointing to the old VDT160 install location. I reset it to point to the OSG060 one:
ln -sfn /afs/atlas.umich.edu/OSG060/globus/share/certificates /etc/grid-security/certificates
The instructions say to check the /etc/xinetd.d files (which were old). But it seems you need to run vdt-control first:
[gate01:OSG060]# which vdt-control
/afs/atlas.umich.edu/OSG060/vdt/sbin/vdt-control
[gate01:OSG060]# vdt-control --on
enabling cron service fetch-crl... ok
enabling cron service vdt-rotate-logs... ok
skipping init service 'gris' -- marked as disabled
enabling inetd service globus-gatekeeper... FAILED! (see vdt-install.log)
found conflicting, non-VDT entry for service globus-gatekeeper in /etc/services;
use the --force option to remove the entry
enabling inetd service gsiftp... FAILED! (see vdt-install.log)
found conflicting, non-VDT entry for service gsiftp in /etc/services;
use the --force option to remove the entry
enabling init service mysql... FAILED! (see vdt-install.log)
conflicting, non-VDT file: /etc/rc.d/init.d/mysql
use the --force option to backup and overwrite
enabling init service globus-ws... FAILED! (see vdt-install.log)
conflicting, non-VDT file: /etc/rc.d/init.d/globus-ws
use the --force option to backup and overwrite
skipping cron service 'edg-mkgridmap' -- marked as disabled
enabling cron service gums-host-cron... ok
skipping init service 'MLD' -- marked as disabled
enabling init service apache... FAILED! (see vdt-install.log)
conflicting, non-VDT file: /etc/rc.d/init.d/apache
use the --force option to backup and overwrite
enabling init service tomcat-5... FAILED! (see vdt-install.log)
conflicting, non-VDT file: /etc/rc.d/init.d/tomcat-5
use the --force option to backup and overwrite
enabling cron service gratia-condor... ok
So it seems there are some issues to resolve. I first edited my 'root' crontab to remove the old VDT160 entries. I ended up redoing the command as
vdt-control --on --force
which backs up and overwrites all services.
I turned back on MLD via:
vdt-register-service --name MLD --enable
The errors from vdt-control are caused by AFS issues. See the section below on how we resolved them.
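With this many services it is easy to miss a failure in the vdt-control output. The sketch below pulls out just the names of the services that failed. It assumes the output format shown above, and the function name `failed_vdt_services` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read `vdt-control --on` output on stdin and print the
# name of every service whose "enabling ..." line ended in FAILED!.
failed_vdt_services() {
    sed -n 's/^enabling [a-z]* service \(.*\)\.\.\. FAILED!.*$/\1/p'
}
```

It would be used as `vdt-control --on 2>&1 | failed_vdt_services`. For the run shown above it would list globus-gatekeeper, gsiftp, mysql, globus-ws, apache and tomcat-5.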
Installation on AFS Issues
Since we are installing in AFS we have to fix some issues. The rationale for using AFS is that other clients can utilize this installation. However, because the files are in AFS, we are prevented from writing files, including log files, unless we hold appropriate tokens. Additionally, if multiple clients were to use this installation they would each require their own config/setup information.
Making this setup functional requires us to locate all files (or even whole directories) which have "log"-type files or configuration information and soft-link them to a local (writeable) filesystem. I chose /opt/OSG060 as the base of the soft-link area. We need to identify every file/directory which must be written OR contains configuration information and:
- Make a copy of this file or directory into /opt/OSG060/...
- Rename the original in AFS to <filename>.orig
- Create a soft-link in AFS to the new location /opt/OSG060/...
Note that this procedure must be done for the FIRST installation. Succeeding installations need to copy (and edit) the files in <filename>.orig to the correct name in their local /opt/OSG060 area so the existing soft-links in AFS point to a valid location. Of course, any files that need edits to set up the correct configuration for the new host must also be edited.
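The step for succeeding installations can be sketched as below. This is only an illustration and was not part of the original work. It assumes the saved originals carry the `.orig` suffix used by the relocation scripts on this page, and the function name `seed_local_from_orig` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper for a second host: for every <name>.orig left in the
# shared tree, seed <local>/<same relative path>/<name> if it is missing, so
# the soft-links already present in the shared tree resolve on this host.
#   $1 = shared (AFS) install root, $2 = local redirect root
seed_local_from_orig() {
    local shared local_root orig rel dest
    shared=$1
    local_root=$2
    find "$shared" -name '*.orig' -not -path '*/o..pacman..o/*' -print |
    while IFS= read -r orig; do
        rel=${orig#"$shared"/}        # path relative to the shared root
        dest=$local_root/${rel%.orig} # same path, without the .orig suffix
        [ -e "$dest" ] && continue    # never overwrite host-specific edits
        mkdir -p "$(dirname "$dest")"
        cp -a "$orig" "$dest"
        echo "seeded $dest"
    done
}
```

A call such as `seed_local_from_orig /afs/atlas.umich.edu/OSG060 /opt/OSG060` would leave existing local files untouched. One caveat: any file or directory that the distribution itself ships with a `.orig` suffix is picked up as well, so the list of seeded paths should be reviewed, and host-specific settings still have to be edited by hand.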
First attempt to locate all needed files for relocation:
[gate01:globus]# pwd
/afs/atlas.umich.edu/OSG060/globus
[gate01:globus]# cd ..
[gate01:OSG060]# ffind .log
./o..pacman..o/logs/pacman.log
./o..pacman..o/logs/wget.log
./o..pacman..o/logs/shellout.log
./vdt-install.log
./vdt/etc/vdt.logrotate
./vdt/backup/vdt/vdt/etc/vdt.logrotate_001_20070402-214336
./vdt/backup/vdt/vdt/etc/vdt.logrotate_002_20070402-214525
./vdt/backup/vdt/vdt/etc/vdt.logrotate_003_20070402-214550
./vdt/backup/vdt/vdt/etc/vdt.logrotate_004_20070402-221532
./vdt/backup/vdt/vdt/etc/vdt.logrotate_005_20070402-221758
./vdt/backup/vdt/vdt/etc/vdt.logrotate_006_20070402-223331
./vdt/backup/vdt/vdt/etc/vdt.logrotate_007_20070405-073139
./globus/var/globus-fork.log
./globus/var/log/gridftp.log
./globus/var/log/gridftp-auth.log
./globus/var/gridftp.log
./globus/var/globus-condor.log
./globus/var/globus-gatekeeper.log
./globus/var/accounting.log
./globus/setup/globus/config.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testdata/gridftp.log.SAVE
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-SGE.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-FBS.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-LSF.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-CONDOR.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VoJobs-PBS.log
./MonaLisa/Service/usr_code/VoModules-v0.36/testlogs/VOgsiftpIO.log
./apache/logs/mod_jk.log
./gratia/var/logs/gratia-probe-condor.log
We also need to find all configuration files:
[gate01:OSG060]# ffind ".conf"
./vdt/etc/package_data/Fetch-CRL.configdiff
./vdt/etc/package_data/Globus-Base-Info-Server.configdiff
./vdt/etc/package_data/Globus-Base-RM-Server.configdiff
./vdt/etc/package_data/Globus-Base-Data-Server.configdiff
./vdt/etc/package_data/Globus-Base-WS-Essentials.configdiff
./vdt/etc/package_data/Globus-Base-RFT-Server.configdiff
./vdt/etc/package_data/Globus-Base-WSGRAM-Server.configdiff
./vdt/etc/package_data/GUMS-Client.configdiff
./vdt/etc/package_data/Job-Environment.configdiff
./vdt/etc/package_data/MonaLisa.configdiff
./vdt/etc/package_data/Apache.configdiff
./vdt/etc/package_data/Tomcat-5.configdiff
./vdt/etc/package_data/CEMon.configdiff
./vdt/etc/package_data/Gratia.configdiff
./vdt/backup/vdt/globus/etc/globus-job-manager.conf_001_20070402-214525
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_001_20070402-214525
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_002_20070402-223331
./vdt/backup/vdt/globus/etc/globus-gatekeeper.conf_003_20070405-073138
./vdt/backup/vdt/apache/conf/httpd.conf_001_20070402-221531
./vdt/backup/vdt/apache/conf/extra/httpd-ssl.conf_001_20070402-221531
./vdt/backup/vdt/apache/conf/httpd.conf_002_20070402-221758
./vdt/backup/vdt/apache/conf/httpd.conf_003_20070402-221826
./monitoring/osg-attributes.conf
./monitoring/grid3-info.conf
./globus/etc/grid3-info.conf
./globus/etc/osg-attributes.conf
./globus/etc/openldap/ldap.conf
./globus/etc/openldap/ldap.conf.default
./globus/etc/openldap/ldapfilter.conf
./globus/etc/openldap/ldapfilter.conf.default
./globus/etc/openldap/ldaptemplates.conf
./globus/etc/openldap/ldaptemplates.conf.default
./globus/etc/openldap/ldapsearchprefs.conf
./globus/etc/openldap/ldapsearchprefs.conf.default
./globus/etc/openldap/slapd.conf.default
./globus/etc/openldap/slapd.conf
./globus/etc/grid-info.conf
./globus/etc/grid-info-resource-ldif.conf
./globus/etc/grid-info-resource-register.conf
./globus/etc/gridftp-resource.conf
./globus/etc/grid-info-slapd.conf
./globus/etc/grid-info-site-giis.conf
./globus/etc/grid-info-site-policy.conf
./globus/etc/grid-info-server-env.conf
./globus/etc/grid-info-deployment-comments.conf
./globus/etc/globus-gatekeeper.conf
./globus/etc/globus-fork.conf
./globus/etc/globus-job-manager.conf
./globus/etc/gridftp.conf
./globus/etc/globus_wsrf_test_unit/local-config-authz-test.conf
./globus/etc/globus_gram_local_proxy_tool.conf
./globus/etc/globus-condor.conf
./globus/etc/grid-security.conf
./globus/etc/globus-user-ssl.conf
./globus/etc/globus-host-ssl.conf
./globus/share/certificates/doegrids/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids/grid-security.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/doegrids.orig/grid-security.conf.1c3f2ca8
./globus/share/certificates/globus-host-ssl.conf.1c3f2ca8
./globus/share/certificates/globus-user-ssl.conf.1c3f2ca8
./globus/share/certificates/grid-security.conf.1c3f2ca8
./globus/share/myproxy/myproxy-server.config
./globus/share/myproxy/etc.inetd.conf.modifications
./globus/man/man5/ldap.conf.5
./globus/man/man5/ldapfilter.conf.5
./globus/man/man5/ldapsearchprefs.conf.5
./globus/man/man5/ldaptemplates.conf.5
./globus/man/man5/slapd.conf.5
./globus/man/man5/ud.conf.5
./globus/man/man5/myproxy-server.config.5
./globus/setup/globus/globus_gaa.conf
./globus/setup/globus/globus_gaa_custom.conf
./globus/setup/globus/gsi-gaa.conf.tmpl
./globus/setup/globus/grid-info.conf.in
./globus/setup/globus/grid-info.conf
./globus/setup/globus/grid-info-resource-ldif.conf.in
./globus/setup/globus/grid-info-resource-register.conf.in
./globus/setup/globus/grid-info-slapd.conf.in
./globus/setup/globus/grid-info-site-giis.conf.in
./globus/setup/globus/grid-info-site-policy.conf.in
./globus/setup/globus/grid-info-server-env.conf.in
./globus/setup/globus/gridftp-resource.conf.in
./globus/setup/globus/grid-info-deployment-comments.conf
./globus/setup/globus/grid-info-resource-ldif.conf
./globus/setup/globus/grid-info-resource-register.conf
./globus/setup/globus/grid-info-slapd.conf
./globus/setup/globus/grid-info-server-env.conf
./globus/setup/globus/gridftp-resource.conf
./globus/setup/globus/grid-info-site-giis.conf
./globus/setup/globus/grid-info-site-policy.conf
./post-install/gsi-authz.conf
./post-install/prima-authz.conf
./gpt/etc/gpt/globus_flavor_labels.conf
./lcg/etc/add-attributes.conf.example
./lcg/etc/alter-attributes.conf.example
./edg/etc/edg-mkgridmap.conf
./edg/etc/edg-mkgridmap.conf.orig
./edg/share/doc/edg-mkgridmap-conf-2.8.1/html/edg-mkgridmap.conf.html
./edg/share/man/man5/edg-mkgridmap.conf.5.gz
./glite/etc/glite-ce-ce-plugin/lcg-info-generic.conf.example.lsf
./glite/etc/glite-ce-ce-plugin/lcg-info-generic.conf.example.pbs
./MonaLisa/Service/usr_code/XDRUDP/XDRUDP.conf
./MonaLisa/Service/usr_code/NetFlowModule/NetFlow.config
./MonaLisa/Service/VDTFarm/vdtFarm.conf
./MonaLisa/Service/VDTFarm/db.conf.embedded
./python/lib/python2.3/config/Setup.config
./apache/etc/pear.conf
./apache/conf/original/extra/httpd-userdir.conf
./apache/conf/original/extra/httpd-mpm.conf
./apache/conf/original/extra/httpd-multilang-errordoc.conf
./apache/conf/original/extra/httpd-manual.conf
./apache/conf/original/extra/httpd-ssl.conf
./apache/conf/original/extra/httpd-autoindex.conf
./apache/conf/original/extra/httpd-info.conf
./apache/conf/original/extra/httpd-dav.conf
./apache/conf/original/extra/httpd-vhosts.conf
./apache/conf/original/extra/httpd-languages.conf
./apache/conf/original/extra/httpd-default.conf
./apache/conf/original/httpd.conf
./apache/conf/httpd.conf.bak
./apache/conf/extra/httpd-userdir.conf
./apache/conf/extra/httpd-mpm.conf
./apache/conf/extra/httpd-multilang-errordoc.conf
./apache/conf/extra/httpd-manual.conf
./apache/conf/extra/httpd-ssl.conf
./apache/conf/extra/httpd-autoindex.conf
./apache/conf/extra/httpd-info.conf
./apache/conf/extra/httpd-dav.conf
./apache/conf/extra/httpd-vhosts.conf
./apache/conf/extra/httpd-languages.conf
./apache/conf/extra/httpd-default.conf
./apache/conf/httpd.conf
I created a shell script (bash) for the first-time relocations needed for all such files. It is meant to be run from $VDT_LOCATION only once after you have done the initial install into an AFS location. You need to specify the "redirect" local directory which will host the writeable files and be soft-linked to.
The list of files/directories for OSG 0.6.0 (VDT 1.6.1) is:
[gate01:opt]# ls -R OSG060/
OSG060/:
apache/ gpt/ monitoring/ relocate_osg_logs.log
edg/ gratia/ post-install/ vdt-install.log
globus/ MonaLisa/ relocate
OSG060/apache:
conf/ etc/ logs/
OSG060/apache/conf:
extra/ httpd.conf original/
OSG060/apache/conf/extra:
httpd-autoindex.conf httpd-languages.conf httpd-ssl.conf
httpd-dav.conf httpd-manual.conf httpd-userdir.conf
httpd-default.conf httpd-mpm.conf httpd-vhosts.conf
httpd-info.conf httpd-multilang-errordoc.conf
OSG060/apache/conf/original:
extra/ httpd.conf
OSG060/apache/conf/original/extra:
httpd-autoindex.conf httpd-languages.conf httpd-ssl.conf
httpd-dav.conf httpd-manual.conf httpd-userdir.conf
httpd-default.conf httpd-mpm.conf httpd-vhosts.conf
httpd-info.conf httpd-multilang-errordoc.conf
OSG060/apache/etc:
pear.conf
OSG060/apache/logs:
mod_jk.log
OSG060/edg:
etc/
OSG060/edg/etc:
edg-mkgridmap.conf
OSG060/globus:
etc/ setup/ var/
OSG060/globus/etc:
globus-condor.conf grid-info-deployment-comments.conf
globus-fork.conf grid-info-resource-ldif.conf
globus-gatekeeper.conf grid-info-resource-register.conf
globus_gram_local_proxy_tool.conf grid-info-server-env.conf
globus-job-manager.conf grid-info-site-giis.conf
globus_wsrf_test_unit/ grid-info-site-policy.conf
gridftp.conf grid-info-slapd.conf
gridftp-resource.conf openldap/
grid-info.conf
OSG060/globus/etc/globus_wsrf_test_unit:
local-config-authz-test.conf
OSG060/globus/etc/openldap:
ldap.conf ldapsearchprefs.conf slapd.conf
ldapfilter.conf ldaptemplates.conf
OSG060/globus/setup:
globus/
OSG060/globus/setup/globus:
config.log grid-info-resource-ldif.conf
globus_gaa.conf grid-info-resource-register.conf
globus_gaa_custom.conf grid-info-server-env.conf
gridftp-resource.conf grid-info-site-giis.conf
grid-info.conf grid-info-site-policy.conf
grid-info-deployment-comments.conf grid-info-slapd.conf
OSG060/globus/var:
accounting.log globus-condor.log globus-fork.log globus-gatekeeper.log log/
OSG060/globus/var/log:
gridftp-auth.log gridftp.log
OSG060/gpt:
etc/
OSG060/gpt/etc:
gpt/
OSG060/gpt/etc/gpt:
globus_flavor_labels.conf
OSG060/gratia:
var/
OSG060/gratia/var:
logs/
OSG060/gratia/var/logs:
gratia-probe-condor.log
OSG060/MonaLisa:
Service/
OSG060/MonaLisa/Service:
usr_code/ VDTFarm/
OSG060/MonaLisa/Service/usr_code:
VoModules-v0.36/ XDRUDP/
OSG060/MonaLisa/Service/usr_code/VoModules-v0.36:
testlogs/
OSG060/MonaLisa/Service/usr_code/VoModules-v0.36/testlogs:
VOgsiftpIO.log VoJobs-FBS.log VoJobs-PBS.log
VoJobs-CONDOR.log VoJobs-LSF.log VoJobs-SGE.log
OSG060/MonaLisa/Service/usr_code/XDRUDP:
XDRUDP.conf
OSG060/MonaLisa/Service/VDTFarm:
vdtFarm.conf
OSG060/monitoring:
osg-attributes.conf
OSG060/post-install:
gsi-authz.conf prima-authz.conf
This was after running the relocate_OSG.sh script. Now I need to try to begin running services and doing the needed post-install configuration. I will likely identify additional files requiring relocation.
OK...the following files should also be relocated: *.pid, *.err, *.lock, *.properties. At this point I am going to go service by service:
MySQL
For the mysql service we need to move the whole var directory into opt:
- mkdir /opt/OSG060/mysql
- cp -arv /afs/atlas.umich.edu/OSG060/mysql/var /opt/OSG060/mysql/
- mv /afs/atlas.umich.edu/OSG060/mysql/var /afs/atlas.umich.edu/OSG060/mysql/var.orig
- ln -s /opt/OSG060/mysql/var /afs/atlas.umich.edu/OSG060/mysql/var
Then retry startup. Still fails. Set ownership of /opt/OSG060/mysql to mysql:
chown -R mysql.osg /opt/OSG060/mysql
Still fails.
The problem is the vdt-app-data directory. We need to relocate it as well:
- cp -arv /afs/atlas.umich.edu/OSG060/vdt-app-data /opt/OSG060/
- mv /afs/atlas.umich.edu/OSG060/vdt-app-data /afs/atlas.umich.edu/OSG060/vdt-app-data.orig
- ln -s /opt/OSG060/vdt-app-data /afs/atlas.umich.edu/OSG060/vdt-app-data
- chown -R mysql.osg /opt/OSG060/vdt-app-data/mysql
Now MySQL starts OK.
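The same copy, rename and soft-link sequence comes up for every directory that has to be moved as a whole. A minimal sketch of it as a reusable function is below; the name `relocate_dir` and its argument order are hypothetical and simply mirror the manual steps above.

```shell
#!/bin/bash
# Hypothetical helper: move a whole directory out of the shared tree.
#   $1 = directory inside the shared install
#   $2 = shared (AFS) install root
#   $3 = local redirect root
relocate_dir() {
    local dir shared local_root dest
    dir=$1
    shared=$2
    local_root=$3
    dest=$local_root/${dir#"$shared"/}
    if [ -L "$dir" ]; then
        echo "$dir is already a soft-link; nothing to do"
        return 0
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists; refusing to overwrite" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")" || return 1
    cp -a "$dir" "$dest" || return 1   # copy, keeping ownership and modes
    mv "$dir" "$dir.orig" || return 1  # keep the original for later hosts
    ln -s "$dest" "$dir"               # shared path now resolves locally
}
```

For the MySQL case this would be `relocate_dir /afs/atlas.umich.edu/OSG060/mysql/var /afs/atlas.umich.edu/OSG060 /opt/OSG060`, followed by the same `chown` as above. The early return for an existing soft-link means the function can be rerun after a partial failure.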
MonALISA
Startup failed because the $VDT_LOCATION/MonaLisa/Service/VDTFarm/ML.log file was missing. The VDTFarm directory should be relocated. We need to "undo" the soft-links for files in this directory and move the whole directory. Once this was done MLD started OK.
We still needed to relocate two more MLD config files in $VDT_LOCATION/MonaLisa/Service/VDTFarm/CMD: site_env and ml_env
Globus (globus-ws)
We need to redirect the whole $VDT_LOCATION/globus/var directory. Then we try vdt-control --off and then --on. The globus-ws still fails to start but all other services seem to be working. The error in ../var/container.log is:
Failed to start container: Container failed to initialize [Caused by: Address already in use]
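"Address already in use" means some other process is already bound to the container's port. The sketch below filters `netstat -tlnp` style output for one port. The port number is not given in the log above; 8443 is used in the example only as an assumption, and the function name `listeners_on_port` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read `netstat -tlnp` output on stdin and print the
# PID/program column of every socket listening on the given TCP port.
#   $1 = port number
listeners_on_port() {
    local port
    port=$1
    # $4 is the local address, $6 the state, $7 the PID/program name.
    # Matching on ":<port>" at the end of $4 avoids hits such as 18443.
    awk -v p=":$port" '
        $6 == "LISTEN" && substr($4, length($4) - length(p) + 1) == p { print $7 }
    '
}
```

Run as root so the PID column is populated: `netstat -tlnp | listeners_on_port 8443`. If a leftover container from the old VDT160 install shows up, stopping it before retrying `vdt-control --on` would be the obvious next step to try.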
The scripts
The scripts I used to relocate the initial AFS install are below:
relocate_OSG.sh
#!/bin/bash
#
# This script is meant to be run ONCE after installing OSG into an AFS area
# This script will
# 1) make sure the user has 'admin' tokens in the AFS cell
# 2) source the OSG setup.sh file
# 3) locate any .log or .conf files and:
# a) copy them to an equivalent location under a local dir
# b) rename the original to <name>.orig
# c) create a new soft-link from the <name> to its local location
#
# Shawn McKee <smckee@umich.edu>, April 5, 2007
#######################################################
export LOCAL="/opt/OSG060"
export OSG="/afs/atlas.umich.edu/OSG060"
echo " This script is meant to be run ONCE after installing OSG into an"
echo " AFS location."
# Make sure we have AFS admin tokens
token=`tokens | grep "AFS ID" | awk '{print $4}' | awk -F\) '{print $1}'`
# Default to 0 so the test still works when no AFS token is held
if [ "${token:-0}" -ne 1 ]
then
kinit admin
aklog
fi
# Goto install directory and source the OSG setup (defines $VDT_LOCATION)
cd $OSG
source setup.sh
# Make sure "base" local directory exists
mkdir -p $LOCAL
# First lets relocate some whole directories:
# $VDT_LOCATION/vdt-app-data
# $VDT_LOCATION/MonaLisa/Service/VDTFarm
# $VDT_LOCATION/globus/var
#
echo " Finding/relocating .conf files..."
find $VDT_LOCATION -name "*.conf" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .log files..."
find $VDT_LOCATION -name "*.log" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .properties files..."
find $VDT_LOCATION -name "*.properties" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .err files..."
find $VDT_LOCATION -name "*.err" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .lock files..."
find $VDT_LOCATION -name "*.lock" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating .pid files..."
find $VDT_LOCATION -name "*.pid" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
# Find some SPECIFIC MonaLisa files which don't match the pattern
echo " Finding/relocating ml_env files..."
find $VDT_LOCATION -name "ml_env" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
echo " Finding/relocating site_env files..."
find $VDT_LOCATION -name "site_env" -not -type l -not -path "*o..pacman*" -not -path "*.orig*" -exec $VDT_LOCATION/relocate_file.sh '{}' $OSG $LOCAL \;
relocate_file.sh
#!/bin/bash
#
# This script is meant to be run ONCE after installing OSG into an AFS area
# This script will take inputs for
# 1) File to be relocated
# 2) Basedir of OSG install ($VDT_LOCATION)
# 3) Local redirect directory
# and:
# a) copy file to an equivalent location under a local dir
# b) rename the original to <name>.orig
# c) create a new soft-link from the <name> to its local location
#
# Shawn McKee <smckee@umich.edu>, April 5, 2007
#######################################################
# Inputs
file=$1
basedir=$2
local=$3
# Some manipulations to get correct locations
dest=$local"${file/$basedir/}"
destdir=`dirname $dest`
echo " "
echo " File $file"
#echo " Mkdir: mkdir -p $destdir"
mkdir -p $destdir
#echo " Copy: cp -a $file $destdir"
cp -a $file $destdir
newfile=$file.orig
#echo " Rename: mv $file $newfile"
mv $file $newfile
destfile=$destdir/`basename $file`
#echo " Soft-link: ln -s $destfile $file"
ln -s $destfile $file
To enable future installs I made a complete tar-ball of /opt/OSG060/* for deployment on a future host.
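A minimal sketch of how such a tar-ball could be produced is below. The function name `pack_local_area` and the archive name in the usage note are hypothetical; the exact command that was used is not recorded here.

```shell
#!/bin/bash
# Hypothetical helper: archive the local redirect area so it can be unpacked
# on another host. -C stores paths relative to the parent directory, so the
# archive contains OSG060/... rather than absolute paths.
#   $1 = local redirect area (e.g. /opt/OSG060), $2 = output archive
pack_local_area() {
    local area out
    area=$1
    out=$2
    tar -C "$(dirname "$area")" -czf "$out" "$(basename "$area")"
}
```

For example `pack_local_area /opt/OSG060 /root/OSG060-local.tar.gz` on this host, then `tar -C /opt -xzpf OSG060-local.tar.gz` as root on the new host. The host-specific configuration files inside the unpacked area would still need editing.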
Post-Install Configuration
After making sure that vdt-control would start all needed services we also ran gums-host-cron manually:
/afs/atlas.umich.edu/OSG060/gums/bin/gums-host-cron
Then we were ready to run the configure-osg.sh script. We answered all questions and it completed successfully.
Next we tried to get the authentication working using GUMS and Full Privilege mode. We needed to edit the gums-client.properties file to make sure we used linat02.grid.umich.edu. Next we had to edit the gums.config to allow *.aglt2.org to use this GUMS server (and then linat03 and linat04 as well). Then we could successfully generate a gridmapfile.
We then set up the gsi-authz.conf to use umfs02. When we tried to do a
globusrun -a -r gate01.aglt2.org
as "Shawn McKee" mapped to usatlas3 it failed with:
TIME: Thu Apr 5 13:32:05 2007
PID: 11178 -- Notice: 0: GATEKEEPER_ACCT_FD=4 (/afs/atlas.umich.edu/OSG060/globus/var/accounting.log)
TIME: Thu Apr 5 13:32:05 2007
PID: 11178 -- Notice: 6: Got connection 141.211.43.122 at Thu Apr 5 13:32:05 2007
GSS authentication failure
GSS Major Status: General failure
GSS Minor Status Error Chain:
accept_sec_context.c:gss_accept_sec_context:396:
Error during delegation: Delegation protocol violation
Failure: GSS failed Major:000d0000 Minor:00000001 Token:00000000
I traced the problem to not having reverse lookups working:
[gate01:bin]# nslookup gate01.aglt2.org
Server: 198.108.1.42
Address: 198.108.1.42#53
Non-authoritative answer:
Name: gate01.aglt2.org
Address: 192.41.230.11
That worked but:
[gate01:bin]# nslookup 192.41.230.11
Server: 198.108.1.42
Address: 198.108.1.42#53
** server can't find 11.230.41.192.in-addr.arpa: NXDOMAIN
This didn't.
After getting the PTR records at Merit put in things now work:
[gate02] /afs/atlas.umich.edu/home/smckee > globusrun -a -r gate01.aglt2.org
GRAM Authentication test successful
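Since a missing PTR record produced such a misleading GSS error, a quick forward/reverse check is worth running before any Globus tests. The sketch below is one way to do it. It uses `host` rather than the `nslookup` shown above, and both function names are hypothetical.

```shell
#!/bin/bash
# Print the in-addr.arpa name that a PTR lookup for an IPv4 address uses.
ptr_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Hypothetical check: forward-resolve a host name, reverse-resolve the
# address, and compare. Returns non-zero if the PTR record is missing or
# names a different host.
check_reverse_dns() {
    local name addr back
    name=$1
    addr=$(host -t A "$name" | awk '/has address/ { print $NF; exit }')
    [ -n "$addr" ] || { echo "no A record for $name"; return 1; }
    back=$(host -t PTR "$addr" | awk '/domain name pointer/ { print $NF; exit }')
    # host prints the PTR target with a trailing dot; strip it to compare
    [ "${back%.}" = "$name" ] || { echo "PTR for $addr is '${back:-missing}'"; return 1; }
    echo "$name <-> $addr OK"
}
```

`ptr_name 192.41.230.11` prints `11.230.41.192.in-addr.arpa`, the name the failing lookup above could not find. `check_reverse_dns gate01.aglt2.org` would have failed before the PTR records were added at Merit.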
Then testing as smckee failed in the next step:
[gate02] /afs/atlas.umich.edu/home/smckee > globus-job-run gate01.aglt2.org/jobmanager /usr/bin/id
WARNING: Invalid log file: "/afs/atlas.umich.edu/OSG060/globus/tmp/gram_job_state/gram_condor_log.23951.1175807310" (Permission denied)
GRAM Job failed because the job failed when the job manager attempted to run it (error code 17)
NOTE: The problem is likely that the $GLOBUS_LOCATION/tmp directory needs to also be redirected. Doing that now.
After retesting, it now works:
[gate02] /afs/atlas.umich.edu/home/smckee > globus-job-run gate01.aglt2.org/jobmanager /usr/bin/id
uid=789090(usatlas3) gid=55670(usatlas) groups=55670(usatlas)
I also checked jobmanager-fork and jobmanager-condor and both worked.
The tomcat-5 service was not starting because of another AFS issue: the $VDT_LOCATION/tomcat/v5/logs and ../temp directories must be redirected.
Also the /etc/init.d/tomcat-5 script needs to be modified to put its .lock and .pid files in v5/temp rather than v5. Next we also had to redirect the conf and work directories. Finally we found that $VDT_LOCATION/lcg/var also needed to be redirected.
Tomcat seems to start now.
Site Verify
I ran site-verify as my DN and things seem OK.
===============================================================================
Info: Site verification initiated at Thu Apr 5 22:28:18 2007 GMT.
===============================================================================
-------------------------------------------------------------------------------
------------ Begin gate01.aglt2.org at Thu Apr 5 22:28:18 2007 GMT -----------
-------------------------------------------------------------------------------
Checking prerequisites needed for testing: PASS
Checking for a valid proxy for smckee@gate01.aglt2.org: PASS
Checking if remote host is reachable: PASS
Checking for a running gatekeeper: YES; port 2119
Checking authentication: PASS
Checking 'Hello, World' application: PASS
Checking remote host uptime: PASS
18:28:24 up 20 days, 5:15, 2 users, load average: 0.00, 0.04, 0.04
Checking remote Internet network services list: PASS
Checking remote Internet servers database configuration: PASS
Checking for GLOBUS_LOCATION: /afs/atlas.umich.edu/OSG060/globus
Checking expiration date of remote host certificate: Apr 1 16:09:18 2008 GMT
Checking for gatekeeper configuration file: YES
/afs/atlas.umich.edu/OSG060/globus/etc/globus-gatekeeper.conf
Checking users in grid-mapfile, if none must be using Prima: compbiogrid,des,dosar,engage,fermilab,fmri,gadu,geant4,glow,gpn,grase,gridex,grow,gugrid,ivdgl,ligo,mariachi,mis,nanohub,nwicg,ops,osg,osgedu,sam,samgrid,sdss,star,usatlas3,usatlas4
Checking for remote globus-sh-tools-vars.sh: YES
Checking configured grid services: PASS
jobmanager,jobmanager-condor,jobmanager-fork,jobmanager-managedfork
Checking for OSG osg-attributes.conf: YES
Checking scheduler types associated with remote jobmanagers: PASS
jobmanager is of type managedfork
jobmanager-condor is of type condor
jobmanager-fork is of type managedfork
jobmanager-managedfork is of type managedfork
Checking for paths to binaries of remote schedulers: PASS
Path to condor binaries is /opt/condor/bin
Path to managedfork binaries is $env/gratia/var/data
Checking remote scheduler status: PASS
condor : 1 jobs running, 0 jobs idle/pending
Checking if Globus is deployed from the VDT: YES; version 1.6.1d
Checking for OSG version: YES; version 0.6.0
Checking for OSG grid3-user-vo-map.txt: YES
ivdgl users: ivdgl
i2u2 users: i2u2
geant4 users: geant4
grow users: grow
osgedu users: osgedu
nanohub users: nanohub
gridex users: gridex
fmri users: fmri
DOSAR users: dosar
osg users: osg
usatlas users: usatlas1,usatlas2,usatlas3,usatlas4
LIGO users: ligo
star users: star
uscms users: uscms02,uscms01
grase users: grase
glow users: glow
fermilab users: fermilab
dzero users: sam,samgrid
mis users: mis
des users: des
sdss users: sdss
gadu users: gadu
Checking for OSG site name: AGLT2
Checking for OSG $GRID3 definition: /afs/atlas.umich.edu/OSG060
Checking for OSG $OSG_GRID definition: /afs/atlas.umich.edu/OSG060
Checking for OSG $APP definition: /atlas/data08/OSG/APP
Checking for OSG $DATA definition: /atlas/data08/OSG/DATA
Checking for OSG $TMP definition: /atlas/data08/OSG/DATA
Checking for OSG $WNTMP definition: /tmp
Checking for OSG $OSG_GRID existence: PASS
Checking for OSG $APP existence: PASS
Checking for OSG $DATA existence: PASS
Checking for OSG $TMP existence: PASS
Checking for OSG $APP writability: PASS
Checking for OSG $DATA writability: PASS
Checking for OSG $TMP writability: PASS
Checking for OSG $APP available space: 227.954 GB
Checking for OSG $DATA available space: 227.954 GB
Checking for OSG $TMP available space: 227.954 GB
Checking for OSG additional site-specific variable definitions: YES
<No Location List Name>
ATLAS_APP prod /atlas/data08/OSG/APP/atlas_app
ATLAS_DATA prod /atlas/data08/OSG/DATA/atlas_data
ATLAS_DQ2Cli prod /atlas/data08/OSG/DATA/atlas_app/dq2_cli/DQ2Cli
ATLAS_LOC_1103 11.0.3 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.3
ATLAS_LOC_11042 11.0.42 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.42
ATLAS_LOC_1105 11.0.5 /atlas/data08/OSG/APP/atlas_app/atlas_rel/11.0.5
ATLAS_LOC_1201 12.0.1 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.1
ATLAS_LOC_1202 12.0.2 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.2
ATLAS_LOC_1203 12.0.3 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.3
ATLAS_LOC_1204 12.0.4 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.4
ATLAS_LOC_1205 12.0.5 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.5
ATLAS_LOC_1206 12.0.6 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.0.6
ATLAS_LOC_1230 12.3.0 /atlas/data08/OSG/APP/atlas_app/atlas_rel/12.3.0
ATLAS_LOC_GCC 3.2 /atlas/data08/OSG/APP/atlas_app/gcc32
ATLAS_LOC_GCE prod /atlas/data08/OSG/APP/atlas_app/GCE-Server/gce-server
ATLAS_LOC_KitVal prod /atlas/data08/OSG/APP/atlas_app/atlas_rel/kitval/KitValidation
ATLAS_LOC_Trfs prod /atlas/data08/OSG/APP/atlas_app/Atlas-Trfs/atlas-trfs
ATLAS_PYTHONHOME prod /atlas/data08/OSG/DATA/atlas_app/python
ATLAS_STAGE prod /atlas/data08/OSG/DATA/atlas_data
Checking for OSG execution jobmanager(s): gate01.aglt2.org/jobmanager-condor
Checking for OSG utility jobmanager(s): gate01.aglt2.org/jobmanager
Checking for OSG sponsoring VO: usatlas:80 local:20
Checking for OSG policy expression: NONE
Checking for OSG setup.sh: YES
Checking for OSG $Monalisa_HOME definition: /afs/atlas.umich.edu/OSG060/MonaLisa
Checking for MonALISA configuration: PASS
key ml_env vars:
FARM_NAME = AGLT2
FARM_HOME = /afs/atlas.umich.edu/OSG060/MonaLisa/Service/VDTFarm
FARM_CONF_FILE = /afs/atlas.umich.edu/OSG060/MonaLisa/Service/VDTFarm/vdtFarm.conf
SHOULD_UPDATE = false
URL_LIST_UPDATE = http://monalisa.cacr.caltech.edu/FARM_ML,http://monalisa.cern.ch/MONALISA/FARM_ML
key ml_properties vars:
lia.Monitor.group = OSG
lia.Monitor.useIPaddress = undef
MonaLisa.ContactEmail = smckee@umich.edu
Checking for a running MonALISA: PASS
MonALISA is ALIVE (pid 3843)
MonALISA_Version = 1.6.8-200611241031
MonALISA_VDate = 2006-11-24
VoModulesDir = VoModules-v0.36
tcpServer_Port = 9002
storeType = epgsqldb
Checking for a running GANGLIA gmond daemon: PASS (pid 32215 ...)
/opt/ganglia/sbin/gmond
name "UMOR"
owner "UM ATLAS Physics"
url "http://umopt1.grid.umich.edu/"
Checking for a running GANGLIA gmetad daemon: PASS (pid 3324 ...)
/usr/sbin/gmetad
trusted_hosts 127.0.0.1 141.211.43.112 10.10.1.1
Checking for a running gsiftp server: YES; port 2811
Checking gsiftp (local client, local host -> remote host): PASS
Checking gsiftp (local client, remote host -> local host): PASS
Checking that no differences exist between gsiftp'd files: PASS
Checking for VDS existence: PASS
Checking for VDS kickstart existence: PASS
Checking for VDS k.2 (OPTIONAL) existence: PASS
Checking for VDS dirmanager existence: PASS
Checking for VDS invoke existence: PASS
Checking for VDS transfer existence: PASS
Checking for VDS T2 (OPTIONAL) existence: PASS
Checking for VDS seqexec existence: PASS
Checking for VDS mpiexec (OPTIONAL) existence: FAIL
-------------------------------------------------------------------------------
------------- End gate01.aglt2.org at Thu Apr 5 22:33:55 2007 GMT ------------
-------------------------------------------------------------------------------
===============================================================================
Info: Site verification completed at Thu Apr 5 22:33:55 2007 GMT.
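The site-verify report is long, so a short summary of the failures helps when rerunning it. The sketch below counts the FAIL lines and separates the checks marked (OPTIONAL) from the rest. It assumes the report format shown above, and the function name `summarize_site_verify` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read site-verify output on stdin and count the checks
# that ended in FAIL, separating those marked (OPTIONAL) from required ones.
summarize_site_verify() {
    awk '
        /: FAIL/ { if ($0 ~ /\(OPTIONAL\)/) opt++; else req++ }
        END { printf "required_fail=%d optional_fail=%d\n", req, opt }
    '
}
```

Fed the report above, it would print `required_fail=0 optional_fail=1`, the one optional failure being the VDS mpiexec check.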
Next I registered with OSG.
--
ShawnMcKee - 02 Apr 2007