OSG Setup at MSU for DZero VO
Will run an OSG CE and SE at MSU to support the DZero VO, with processing on 20 8-way Dells and 37 4-way Opteron nodes (about 300 cores).
The CE node is named msu-osg.aglt2.org. It is currently installed in VMware; depending on performance (network performance seems suspect), it may need to be migrated to a native install. msu3.aglt2.org, a dual-dual Dell with a 4x 750GB RAID array (about 2TB available), will be used for the SE and for shared installation of OSG and VO users' home areas.
Will use the Condor batch system.
Certs
Put hostkey.pem and hostcert.pem into /etc/grid-security on msu-osg and msu3.
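Globus services generally refuse a host key that other users can read, so the copy is worth pairing with explicit permissions. A minimal sketch, assuming both files sit in one source directory; install_host_creds is a hypothetical helper, and the 644/400 modes follow the usual Globus convention rather than anything recorded on this page:

```shell
# install_host_creds SRC_DIR [DEST_DIR]
# Hypothetical helper: copy the host credentials and set conventional modes.
install_host_creds() (
    src=$1
    dest=${2:-/etc/grid-security}
    mkdir -p "$dest" || exit 1
    cp "$src/hostcert.pem" "$src/hostkey.pem" "$dest/" || exit 1
    chmod 644 "$dest/hostcert.pem"   # certificate is public
    chmod 400 "$dest/hostkey.pem"    # private key: owner read only
)
```

Run as root on both msu-osg and msu3, for example: install_host_creds /root/certs (the source path here is only an illustration).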
Condor
Cluster has the condor-6.9.4-1 rpm built at Wisc installed. All files from this rpm go into /opt/condor-6.9.4. We also make a link /opt/condor that points to the current install. msu-osg has this setup.
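The "current install" link can be repointed in one step when a new version is unpacked. A sketch only; point_current_condor is a hypothetical helper, and the -n flag (do not follow an existing link to a directory) is assumed to be available, as it is in GNU and BSD ln:

```shell
# point_current_condor VERSION_DIR LINK
# Hypothetical helper: make LINK a symlink to VERSION_DIR, replacing any old link.
point_current_condor() {
    # -s symbolic link, -f replace an existing link,
    # -n treat an existing link to a directory as a file instead of descending into it
    ln -sfn "$1" "$2"
}
```

For the layout described here the call would be: point_current_condor /opt/condor-6.9.4 /opt/condor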
The main config file is /opt/condor/etc/condor_config. It has a directive that specifies a machine-specific config file:
LOCAL_CONFIG_FILE = /opt/condor/etc/condor_config.local
passwd in yp specifies condor's home as /afs/atlas.umich.edu/opt/condor, but this seems to be outdated...
The DZero nodes and msu-osg have:
COLLECTOR_NAME = MSU_D0 at msu-osg
Note: Java directives are broken. Our config file doesn't have some of the newer quill and vm related directives in it.
Disk Areas
Will make a /msu directory and put indirect automount maps in there.
Will start out with an NFS/Autofs setup for everything.
/msu/opt will have osg (vdt) and condor. These will be on msu-osg.
/msu/osg will have app, data, tmp, wn. These will be on msu3.
/msu/data will have area(s) for samgrid. This is on msu3.
Might add an /msu/home map. (How does condor use /home/condor?)
This needs to be integrated with 411. auto.master is in 411, so new maps can't just be pushed manually to compute nodes. Need to see if it is safe to push the msu maps to all nodes (at MSU): what happens if /msu doesn't exist? Otherwise, a more complex setup using the groups feature in 411 could be used.
Answer: the behavior of autofs is fine. /msu is automatically created as a normal directory if it doesn't exist, so this will be fine on non-msu nodes.
On msu-osg
On msu-osg, vdt and condor are installed in the area /data. They are automounted to /msu/opt.
I'm unsure what is needed from the CE head node to make the SE work.
I'm unsure whether a shared /home/condor area is needed. This could go under /msu.
On msu3
Have the node msu3 set up to act as the data server (SE). It is configured with 0.5TB and 1.5TB partitions exported as /exports/misc and /exports/data.
root@msu3 ~# cat /etc/auto.master
# $411id: /etc/auto/master$
# Retrieved: 24-Jan-2008 12:42
# Master server: 10.10.2.15
# Last modified on master: 12-Dec-2007 00:27
# Encrypted file size: 425 bytes
#
# Owner: 0.0
# Name: /etc/auto.master
# Mode: 0100444
/home /etc/auto.home --timeout=1200
/atlas /etc/auto.atlas --timeout=1200
/msu/data /etc/auto.msu-data
root@msu3 ~# cat /etc/auto.msu-data
dzero msu3.local:/exports/data/dzero-osg
This produces the directory /msu/data/dzero.
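In an indirect map, the key of each entry is appended to the mount point named in auto.master, which is how the dzero entry becomes /msu/data/dzero. A small sketch of that rule; resolve_indirect_map is a hypothetical helper written for illustration, not part of autofs:

```shell
# resolve_indirect_map MOUNTPOINT MAPFILE
# Print "<mountpoint>/<key> -> <location>" for each entry of an indirect map.
resolve_indirect_map() {
    awk -v mp="$1" '
        /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
        { print mp "/" $1 " -> " $NF }     # key is the first field, location the last
    ' "$2"
}
```

With the map shown above, resolve_indirect_map /msu/data /etc/auto.msu-data prints /msu/data/dzero -> msu3.local:/exports/data/dzero-osg.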
Adding to 411
Note! msu-osg is not in 411 and the autofs tables are manually installed there.
Make the tables auto.msu-opt, auto.msu-osg and auto.msu-data, and also add them to auto.master on msurox. To add to 411, do
cd /var/411; make clean; make
To activate on a compute node, do
411get --all; service autofs reload
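Adding the three maps to auto.master can be done so that re-running it is harmless. A sketch under stated assumptions: add_msu_maps is a hypothetical helper, and the /msu/opt and /msu/osg lines are assumed to follow the same pattern as the /msu/data line seen in auto.master on msu3:

```shell
# add_msu_maps MASTER_FILE
# Hypothetical helper: append the msu map lines to an auto.master file,
# skipping any line that is already present.
add_msu_maps() (
    master=$1
    for area in opt osg data; do
        line="/msu/$area /etc/auto.msu-$area"
        # -q quiet, -x match the whole line, -F fixed string (no regex)
        grep -qxF "$line" "$master" || printf '%s\n' "$line" >> "$master"
    done
)
```

On msurox this would be followed by the 411 rebuild shown above, so the updated auto.master reaches the compute nodes.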
Site Config Table
Compare to setup of AGLT2 given here
http://vors.grid.iu.edu/cgi-bin/index.cgi?region=1&VO=6&grid=1&dtype=0&res=305#AGLT2
| Item | Value AGLT2 | Value MSU-OSG |
| Site Name | AGLT2 | MSU-OSG |
| Gatekeeper Address | gate01.aglt2.org | msu-osg.aglt2.org |
| Gatekeeper Port | 2119 | auto |
| Globus Location | /opt/OSG080/globus | /msu/opt/osg/globus |
| Host Cert Expiration Date | Apr 1 16:09:18 2008 GMT | auto |
| Gatekeeper Config Location | /opt/OSG080/globus/etc/globus-gatekeeper.conf | /msu/opt/osg/globus/etc/globus-gatekeeper.conf |
| GSIFTP Port | 2811 | auto |
| Path to Condor Binaries | /opt/condor/bin | /msu/opt/condor/bin |
| VDT Version | 1.8.1g | auto |
| VDT Location | /opt/OSG080 | /msu/opt/osg |
| $APP Location | /atlas/data08/OSG/APP | /msu/osg/app |
| $DATA Location | /atlas/data08/OSG/DATA | /msu/osg/data |
| $TMP Location | /atlas/data08/OSG/DATA | /msu/osg/tmp |
| $WNTMP Location | /tmp | /scratch |
| $OSG_GRID Location | /afs/atlas.umich.edu/OSGWN | same |
| $APP Space Available | 3538.807 GB | TBD |
| $DATA Space Available | 3538.807 GB | TBD |
| $TMP Space Available | 3538.807 GB | TBD |
| Execution Job Manager | gate01.aglt2.org/jobmanager-condor | msu-osg.aglt2.org/jobmanager-condor |
| Utility Job Manager | gate01.aglt2.org/jobmanager | msu-osg.aglt2.org/jobmanager |
| Sponsoring VO | usatlas:80 local:20 | dzero:100 |
| Policy | NONE | https://hep.pa.msu.edu/twiki/bin/view/AGLT2/MsuOsgPolicy |
Install
Pacman
Unpack the tarball and copy the contents to /msu/opt/pacman. cd there and source setup.sh (this step writes the setup.sh script).
OSG - VDT
Tell installer that condor is external:
export VDTSETUP_CONDOR_LOCATION=/opt/condor-6.9.4/
[root@msu-osg osg]#
[root@msu-osg osg]#
[root@msu-osg osg]# time pacman -get OSG:ce
Do you want to add [http://software.grid.iu.edu/pacman/] to [trusted.caches]? (y or n): y
Package [ce] found in [OSG]...
Package [OSG:osg-auto-0.8.0] found in [OSG]...
Package [OSG:vo-package] found in [OSG]...
Package [osg-auto-0.8.0] found in [OSG]...
Do you want to add [http://vdt.cs.wisc.edu/vdt_181_cache] to [trusted.caches]? (y or n): y
Package [OSG-CE] found in [http://vdt.cs.wisc.edu/vdt_181_cache]...
Package [http://vdt.cs.wisc.edu/vdt_181_cache:VDT-Common] found in [http://vdt.cs.wisc.edu/vdt_181_cache]...
Package [http://vdt.cs.wisc.edu/vdt_181_cache:VDT-Core] found in [http://vdt.cs.wisc.edu/vdt_181_cache]...
.
.
.
Merge it manually with the new /msu/opt/osg/edg/etc/edg-mkgridmap.conf if you had a special edg-mkgridmap.conf
Pacman Installation of OSG-0.8.0 Complete
real 18m50.411s
user 6m2.381s
sys 5m8.925s
Now the Globus-Condor-Setup... This seems to pull a full condor install down and put it at /msu/opt/osg/condor-devel/, even though instructions say it won't...
cd $VDT_LOCATION
source setup.sh
export VDTSETUP_CONDOR_LOCATION=/opt/condor-6.9.4
pacman -get OSG:Globus-Condor-Setup
Configure
Run $VDT_LOCATION/monitoring/configure-osg.sh
Unresolved issues: it wants to know hardware details of the compute nodes (need to tell it about Xeon and Opteron nodes separately?). It also wants to know about the Storage Element, which isn't set up yet.
Authorization
Could use GUMS host at U-M, but will try simple local auth (grid-mapfile) for now (should be fine for SAMGRID).
Edited $VDT_LOCATION/edg/etc/edg-mkgridmap.conf to only allow dzero and osg VOs.
[root@msu-osg osg]# cd /msu/opt/osg/edg/sbin/
[root@msu-osg sbin]# ./edg-mkgridmap --output=test.out
WARNING: Could not locate /msu/opt/osg/monitoring/gip-attributes.conf.
[root@msu-osg sbin]# grep -i rockwell test.out
"/DC=gov/DC=fnal/O=Fermilab/OU=People/CN=Thomas D. Rockwell/USERID=rockwell" samgrid
"/DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 123787" samgrid
"/DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 611410" samgrid
OK, now run the command with no options to create the grid-mapfile in /etc/grid-security.
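Once the grid-mapfile exists, the grep check above can be turned into a lookup of the local account for one DN. A sketch only; mapped_account is a hypothetical helper, and it assumes the simple one-DN-per-line format seen in test.out (quoted DN followed by the account name):

```shell
# mapped_account DN [GRIDMAP]
# Hypothetical helper: print the local account a DN maps to, or nothing if unmapped.
mapped_account() {
    # Split each line on double quotes: field 2 is the DN, field 3 is " account".
    # Note: awk -v interprets backslash escapes, so this assumes the DN has none.
    awk -F'"' -v dn="$1" '
        $2 == dn { sub(/^[ \t]+/, "", $3); print $3 }
    ' "${2:-/etc/grid-security/grid-mapfile}"
}
```

For example, mapped_account "/DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 611410" should print samgrid if the generated file matches test.out.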
sudo for ws-gram
Added config to /etc/sudoers as given in $VDT_LOCATION/post-install/README
Validation of Install
Start services
[root@msu-osg ~]# vdt-control --list
Service | Type | Desired State
-------------------+--------+--------------
fetch-crl | cron | enable
vdt-rotate-logs | cron | enable
gris | init | do not enable
globus-gatekeeper | inetd | enable
gsiftp | inetd | enable
mysql | init | enable
globus-ws | init | enable
edg-mkgridmap | cron | do not enable
gums-host-cron | cron | do not enable
MLD | init | do not enable
vdt-update-certs | cron | do not enable
condor-devel | init | enable
apache | init | enable
osg-rsv | init | do not enable
tomcat-5 | init | enable
syslog-ng | init | do not enable
gratia-condor | cron | enable
[root@msu-osg ~]# vdt-control --on
enabling cron service fetch-crl... no crontab for root
ok
enabling cron service vdt-rotate-logs... ok
skipping init service 'gris' -- marked as disabled
enabling inetd service globus-gatekeeper... ok
enabling inetd service gsiftp... ok
enabling init service mysql... ok
enabling init service globus-ws... ok
skipping cron service 'edg-mkgridmap' -- marked as disabled
skipping cron service 'gums-host-cron' -- marked as disabled
skipping init service 'MLD' -- marked as disabled
skipping cron service 'vdt-update-certs' -- marked as disabled
enabling init service condor-devel... ok
enabling init service apache... FAILED! (see vdt-install.log)
skipping init service 'osg-rsv' -- marked as disabled
enabling init service tomcat-5... ok
skipping init service 'syslog-ng' -- marked as disabled
enabling cron service gratia-condor... ok
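A failure is easy to miss in this output (apache did not start above). A sketch for pulling out just the failed services, assuming the line format shown here stays the same; failed_services is a hypothetical filter, not a vdt-control option:

```shell
# failed_services
# Hypothetical filter: read "vdt-control --on" output on stdin and print
# the name of every service whose enable step reported FAILED.
failed_services() {
    # Matches lines like: enabling init service apache... FAILED! (see vdt-install.log)
    sed -n 's/^enabling [a-z]* service \(.*\)\.\.\. FAILED.*$/\1/p'
}
```

Usage would be along the lines of: vdt-control --on 2>&1 | failed_services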
As a user that is allowed via grid-mapfile, get a proxy and then run a simple test:
[rockwell@msu-osg ~]$ grid-proxy-info
subject : /DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 611410/CN=1238583866
issuer : /DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 611410
identity : /DC=org/DC=doegrids/OU=People/CN=Thomas D. Rockwell 611410
type : Proxy draft (pre-RFC) compliant impersonation proxy
strength : 512 bits
path : /tmp/x509up_u901
timeleft : 355:36:29 (14.8 days)
[rockwell@msu-osg ~]$ time globus-job-run msu-osg.aglt2.org/jobmanager-fork /usr/bin/id
uid=825664(samgrid) gid=55673(dzero) groups=55673(dzero)
real 0m3.674s
user 0m0.205s
sys 0m0.325s
Another test; here is the environment:
[rockwell@msu-osg ~]$ time globus-job-run msu-osg.aglt2.org/jobmanager-fork /usr/bin/printenv
HOME=/msu/osg/home/samgrid
OSG_DATA=/msu/osg/data
LD_LIBRARY_PATH=/msu/opt/osg/apache/lib:/msu/opt/osg/MonaLisa/Service/VDTFarm/pgsql/lib:/msu/opt/osg/glite/lib:/msu/opt/osg/prima/lib:/msu/opt/osg/mysql/lib/mysql:/msu/opt/osg/jdk1.5/jre/lib/i386:/msu/opt/osg/jdk1.5/jre/lib/i386/server:/msu/opt/osg/jdk1.5/jre/lib/i386/client:/msu/opt/osg/berkeley-db/lib:/msu/opt/osg/expat/lib:/msu/opt/osg/globus/lib:/msu/opt/osg/apache/lib:/msu/opt/osg/MonaLisa/Service/VDTFarm/pgsql/lib:/msu/opt/osg/glite/lib:/msu/opt/osg/prima/lib:/msu/opt/osg/jdk1.5/jre/lib/i386:/msu/opt/osg/jdk1.5/jre/lib/i386/server:/msu/opt/osg/jdk1.5/jre/lib/i386/client:/msu/opt/osg/mysql/lib/mysql:/msu/opt/osg/berkeley-db/lib:/msu/opt/osg/expat/lib::/msu/opt/osg/apache/lib:/msu/opt/osg/MonaLisa/Service/VDTFarm/pgsql/lib:/msu/opt/osg/glite/lib:/msu/opt/osg/prima/lib:/msu/opt/osg/mysql/lib/mysql:/msu/opt/osg/jdk1.5/jre/lib/i386:/msu/opt/osg/jdk1.5/jre/lib/i386/server:/msu/opt/osg/jdk1.5/jre/lib/i386/client:/msu/opt/osg/berkeley-db/lib:/msu/opt/osg/expat/lib:/msu/opt/osg/globus/lib:/msu/opt/osg/apache/lib:/msu/opt/osg/MonaLisa/Service/VDTFarm/pgsql/lib:/msu/opt/osg/glite/lib:/msu/opt/osg/prima/lib:/msu/opt/osg/mysql/lib/mysql:/msu/opt/osg/jdk1.5/jre/lib/i386:/msu/opt/osg/jdk1.5/jre/lib/i386/server:/msu/opt/osg/jdk1.5/jre/lib/i386/client:/msu/opt/osg/berkeley-db/lib:/msu/opt/osg/expat/lib:/msu/opt/osg/apache/lib:/msu/opt/osg/MonaLisa/Service/VDTFarm/pgsql/lib:/msu/opt/osg/glite/lib:/msu/opt/osg/prima/lib:/msu/opt/osg/jdk1.5/jre/lib/i386:/msu/opt/osg/jdk1.5/jre/lib/i386/server:/msu/opt/osg/jdk1.5/jre/lib/i386/client:/msu/opt/osg/mysql/lib/mysql:/msu/opt/osg/berkeley-db/lib:/msu/opt/osg/expat/lib::/msu/opt/osg/globus/lib
GRID3_TMP_DIR=/msu/osg/data
GRID3_TMP_WN_DIR=/scratch
OSG_GLEXEC_LOCATION=UNAVAILABLE
OSG_LOCATION=/msu/opt/osg
OSG_HOSTNAME=msu-osg.aglt2.org
OSG_STORAGE_ELEMENT=n
OSG_APP=/msu/osg/app
OSG_JOB_CONTACT=msu-osg.aglt2.org/jobmanager-condor
GRID3_SITE_NAME=MSU-OSG
GRID3_BASE_DIR=/msu/opt/osg
GRID3_DATA_DIR=/msu/osg/data
OSG_DEFAULT_SE=UNAVAILABLE
OSG_GRID=/msu/opt/osgwn
OSG_SQUID_LOCATION=UNAVAILABLE
LOGNAME=samgrid
GLOBUS_LOCATION=/msu/opt/osg/globus
GLOBUS_GRAM_JOB_CONTACT=https://msu-osg.aglt2.org:33559/16448/1201665486/
OSG_SITE_NAME=MSU-OSG
OSG_SITE_WRITE=UNAVAILABLE
GLOBUS_GRAM_MYJOB_CONTACT=URLx-nexus://msu-osg.aglt2.org:33560/
PATH=/msu/opt/osg/apache/bin:/msu/opt/osg/srm-v2-client/bin:/msu/opt/osg/srm-v1-client/sbin:/msu/opt/osg/srm-v1-client/bin:/msu/opt/osg/wget/bin:/msu/opt/osg/gums/scripts:/msu/opt/osg/cert-scripts/bin:/msu/opt/osg/glite/sbin:/msu/opt/osg/glite/bin:/msu/opt/osg/edg/sbin:/msu/opt/osg/prima/bin:/msu/opt/osg/mysql/bin:/msu/opt/osg/logrotate/sbin:/msu/opt/osg/ant/bin:/msu/opt/osg/jdk1.5/bin:/opt/condor-6.9.4//sbin:/opt/condor-6.9.4//bin:/msu/opt/osg/gpt/sbin:/msu/opt/osg/globus/bin:/msu/opt/osg/globus/sbin:/msu/opt/pacman/bin:/msu/opt/osg/vdt/sbin:/msu/opt/osg/vdt/bin:/msu/opt/osg/apache/bin:/msu/opt/osg/srm-v2-client/bin:/msu/opt/osg/srm-v1-client/sbin:/msu/opt/osg/srm-v1-client/bin:/msu/opt/osg/wget/bin:/msu/opt/osg/gums/scripts:/msu/opt/osg/cert-scripts/bin:/msu/opt/osg/glite/sbin:/msu/opt/osg/glite/bin:/msu/opt/osg/edg/sbin:/msu/opt/osg/prima/bin:/msu/opt/osg/mysql/bin:/msu/opt/osg/logrotate/sbin:/msu/opt/osg/ant/bin:/msu/opt/osg/jdk1.5/bin:/opt/condor-6.9.4//sbin:/opt/condor-6.9.4//bin:/msu/opt/osg/gpt/sbin:/msu/opt/pacman/bin:/msu/opt/osg/vdt/sbin:/msu/opt/osg/vdt/bin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin
PERL5LIB=/msu/opt/osg/vdt/lib:/msu/opt/osg/perl/lib/5.8.0:/msu/opt/osg/perl/lib/5.8.0/x86_64-linux-thread-multi:/msu/opt/osg/perl/lib/site_perl/5.8.0:/msu/opt/osg/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/msu/opt/osg/vdt/lib:/msu/opt/osg/perl/lib/5.8.0:/msu/opt/osg/perl/lib/5.8.0/x86_64-linux-thread-multi:/msu/opt/osg/perl/lib/site_perl/5.8.0:/msu/opt/osg/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:/msu/opt/osg/vdt/lib:/msu/opt/osg/perl/lib/5.8.0:/msu/opt/osg/perl/lib/5.8.0/x86_64-linux-thread-multi:/msu/opt/osg/perl/lib/site_perl/5.8.0:/msu/opt/osg/perl/lib/site_perl/5.8.0/x86_64-linux-thread-multi:
GRID3_APP_DIR=/msu/osg/app
OSG_WN_TMP=/scratch
OSG_SITE_READ=UNAVAILABLE
X509_USER_PROXY=/msu/osg/home/samgrid/.globus/job/msu-osg.aglt2.org/16448.1201665486/x509_up
real 0m3.616s
user 0m0.202s
sys 0m0.352s
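Several OSG_* variables in this environment still read UNAVAILABLE (the SE and related services are not configured yet). A sketch for listing them from printenv output; unavailable_osg_vars is a hypothetical filter, and it assumes each variable occupies exactly one line:

```shell
# unavailable_osg_vars
# Hypothetical filter: read printenv output on stdin and print the names of
# OSG_* variables whose value is exactly UNAVAILABLE.
unavailable_osg_vars() {
    sed -n 's/^\(OSG_[A-Z0-9_]*\)=UNAVAILABLE$/\1/p'
}
```

Usage would be along the lines of: globus-job-run msu-osg.aglt2.org/jobmanager-fork /usr/bin/printenv | unavailable_osg_vars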
-- TomRockwell - 23 Jan 2008