Backing up and moving VMs If the VM is running you need to take a snapshot and back that up, otherwise the .vmdk may not be consistent. Spaces in VM names for some back...
Management of Dcache Main services to maintain: head01: dcache core; head02: postgresql pnfs dcache core; pool nodes: dcache core dcache pool. Main configurati...
Cross Kickstarting in Rocks 4.3 So you want to cross Kickstart nodes that aren't the same architecture as your front end? Don't worry, Rocks can do that, or it's ...
Directory /home/install/tools under SVN control. Directory /home/install/tools/bin intended for adding to PATH as desired. Main.TomRockwell 29 May 2009
Procedure for Installing or Upgrading dCache Servers Procedure for installing dCache servers. This is tested for use on the dCache storage nodes / gridftp doors....
Removing PNFS (Chimera) Ghosts There is the possibility that the chimera DB can become out of sync with the actual files stored on disk. The t_dirs table holds t...
Restarting the MSU OSG Grid How to restart the system after an outage. Bring Up and Check Services Cluster Services General cluster services are required, for in...
Background Info: Unexpected Power Loss on file servers During a backup generator test on 12 May 09 at the MSU BPS bldg, most UPSs received an errant EPO (Emergenc...
Upgrade Postgres on AGLT2 dCache ADMIN node During our recent upgrade of dCache we also ended up upgrading our Postgres installation on the dCache PNFS node (head...
(Re)Configuration of gPlazma on AGLT2 Because SRM failures were traced to probable issues in gPlazma, we are planning to implement some changes to g...
AGLT2 SRM Hangs Starting in late April 2009 AGLT2 was having more and more dCache/SRM issues. One problem that significantly increased in frequency was SRM faili...
Reconfigure dCacheConfig to use both Cell and Module As noted before, the preferred gPlazma configuration uses the cell method but it has been pointed out that th...
Upgrading dCache Starting on the morning of May 4th 2009 AGLT2 began upgrading from dCache 1.8.0 15p12 to 1.9.2.5 as well as migrating to Chimera. This was motiva...
Replicating the Oracle Controlfile For safety it is good to replicate the controlfile for an Oracle DB. Our muoncal instance (after reinstalling in May 2009)...
Reinstallation of Oracle for the Muon Calibration Center On May 2nd, 2009 our primary Oracle server (umors.grid.umich.edu) was compromised because we had forgott...
How to convert a RO volume to a RW volume in AFS 1) Get tokens as 'admin' via 'kinit admin' followed by 'aklog' 2) Check the mount: 'fs lsmount /afs/atlas.umi...
Setup Atlas Space Token For background details about why space reservation is needed, refer to srm space reservation, dcache book. Steps to set up space reservation ...
We have a bunch of scripts which rely on the space token information; therefore, I define 2 hash-based subroutines in the dcache Perl library. Every time you add/r...
Generic Kickstart Want to perform a network install of a node that won't be a ROCKS client but uses the ROCKS frontend as the kickstart server. Have tried and f...
Want to do test installs of nodes in a VMware ESXi guest. Expect that more things can be made to work similarly to an install on a physical host, but expect that th...
Dcache TroubleShooting Case 1: node c 104 2 has some broken files, meaning the md5sum of the local file doesn't match the md5sum of the source file. We decided to rec...
Managing the ROCKS Installer with Subversion See local subversion pages at Subversion Creating a Branch or Tag SVN root@msurox /home/install # svn copy m "c...
From an email: In dCacheSetup (in Aglt2, it is poolSetup) you need to define: metaDataRepositoryImport=org.dcache.pool.repository.meta.file.FileMetaDataRe...
MSU Hummm... Normal procedure is to plug keyboard/monitor into node and see if there are any kernel messages on screen. On Dells also note errors from LCD. U M ...
MSU Each rack has 2 PDUs named PDU RACKNUM N.msulocal where N is 1 or 2. For racks with UPSs, the 1 PDU is on the UPS. You can connect to the web interface usin...
MSU Two switches are at msu sw1.local and msu sw2.local. To find a given node's switch port(s): Option 1: access the switch web interface and browse for the port ...
What To Do after losing the dcache partitions of a Node? Due to ROCKS rebuild failures, dcache partitions could be wiped during the rebuilding; thus we have ...
SEC.pl (Simple Event Correlator) There is a nice two-part article on SEC which describes how it works and what it provides. I encourage you to look it over. Th...
Lustre At Aglt2 Lustre Deployment MDS (Metadata Server): we have a failover pair of metadata servers, lmd01 and lmd02; both servers can access the same device (/...
Monitoring D0 Jobs Samgrid monitoring is at http://samgrid.fnal.gov:8080/ The list of recent jobs for the samgrid scheduler that is used for MSU jobs is here. Jo...
Testing That OSG Site is Functional Central tests OSG centrally tests all sites a few times a day. * List of all sites * Current result for MSU OSG * C...
Provisioning the Dell 2950/MD1000 Storage Servers Recipe for getting a new PE2950 and MD1000 combination going as an fss nfs appliance in our ROCKS cluster. Refer...
How to Vacuum Postgresql DB Sometimes, when too many delete or update operations have been applied to the postgresql database, it runs into a transactionid probl...
ROCKS Kickstart XML Style Guide and How To The kickstart XML and the accompanying scripts (extras directory) specify much of the configuration for nodes on our cl...
See https://metalink.oracle.com/metalink/plsql/f?p=130:14:3681850522787148914::::p14_database_id,p14_docid,p14_show_header,p14_show_help,p14_black_frame,p14_font:...
How to submit test jobs in panda from umt3 interactive nodes 1. Get permission to access BNL CVS. BNL CVS is a read-only mirror of CERN CVS; you can us...
How To rename or RM pnfs dir Rename Just do the rename as the 'root' user. You will get the error message "mv: cannot move `xxx' to `yyy': Stale NFS fil...
OSG Account Setup To support OSG VOs we must set up UNIX user/group accounts for the VOs. We have done this for the AGLT2 sites (UMATLAS and AGLT2_UM) and have o...
02/29/2008 Dimm 2 alerting. Cleared alert. Has not returned as of 03/03/2008. Replaced anyways. First shipment of a single DIMM did not work (wrong brand to ...
ROCKS MySQL Database ROCKS stores configuration information in a MySQL database. Normal operations on configuration are performed with the rocks command, but thi...
Procedure for rebuilding a compute node In general, compute node rebuilding is fairly easy, and ROCKS should be maintained so that compute nodes can be rebuild...
Using RCS in ROCKS and How ROCKS Uses RCS ROCKS uses RCS on files that are written or appended with the file tag in kickstart xml. This provides some possibili...
Security Planning Config Changes to Tighten Security Ideas from June 26 meeting: * Firewall changes, see SystemInstallChecklist * See below...implement...