Grid Certificate Distribution at AGLT2
The certificates in /etc/grid-security/certificates are used by the OSG authentication stack. They are a regularly updated, standard set of files that includes the details of revoked certificates. To avoid having every host do regular updates over the Internet, a central machine (gate01.aglt2.org) pulls all updates, including rpm changes, into its own directories. That directory is then written to a rw afs volume, which is released to all ro copies. All other client machines pull their copy via rsync into their own certificates directory.
Details of the Several Steps
- Every 6 hours at h:00, gate01.aglt2.org checks for a new rpm of certificates
    - /etc/cron.d/osg-ca-certs-updater from the OSG distribution
- At h:50 on gate01.aglt2.org fetch-crl is run via cron
    - /etc/cron.d/fetch-crl, set up by the OSG rpm, probably modified by AGLT2 to run at this time
- At h:14 on gate01.aglt2.org the full directory is rsync'd into /afs/.atlas.umich.edu/OSG_certificates, the rw volume of this set
    - /etc/cron.d/rsync-certificates-into-afs, which runs /root/tools/rsync-certificates-into-afs.sh
- At h:41 on linat06, the home of the OSG_certificates rw volume, the rw volume is released to the ro copies
    - /etc/cron.d/release_Certificates
    - Keeps an accumulating log in /var/log/afs_release_Certificates.log
- In the interval h:25 to h:40, all machines except the dCache machines rsync their certificates directory out of afs
    - /etc/cron.d/rsync-certificates.cron, which runs /root/tools/rsync-certificates.sh
- At h:20 the dCache machines rsync their certificates directory out of afs
    - Same cron task, same tools file, just a different time
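The copy and release steps in the list above might look roughly like the following sketch. The actual scripts under /root/tools are not reproduced on this page, so the rsync options, the read-only path /afs/atlas.umich.edu/OSG_certificates (assumed from the usual AFS convention that the dotted path is the read-write one), the volume name passed to `vos release`, and the `-localauth` flag are all assumptions. The function names and the `DRY_RUN` switch are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of the three copy steps; not the actual AGLT2 scripts.
# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

CERT_DIR=/etc/grid-security/certificates
AFS_RW=/afs/.atlas.umich.edu/OSG_certificates   # rw path, from the list above
AFS_RO=/afs/atlas.umich.edu/OSG_certificates    # assumed ro path

# gate01: copy the local certificates directory into the rw volume.
push_to_afs() { run rsync -a --delete "$CERT_DIR/" "$AFS_RW/"; }

# linat06: release the rw volume to its ro copies (volume name assumed).
release_volume() { run vos release OSG_certificates -localauth; }

# clients: pull the released copy into the local certificates directory.
pull_from_afs() { run rsync -a --delete "$AFS_RO/" "$CERT_DIR/"; }
```

Running `DRY_RUN=1 pull_from_afs` prints the rsync command without touching the local directory, which is a cheap way to check the paths before putting anything in cron.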
This cycle could probably be shortened by moving the release of the rw volume on linat06 to h:15 and the gate01 copy into afs to h:10. With that change, a full certificate distribution would complete in about 50 minutes.
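The arithmetic behind that estimate can be checked with a small sketch. It assumes each step runs once per hour at the stated minute and measures from the fetch-crl run at h:50 to the last client rsync at h:40. The function names are hypothetical.

```shell
#!/bin/bash
# Minutes from minute-of-hour $1 forward to the next occurrence of minute $2.
# Arguments are plain integers 0-59 (no leading zeros).
wait_minutes() {
    local from=$1 to=$2
    echo $(( (to - from + 60) % 60 ))
}

# Total minutes for a chain of hourly steps given as minute-of-hour marks.
chain_minutes() {
    local total=0 prev=$1 m
    shift
    for m in "$@"; do
        total=$(( total + $(wait_minutes "$prev" "$m") ))
        prev=$m
    done
    echo "$total"
}

# Current schedule: fetch-crl h:50, copy into afs h:14, release h:41,
# last client rsync h:40.
chain_minutes 50 14 41 40   # prints 110

# Proposed schedule: copy into afs h:10, release h:15.
chain_minutes 50 10 15 40   # prints 50
```

Under these assumptions the current schedule takes up to 110 minutes from CRL fetch to the last client, and the proposed one takes 50.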
Request and Update host certificates
To request a host certificate, first use the OSG tools to generate the certificate request file, then submit that file through either the UM WasUP page or the MSU service portal.
Generate the host request file
There is a script to generate the request file:
/atlas/data08/manage/cluster/hostcert_request/hostcert_req.sh hostname [alternative_hostname]
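#!/bin/bash
# Generate a host certificate request with osg-cert-request.
# Usage: hostcert_req.sh hostname [alternative_hostname ...]
if [ $# -eq 0 ]; then
    echo "Usage: $0 hostname [alternative_hostname ...]"
    exit 1
fi
hn=$1
shift
# Collect one --altname option per additional hostname.
ext=()
for alt_hn in "$@"; do
    ext+=(--altname "$alt_hn")
done
# Passing an array avoids eval and keeps the quoted multi-word values intact.
osg-cert-request --hostname "$hn" --country US --state Michigan \
    --locality 'Ann Arbor' --organization 'University of Michigan' "${ext[@]}"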
Note: The script must be run on gate01.aglt2.org, where the OSG tools are installed. Details of the OSG tools are available in the OSG documentation.
The output is 2 files. For example, for the host aglbatch.aglt2.org, the output files are aglbatch.aglt2.org-key.pem and aglbatch.aglt2.org-key.req. Copy the content of the aglbatch.aglt2.org-key.req file and use it on the UM WasUP page to request the new host certificate.
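Before pasting the request into the form, it can help to confirm that it carries the expected hostname and alternative names. The sketch below uses standard `openssl req` options. The function name `show_request` and the overridable `OPENSSL` variable are hypothetical conveniences, not part of the AGLT2 tooling.

```shell
#!/bin/bash
# Print the subject and any requested alternative names from a request file.
# OPENSSL may be overridden to point at a specific openssl binary.
OPENSSL=${OPENSSL:-openssl}

show_request() {
    local req=$1
    # -verify checks the request's self-signature; -text dumps its contents.
    "$OPENSSL" req -in "$req" -noout -verify -text |
        grep -E 'Subject:|DNS:'
}

# Example (file name taken from the text above):
#   show_request aglbatch.aglt2.org-key.req
```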
Request the host cert (InCommon certificate issued by IGTF server)
From the UM WasUP page, fill in the forms and submit the request. Every year they require validation of the domain name (aglt2.org). For that, they send us a CNAME entry, and we publish it to Merit. An alternative option is to publish a text file on our www.aglt2.org server:
http://aglt2.org/.well-known/pki-validation/328504995F0D2A2D58FE5D271D3E5594.txt
When requesting the host cert, state in the comment area that the host cert needs to be issued by the InCommon IGTF Server CA and that the maximum period of validity is 395 days. Otherwise the default issuer is InCommon RSA Server CA.
Once the host cert is approved, the requester gets an email. Go back to the UM WasUP page and, in the right panel of the page, click on the "settings" of the host name you requested; it will display your request and the content of the host certificate file.
For RSA certs via MSU, use the Self Service Portal. Log in with your msunet id (without .msu.edu) and password. Select type Apache/ModSSL and SSL Certificate. After submitting, you will get 2 emails about opening and closing the ticket, 1 about the submission to InCommon, 1 from InCommon awaiting approval, 1 when it is approved, and 1 with links to the certificate files. We use the link for the "Certificate only, PEM encoded" file.
For IGTF certs via MSU, contact Ryan Lewis <lewisry2@msu.edu>, and offer to email a single text file listing each machine name with its matching certificate request, one set after the other. He has been happy to ingest them that way. You will then get the similar 3 emails from InCommon for each cert.
Download the host cert and place it on the central location
Copy the content of the host certificate file and paste it into a new file named
/atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org.pem
Verify the host cert:
openssl x509 -noout -enddate -subject -in aglbatch.aglt2.org.pem
Copy both files to the central location for host certificates (aglbatch.aglt2.org:/root/hostcert):
/atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org.pem
/atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org-key.pem
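Before copying the pair into the central location, one might also confirm that the new certificate actually belongs to the key generated earlier and that it is not about to expire. The sketch below uses standard `openssl x509` and `openssl pkey` options. The function names and the overridable `OPENSSL` variable are hypothetical, and the key is assumed to be unencrypted, since an encrypted key would prompt for its passphrase.

```shell
#!/bin/bash
# Pre-flight checks for a new host certificate; a sketch, not an AGLT2 tool.
OPENSSL=${OPENSSL:-openssl}

# Succeeds when the certificate and private key carry the same public key.
cert_matches_key() {
    local cert=$1 key=$2 from_cert from_key
    from_cert=$("$OPENSSL" x509 -in "$cert" -noout -pubkey) || return 2
    from_key=$("$OPENSSL" pkey -in "$key" -pubout) || return 2
    [ "$from_cert" = "$from_key" ]
}

days_to_seconds() { echo $(( $1 * 86400 )); }

# Succeeds when the certificate is still valid DAYS days from now.
cert_valid_for_days() {
    local cert=$1 days=$2
    "$OPENSSL" x509 -in "$cert" -noout \
        -checkend "$(days_to_seconds "$days")" >/dev/null
}

# Example, run in /atlas/data08/manage/cluster/hostcert_request:
#   cert_matches_key aglbatch.aglt2.org.pem aglbatch.aglt2.org-key.pem \
#       && cert_valid_for_days aglbatch.aglt2.org.pem 30 \
#       && echo "pair matches and is valid for at least 30 days"
```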
Update host certificate via cfengine
All host certificates and the user certificates for services are stored centrally in aglbatch:/root/hostcert. Any update to these certificates should be placed there first and then distributed to the cfengine servers.
1) On aglbatch, create a tarball of the /root/hostcert directory
cd /root; tar czvf hostcert.tar.gz hostcert
2) Copy aglbatch:/root/hostcert.tar.gz to the cfengine servers (umcfe and msucfe)
3) Repeat the following steps on umcfe and msucfe
cd /var/cfengine/policy/T2/stash
tar xzvf /root/hostcert.tar.gz -C .
4) Update the host cert on the host
cf-agent -Kf failsafe.cf;cf-agent -K -b hostcert
5) Verify that the host cert is updated
[root@aglbatch grid-security]# openssl x509 -in hostcert.pem -noout -subject -enddate
subject= /C=US/postalCode=48109/ST=MI/L=Ann Arbor/street=530 S. State St./O=University of Michigan/OU=Information Technology Services/CN=aglbatch.aglt2.org
notAfter=Apr 30 23:59:59 2021 GMT
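Steps 1 to 3 above could be wrapped in a single script run on aglbatch, as in the sketch below. It assumes root ssh and scp access from aglbatch to the cfengine servers, which this page does not state. The `run` helper, the `DRY_RUN` switch, and the function name are hypothetical.

```shell
#!/bin/bash
# Sketch of steps 1-3, run on aglbatch. Assumes root ssh/scp access to the
# cfengine servers. Set DRY_RUN=1 to print the commands without running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

CFE_SERVERS="umcfe msucfe"
STASH=/var/cfengine/policy/T2/stash

distribute_hostcerts() {
    local server
    # Step 1: tar up /root/hostcert (member paths stay relative to /root).
    run tar czf /root/hostcert.tar.gz -C /root hostcert || return 1
    for server in $CFE_SERVERS; do
        # Step 2: copy the tarball to the cfengine server.
        run scp /root/hostcert.tar.gz "$server:/root/" || return 1
        # Step 3: unpack it into the policy stash.
        run ssh "$server" tar xzf /root/hostcert.tar.gz -C "$STASH" || return 1
    done
}
```

Steps 4 and 5 still run on each target host, so they are left out of the sketch.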
User certificate for services
A valid user certificate with the ATLAS VO production role is used for various services. Currently we use Wenjing's user certificate, which is placed in aglbatch:/root/hostcert.
The user cert and key pair are renamed as:
xrootd_usercert.pem
xrootd_userkey.pem
Also, xrootd_scrt stores the password of the private key.
They are distributed to the servers as described above.
The servers that use this user certificate/key pair include:
head02:/root/.globus
gate02:/var/rsv/.globus
dcdum01:/var/lib/dcache/.globus
dcdmsu:/var/lib/dcache/.globus
Note: Please update the user certificate in the above places whenever it is renewed.
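After a renewal, a quick way to confirm that every location picked up the new certificate is to print its expiry date on each server. The sketch below is hypothetical: the file name usercert.pem inside each .globus directory is an assumption (this page gives only the directories), as is root ssh access to the listed hosts.

```shell
#!/bin/bash
# Sketch: report the expiry date of the service user certificate at each
# location listed above. The file name usercert.pem is an assumption;
# adjust it to what is actually deployed.
TARGETS="head02:/root/.globus gate02:/var/rsv/.globus dcdum01:/var/lib/dcache/.globus dcdmsu:/var/lib/dcache/.globus"

# Split a host:directory pair into its two parts.
target_host() { echo "${1%%:*}"; }
target_dir()  { echo "${1#*:}"; }

check_user_certs() {
    local t
    for t in $TARGETS; do
        echo "== $(target_host "$t")"
        ssh "$(target_host "$t")" openssl x509 -noout -enddate \
            -in "$(target_dir "$t")/usercert.pem"
    done
}
```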
--
BobBall - 04 Oct 2018