Grid Certificate Distribution at AGLT2

The certificates in /etc/grid-security/certificates are used by the OSG authentication stack. This is a regularly updated, standard set of files, and it includes the details of revoked certificates. To avoid having every host do regular updates over the Internet, a central machine (gate01.aglt2.org) pulls all updates, including rpm changes, into its own directories. That directory is in turn written to a rw afs volume, which is then released to all ro copies. All other client machines then pull their copy of the directory via rsync into their own certificates directory.
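
As a rough illustration of the last hop, a client-side pull could look like the sketch below. The function name sync_certificates and the default read-only source path /afs/atlas.umich.edu/OSG_certificates/ are assumptions made for this example (this page only names the rw path), and the script actually deployed on the clients may differ.

```shell
#!/bin/bash
# Sketch only: mirror the certificates directory from afs to the local host.
# The default source is an ASSUMED read-only path; adjust it to the real one.
sync_certificates() {
    local src="${1:-/afs/atlas.umich.edu/OSG_certificates/}"
    local dst="${2:-/etc/grid-security/certificates/}"
    # -a preserves permissions and times; --delete removes local files that
    # no longer exist upstream. Trailing slashes copy directory contents.
    rsync -a --delete "$src" "$dst"
}
```

Calling sync_certificates with no arguments uses the defaults; passing two directories makes it easy to try the function on scratch copies first.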

Details of the Several Steps

  • Every 6 hours at h:0 a check is made on gate01.aglt2.org for a new rpm of certificates
    • /etc/cron.d/osg-ca-certs-updater from the OSG distribution
  • At h:50 on gate01.aglt2.org fetch-crl is run via cron
    • /etc/cron.d/fetch-crl, set up by the OSG rpm, probably modified by AGLT2 to run at this time
  • At h:14 on gate01.aglt2.org the full directory is rsync'd into /afs/.atlas.umich.edu/OSG_certificates, the rw volume of this set
    • /etc/cron.d/rsync-certificates-into-afs that runs /root/tools/rsync-certificates-into-afs.sh
  • At h:41 on linat06, the home of the OSG_certificates rw volume, the rw volume is released to the ro copies
    • /etc/cron.d/release_Certificates
    • Keeps an accumulating log in /var/log/afs_release_Certificates.log
  • In the interval h:25 to h:40, all but the various dCache machines rsync their certificates directory out of afs
    • /etc/cron.d/rsync-certificates.cron, which runs /root/tools/rsync-certificates.sh
  • At h:20 the various dCache machines rsync their certificates directory out of afs
    • Same cron task, same tools file, just different time

This cycle time could probably be shortened by moving the release of the rw volume on linat06 to h:15 and the gate01 copy into afs to h:10. With that change, a full certificate distribution would complete in about 50 minutes.
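
The arithmetic behind that estimate can be sketched as follows. The helper cycle_minutes is hypothetical and rests on two assumptions: every step runs at the same minute of each hour, and each step only picks up what the previous step has already finished. The chain followed is fetch-crl, the copy into afs, the release to the ro copies, and the last client rsync at h:40.

```shell
#!/bin/bash
# Sketch: total minutes for an update to pass through a chain of cron steps,
# each given as a minute of the hour (ASSUMES hourly runs, negligible run time).
cycle_minutes() {
    local prev="$1" total=0 step delta
    shift
    for step in "$@"; do
        # Wait until the next time this minute mark comes around.
        delta=$(( (step - prev + 60) % 60 ))
        # A step at the same minute as its predecessor waits a full hour.
        if [ "$delta" -eq 0 ]; then delta=60; fi
        total=$(( total + delta ))
        prev=$step
    done
    echo "$total"
}

cycle_minutes 50 14 41 40   # current schedule, prints 110
cycle_minutes 50 10 15 40   # proposed schedule, prints 50
```

Under these assumptions the current schedule takes about 110 minutes from fetch-crl to the last client, mostly because the release at h:41 just misses the client pulls that end at h:40.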

Request and Update host certificates

In order to request a host certificate, one should first use the osg tools to generate the certificate request file, then either go to the UM WasUP page or MSU service portal to make the request with the certificate request file.

Generate the host request file

There is a script to generate the request file:
/atlas/data08/manage/cluster/hostcert_request/hostcert_req.sh hostname [alternative_hostname]


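#!/bin/bash
# Generate a host certificate request and private key with osg-cert-request.
# Usage: hostcert_req.sh hostname [alternative_hostname ...]

if [ $# -eq 0 ]; then
 echo "Usage: $0 hostname [alternative_hostname ...]" >&2
 exit 1
fi
hn=$1
shift

# Every remaining argument becomes one --altname option.
ext=()
for alt_hn in "$@"; do
 ext+=(--altname "$alt_hn")
done

# An argument array keeps the multi-word values intact, so eval is not needed.
osg-cert-request --hostname "$hn" --country US --state Michigan \
 --locality 'Ann Arbor' --organization 'University of Michigan' "${ext[@]}"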

Note: The script needs to be run on gate01.aglt2.org, where the OSG tools are installed. See the OSG documentation for details of the tools.

The output is two files. For example, for the host aglbatch.aglt2.org, the output files are aglbatch.aglt2.org-key.pem and aglbatch.aglt2.org-key.req

Copy the content of the aglbatch.aglt2.org-key.req file and use it on the UM WasUP page to request the new host certificate.
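
Before pasting the request into the form, it can be worth checking that it parses and names the intended host. The sketch below uses only standard openssl req options; the function name check_request is made up for this example.

```shell
#!/bin/bash
# Sketch: sanity-check a certificate request file before submitting it.
check_request() {
    local req="$1"
    # -verify checks the request's self-signature; -subject prints its DN.
    openssl req -in "$req" -noout -verify -subject || return 1
    # Alternative hostnames, if any, show up as DNS: entries in the text dump.
    openssl req -in "$req" -noout -text | grep 'DNS:' || echo "(no alternative hostnames)"
}
```

For example, check_request aglbatch.aglt2.org-key.req should print a subject whose CN is aglbatch.aglt2.org.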

Request the host cert (InCommon certificate issued by IGTF server)

On the UM WasUP page, fill in the forms and submit the request. Every year they require validation of the domain name (aglt2.org). For that, they send us a CNAME entry, and we publish it to Merit. An alternative option is to publish a text file on our www.aglt2.org server: http://aglt2.org/.well-known/pki-validation/328504995F0D2A2D58FE5D271D3E5594.txt

When requesting the host cert, indicate in the comment area that it needs to be issued by the InCommon IGTF Server CA and that the maximum period of validity is 395 days. Otherwise the default issuer is the InCommon RSA Server CA.

Once the host cert is approved, the requester gets an email. Then go back to the UM WasUP page and, in the right panel of the page, click on the "settings" of the host name for which you made the request. It will display your request and the content of the host certificate file.

For RSA certs via MSU, use the Self Service Portal. Log in with your msunet id (without .msu.edu) and password. Select type Apache/ModSSL and SSL Certificate. After submitting, you will get 2 emails about opening and closing the ticket, 1 about the submission to InCommon, 1 from InCommon awaiting approval, 1 when it is approved, and 1 with links to the certificate files. We use the link for the "Certificate only, PEM encoded" file.

For IGTF certs via MSU, contact Ryan Lewis <lewisry2@msu.edu>. Offer to email a text file with each machine name and its matching certificate request, one set after the other, all in one long text file. He has been happy ingesting them that way. You will then get 3 similar emails from InCommon for each cert.
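
One way to assemble such a file is sketched below. The function name bundle_requests and the exact layout (machine name on one line, then the request, then a blank line) are assumptions for illustration; the machine name is taken from the file name.

```shell
#!/bin/bash
# Sketch: concatenate certificate requests into one text stream, each one
# preceded by its machine name (derived from the request file name).
bundle_requests() {
    local f name
    for f in "$@"; do
        name=$(basename "$f" .req)
        name=${name%-key}        # tolerate a trailing -key in the file name
        printf '%s\n' "$name"
        cat "$f"
        printf '\n'
    done
}
```

Running bundle_requests *.req > all_requests.txt in the directory holding the request files would then produce the single file to send.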

Download the host cert and place it on the central location

Copy the content of the host certificate file and paste it into a new file named
 /atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org.pem

Verify the host cert:
openssl x509 -noout -enddate -subject -in aglbatch.aglt2.org.pem

Copy both files to the central place where we keep the host certificates (aglbatch.aglt2.org:/root/hostcert):
 /atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org.pem
 /atlas/data08/manage/cluster/hostcert_request/aglbatch.aglt2.org-key.pem
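
Before copying, it may help to confirm that the new certificate really belongs to the key generated with the request. The sketch below compares the public key embedded in each file; the function name cert_matches_key is hypothetical, and it assumes the private key is not password protected.

```shell
#!/bin/bash
# Sketch: succeed only if the certificate and private key share a public key.
cert_matches_key() {
    local cert="$1" key="$2" cert_pub key_pub
    # Public key as recorded in the certificate.
    cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 1
    # Public key derived from the private key.
    key_pub=$(openssl pkey -in "$key" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}
```

For example: cert_matches_key aglbatch.aglt2.org.pem aglbatch.aglt2.org-key.pem && echo "pair OK"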

Update host certificate via cfengine

All the host certificates and user certificates for services are stored centrally in aglbatch:/root/hostcert.

Any update to these certificates should be placed there first and then distributed to the cfengine servers.

1) On aglbatch, create a tarball of the /root/hostcert directory
cd /root; tar czvf hostcert.tar.gz hostcert

2) Copy aglbatch:/root/hostcert.tar.gz to the cfengine servers (umcfe and msucfe)

3) Repeat the following steps on umcfe and msucfe
cd /var/cfengine/policy/T2/stash
tar xzvf /root/hostcert.tar.gz -C .

4) Update the host cert on the host
cf-agent -Kf failsafe.cf;cf-agent -K -b hostcert

5) Verify that the host cert is updated
[root@aglbatch grid-security]# openssl x509 -in hostcert.pem -noout -subject -enddate
subject= /C=US/postalCode=48109/ST=MI/L=Ann Arbor/street=530 S. State St./O=University of Michigan/OU=Information Technology Services/CN=aglbatch.aglt2.org
notAfter=Apr 30 23:59:59 2021 GMT
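
Steps 1 to 3 above could be scripted roughly as follows. This is a sketch, not an existing tool: the helper names run and push_hostcerts are hypothetical, and it assumes root on aglbatch can reach umcfe and msucfe by those short names with scp and ssh. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/bash
# Sketch of steps 1-3. Prints commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

push_hostcerts() {
    local host
    # Step 1: tarball of /root/hostcert (run this on aglbatch).
    run tar czf /root/hostcert.tar.gz -C /root hostcert || return 1
    for host in umcfe msucfe; do
        # Step 2: copy the tarball to each cfengine server.
        run scp /root/hostcert.tar.gz "$host:/root/" || return 1
        # Step 3: unpack it into the cfengine stash on that server.
        run ssh "$host" tar xzf /root/hostcert.tar.gz -C /var/cfengine/policy/T2/stash || return 1
    done
}

push_hostcerts
```

Steps 4 and 5 are left manual here because they run on each target host.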

User certificate for services

A valid user certificate with the ATLAS VO production role is used for various services. Currently we use Wenjing's user certificate, and it is placed in aglbatch:/root/hostcert.

The user cert and key pair are renamed as:

xrootd_usercert.pem

xrootd_userkey.pem

Also,

xrootd_scrt stores the password of the private key.

They are distributed to the servers as described above.

The servers that use this user certificate/key pair include:

head02:/root/.globus

gate02:/var/rsv/.globus

dcdum01:/var/lib/dcache/.globus

dcdmsu:/var/lib/dcache/.globus

Note: Please update the user certificate in the above places whenever the user certificate is renewed.
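
To catch an upcoming renewal early, something like the following sketch could be run from cron or by hand. The function name expiring_certs is hypothetical; it relies on the standard -checkend option of openssl x509.

```shell
#!/bin/bash
# Sketch: print the certificates that expire within the given number of days.
# A file that openssl cannot read is reported as well.
expiring_certs() {
    local days="$1" cert
    shift
    for cert in "$@"; do
        # -checkend N exits non-zero if the cert expires within N seconds.
        if ! openssl x509 -in "$cert" -noout -checkend $(( days * 86400 )) >/dev/null 2>&1; then
            echo "$cert"
        fi
    done
}
```

For example, expiring_certs 30 /root/hostcert/xrootd_usercert.pem prints the path only if that certificate has fewer than 30 days left.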

-- BobBall - 04 Oct 2018
Topic revision: r6 - 18 May 2021, PhilippeLaurens