Controlling the ATLAS Queues and the Pilot Rate

Much of the basic command structure is documented in this document. There is also a newer document about setting the queue states here. In addition, we have implemented a local cron script, based upon a script written by Charles Waldman, that dynamically sets the total number of queued pilots waiting for a task.

Setting the queue states

We have 2 queues here: AGLT2-condor (production) and ANALY_AGLT2-condor (analysis). A valid proxy must be used for these commands to work. For example, after running my own "grid-proxy-init", I can issue the following command, where Q_state is one of setonline, setoffline, or settest:
curl --cert /tmp/x509up_u`id -u` --cacert /tmp/x509up_u`id -u` --capath /etc/grid-security/certificates ''
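
As a minimal sketch of how the Q_state substitution could be guarded against typos, the helper below rejects anything other than the three states named above and prints the resulting curl command without running it. The function names (valid_q_state, proxy_path, queue_cmd) are hypothetical, and the request URL is not reproduced on this page, so it is passed in by the caller as a template containing the literal token Q_state.

```sh
#!/bin/sh
# Hypothetical helpers; none of these names come from the Panda tools.

# Accept only the three states named in the text.
valid_q_state() {
    case "$1" in
        setonline|setoffline|settest) return 0 ;;
        *) return 1 ;;
    esac
}

# Proxy file written by grid-proxy-init for the current user.
proxy_path() {
    echo "/tmp/x509up_u$(id -u)"
}

# Print (do not run) the curl command.
#   $1 = Q_state
#   $2 = URL template containing the literal token Q_state
#        (assumption: the caller takes this from the Panda documentation)
queue_cmd() {
    q_state=$1
    template=$2
    if ! valid_q_state "$q_state"; then
        echo "unknown Q_state: $q_state" >&2
        return 1
    fi
    proxy=$(proxy_path)
    # q_state is one of three fixed words, so it is safe as a sed replacement.
    url=$(printf '%s' "$template" | sed "s/Q_state/$q_state/g")
    printf "curl --cert %s --cacert %s --capath /etc/grid-security/certificates '%s'\n" \
        "$proxy" "$proxy" "$url"
}
```

Printing rather than executing leaves room to review the command before it changes the state of a production queue.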

Note that the newer document above also suggests adding the following to the curl command:
where THE.ELOG.NUMBER is one of the relevant eLog or GGUS entries.

We will only set our own queue state if we are doing some local testing. Panda shifters are responsible for bringing us back online after a problem is noted, and must first send us test pilots that are confirmed to work.

Changes made in 2011 to Analysis queue testing now favor setting the queue to brokeroff with the comment HC.Test.Me. This allows HammerCloud to test and validate the site before jobs may resume. The syntax of such a command follows; it was changed on 9/17/2013 for SL6.
curl --cert /tmp/x509up_u`id -u` --cacert /tmp/x509up_u`id -u` --capath /etc/grid-security/certificates ''
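
Because the curl commands on this page read the proxy from /tmp/x509up_u`id -u`, it can be useful to confirm that the file exists and has not expired before issuing them. The following is only a sketch: proxy_usable is a hypothetical name, the 600 second default margin is an arbitrary assumption, and the expiry test relies on the standard "openssl x509 -checkend" option, which inspects the first certificate in the file.

```sh
#!/bin/sh
# Hypothetical pre-flight check; not part of the Panda or Globus tools.
#   $1 = proxy file (default: the grid-proxy-init location for this user)
#   $2 = minimum remaining lifetime in seconds (default 600, an assumption)
proxy_usable() {
    proxy=${1:-/tmp/x509up_u$(id -u)}
    min_secs=${2:-600}

    if [ ! -r "$proxy" ]; then
        echo "no readable proxy at $proxy; run grid-proxy-init" >&2
        return 1
    fi

    # -checkend N exits non-zero if the certificate expires within N seconds.
    if ! openssl x509 -in "$proxy" -noout -checkend "$min_secs" >/dev/null 2>&1; then
        echo "proxy at $proxy could not be parsed or expires within $min_secs seconds" >&2
        return 1
    fi
    return 0
}
```

A typical use would be "proxy_usable || exit 1" at the top of any local wrapper that goes on to call curl.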

-- BobBall - 12 Mar 2009
Topic revision: r10 - 17 Sep 2013 - 18:54:53 - BobBall
