I recently goofed and told someone that they could use the Qpid Management Framework (QMF) to submit jobs to Condor. What I meant to say is they can use AMQP. This is maybe understandable because QMF is a management framework built on top of AMQP, and MRG Grid already has many parts of Condor modeled in QMF, but submission via QMF could be very different from submission via AMQP.
QMF is a framework that allows for the modeling of objects that can publish information about themselves as well as respond to actions. All information and control is sent via AMQP messages.
Along with a quick correction to my comment, s/QMF/AMQP/, I went ahead and mocked up a QMF submission interface to make my comment almost true.
Existing Submission Interfaces
Condor already has a number of submission interfaces: the command-line tools, e.g. condor_submit; a GAHP interface, the condor_c-gahp; a SOAP interface, once termed Birdbath; the previously mentioned AMQP interface; and a few others. So, what’s one more? Or, why one more!?
The command-line interface is the default means for submitting jobs to Condor’s Scheduler, the condor_schedd. The condor_submit tool takes a job description file, performs some processing on it, and generates one or many ClassAds representing jobs, a.k.a. job ads. The condor_schedd only cares about job ads, and is never exposed to the job description file. condor_submit’s processing is sometimes shallow, e.g. executable = /bin/true becomes Cmd = "/bin/true", and sometimes not, e.g. getenv = TRUE becomes Environment = "<contents of env for condor_submit>". Sometimes the processing is even iterative in nature, e.g. queue 1000000 generates one million copies of the job constructed since the last queue command. The job description file is really a script in the condor_submit language that generates jobs. This makes the condor_submit tool thick, and the optimizations that it performs require it to be tightly integrated with condor_schedd.
The SOAP interface (starts slide 15) is very different from condor_submit. It is implemented within the condor_schedd, and exposes a transactional interface that accepts job ads as input. This means no high level job description file processing. It also means the thick condor_submit tool could be implemented on top of the SOAP interface. A job ad that might be submitted via SOAP would look like:
[Owner="matt"; Cmd="/bin/echo"; Arguments="Hello there!"; JobUniverse=5; ...; Requirements=TRUE]
This is a job ad that may have been created from a job description file like:
executable = /bin/echo
arguments = Hello there!
requirements = TRUE
queue 1
Pass that to condor_submit -dump job.ad to have a look.
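To make the "script that generates job ads" idea concrete, here is a toy sketch in Python. It is not condor_submit's actual logic: the function name generate_job_ads, the handful of commands it understands, and the defaults for Owner and JobUniverse are assumptions taken only from the examples above.

```python
# Toy translation of a job description file into job ads.
# NOT condor_submit's real implementation; it only handles the
# few commands that appear in the examples above.

UNIVERSES = {"vanilla": 5}  # universe = vanilla becomes JobUniverse = 5
SHALLOW = {"executable": "Cmd", "arguments": "Arguments"}  # simple renames


def generate_job_ads(description, owner="matt"):
    """Return the list of job ads (plain dicts) the description generates."""
    ads = []
    # Assumed defaults: the example ad has Owner and JobUniverse set
    # even though the description file never mentions them.
    current = {"Owner": owner, "JobUniverse": UNIVERSES["vanilla"]}
    for raw in description.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0].lower() == "queue":
            # Iterative processing: "queue N" emits N copies of the
            # job constructed so far.
            count = int(words[1]) if len(words) > 1 else 1
            ads.extend(dict(current) for _ in range(count))
            continue
        key, _, value = line.partition("=")
        key, value = key.strip().lower(), value.strip()
        if key in SHALLOW:
            current[SHALLOW[key]] = value
        elif key == "universe":
            current["JobUniverse"] = UNIVERSES[value.lower()]
        elif key == "requirements":
            current["Requirements"] = value
        else:
            raise ValueError("command not handled by this sketch: " + key)
    return ads


example = """\
executable = /bin/echo
arguments = Hello there!
requirements = TRUE
queue 1
"""
job_ads = generate_job_ads(example)
print(job_ads)
```

Everything the real tool does beyond renaming, such as getenv capturing the submitter's environment, is exactly the part that is hard to reproduce on the remote side of a submission.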
A QMF Interface
So, what about a QMF submission interface? A nice aspect of the condor_submit interface is the script nature of the input. Unfortunately, there are some things that cannot be cleanly captured on the remote side of a submission, e.g. the transfer_input_files, platform-specific requirements bits, or working directory information. To some extent these reasons, and the desire to keep script processing out of the condor_schedd, are why the SOAP interface only deals in job ads. They are also a reason why a QMF interface should only handle job ads.
A benefit of the SOAP interface is, quite obviously, that it makes for a more natural programmatic interface. Unfortunately, it also exposes concepts and optimizations that are used by condor_submit and may not be needed by other submission programs, e.g. transactions and clusters.
One thing that is an afterthought when using both interfaces is the notion of a submission, something that binds together jobs based on their overall purpose. Often a cluster is thought of as the means to group jobs. However, a single job description file can generate multiple clusters. Likewise, the SOAP interface allows all jobs to be grouped into a single cluster, but if one of the jobs is a DAGMan workflow then the point of the single cluster is violated. The use of clusters to associate jobs is broken.
Two things the QMF interface can do are: 1) simplify the operations required to perform a submission; and, 2) motivate its users to materialize the notion of a submission.
A QMF submission API
submit, ClassAd -> Id : Submit a new job described by ClassAd
create, void -> Id : Create a transaction to submit data and a job ad
send, Id x Data -> void : Spool data for a forthcoming job ad
This interface would be a great simplification over the SOAP API. It eliminates the necessity of a transaction and chunked data transfers, and it does not expose the notion of a cluster. Without a cluster, job association must be done in some other way. The natural way is via an attribute on job ads, including DAGMan jobs. All jobs in a submission could have an attribute Submission = "Monday Parameter Sweep Run, features: A, B, D", i.e. a +Submission = "Monday..." line in a job description file.
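As a rough illustration of the submit operation and of grouping by attribute rather than by cluster, here is a small in-memory sketch in Python. It involves no QMF or AMQP at all: the class SubmissionInterface, its by_submission helper, and the choice to reject ads that lack a Submission attribute are assumptions made for the example, not part of the mocked-up interface.

```python
import itertools
from collections import defaultdict


class SubmissionInterface:
    """In-memory stand-in for the schedd side of the API (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}  # Id -> job ad

    def submit(self, ad):
        """submit, ClassAd -> Id : store a job ad and return its Id."""
        # Assumed policy: insist on a Submission attribute so that users
        # must say which submission a job belongs to.
        if "Submission" not in ad:
            raise ValueError("job ad needs a Submission attribute")
        job_id = next(self._ids)
        self.jobs[job_id] = dict(ad)
        return job_id

    def by_submission(self):
        """Group job Ids by their Submission attribute; no clusters involved."""
        groups = defaultdict(list)
        for job_id, ad in self.jobs.items():
            groups[ad["Submission"]].append(job_id)
        return dict(groups)


schedd = SubmissionInterface()
sweep = "Monday Parameter Sweep Run, features: A, B, D"
first = schedd.submit({"Cmd": "/bin/echo", "Arguments": "A", "Submission": sweep})
second = schedd.submit({"Cmd": "/bin/echo", "Arguments": "B", "Submission": sweep})
other = schedd.submit({"Cmd": "/bin/true", "Submission": "Tuesday smoke test"})
groups = schedd.by_submission()
print(groups)
```

Because the association lives in the job ad itself, a DAGMan job and the jobs it spawns could carry the same Submission value regardless of which clusters they land in.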
This interface does not have some of the high-level niceties of a condor_submit submission. However, those niceties are not necessarily the ability to do many things with one line, e.g. queue 100, but to have a well-defined description of a job. Understanding that executable becomes the Cmd attribute is one thing; knowing that universe = vanilla becomes JobUniverse = 5 is significantly different. Shortcomings in the high-level interface can be addressed with an improved specification for a job ad.