Service as a Job: The Qpid C++ Broker

Yes. A service as a job! Why? Three quick reasons: 1) dynamic, even on-demand or opportunistic, deployment of the service; 2) policy-driven control of the service's execution; 3) an abstraction for interacting with the service's life-cycle.

Condor provides strong management, deployment and policy features around what it calls jobs. Jobs come in all shapes and sizes: from those that run for less than a minute (probabilistic simulations) to those that run for months (VMs holding developer desktops), and from those that use large amounts of disk or network I/O to those that use large amounts of CPU and memory.

Somewhere in that spectrum you'll find common services, be they full LAMP stacks in VMs or just the Apache HTTP server. Here's an example of the Qpid C++ broker (qpidd), a messaging service, run as a job.

The job description is what you submit with condor_submit:

cmd = qpidd.sh
error = qpidd.log

kill_sig = SIGTERM

# Want chirp functionality
+WantIOProxy = TRUE

queue

It specifies that the job, or in this case the service, to run is qpidd.sh, and that SIGTERM should be used to shut it down. qpidd.sh wraps the actual execution of qpidd for one important reason: advertising qpidd's endpoint. qpidd will start up on port 5672 by default. That's all well and good, unless you want to run more than one qpidd on a single machine. qpidd.sh starts qpidd on an ephemeral port, which qpidd kindly prints to stdout, and then advertises the chosen port number back into the Schedd's queue via condor_chirp, which is available when the job specifies WantIOProxy = TRUE.

#!/bin/bash
# (bash, not plain sh: the script uses the "function"
# keyword and %1 job specifications)

# qpidd lives in /usr/sbin,
# condor_chirp in /usr/libexec/condor
export PATH=$PATH:/usr/sbin:/usr/libexec/condor

# When we get SIGTERM, which Condor will send when
# we are kicked, kill off qpidd.
function term {
    rm -f port.out
    kill %1
}

# Spawn qpidd, and make sure we can shut it down cleanly.
rm -f port.out
trap term SIGTERM
# qpidd will print the port to stdout, capture it,
# no auth required, don't read /etc/qpidd.conf,
# log to stderr
qpidd --auth no \
      --config /dev/null \
      --log-to-stderr yes \
      --no-data-dir \
      --port 0 \
      1> port.out &

# We might have to wait for the port on stdout
while [ ! -s port.out ]; do sleep 1; done
PORT=$(cat port.out)
rm -f port.out

# There are all sorts of useful things that could
# happen here, such as setting up queues with
# qpid-config
#...

# Record the port number where everyone can see it
condor_chirp set_job_attr QpiddEndpoint \"$HOSTNAME:$PORT\"

# Nothing more to do, just wait on qpidd
wait %1
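A client still needs to find the broker, and the chirped attribute is just a plain "host:port" string, so a little shell is enough to pull it apart. A minimal sketch; `endpoint_host` and `endpoint_port` are hypothetical helper names of mine, and the condor_q/qpid-config lines in the comment assume the setup above:

```shell
# Hypothetical helpers: split a chirped "host:port" endpoint,
# e.g. the QpiddEndpoint value "woods:58335", using shell
# parameter expansion.
endpoint_host() { echo "${1%:*}"; }
endpoint_port() { echo "${1##*:}"; }

# A client could fetch the attribute and point a tool at it, e.g.:
#   endpoint=$(condor_q -format "%s" QpiddEndpoint)
#   qpid-config -a "$endpoint" queues

endpoint_host "woods:58335"   # prints woods
endpoint_port "woods:58335"   # prints 58335
```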

In action –

$ condor_submit qpidd.sub
Submitting job(s).
1 job(s) submitted to cluster 2.

$ condor_q
-- Submitter: woods :  : woods
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   2.0   matt            5/18 14:21   0+00:00:03 R  0   0.0  qpidd.sh          
1 jobs; 0 idle, 1 running, 0 held

$ condor_q -format "qpidd running at %s\n" QpiddEndpoint
qpidd running at woods:58335

$ condor_hold 2
Cluster 2 held.
$ condor_release 2
Cluster 2 released.

$ condor_q
-- Submitter: woods :  : woods
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   2.0   matt            5/18 14:21   0+00:00:33 I  0   73.2 qpidd.sh          
1 jobs; 1 idle, 0 running, 0 held

$ condor_reschedule 
Sent "Reschedule" command to local schedd

$ condor_q         
-- Submitter: woods :  : woods
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   2.0   matt            5/18 14:21   0+00:00:38 R  0   73.2 qpidd.sh          
1 jobs; 0 idle, 1 running, 0 held

$ condor_q -format "qpidd running at %s\n" QpiddEndpoint
qpidd running at woods:54028

$ condor_rm -a
All jobs marked for removal.

$ condor_submit qpidd.sub
Submitting job(s).
1 job(s) submitted to cluster 9.

$ condor_submit qpidd.sub                               
Submitting job(s).
1 job(s) submitted to cluster 10.

$ condor_submit qpidd.sub                               
Submitting job(s).
1 job(s) submitted to cluster 11.

$ lsof -i | grep qpidd
qpidd     14231 matt    9u  IPv4 92241655       TCP *:50060 (LISTEN)
qpidd     14256 matt    9u  IPv4 92242129       TCP *:50810 (LISTEN)
qpidd     14278 matt    9u  IPv4 92242927       TCP *:34601 (LISTEN)

$ condor_q -format "qpidd running at %s\n" QpiddEndpoint
qpidd running at woods:34601
qpidd running at woods:50810
qpidd running at woods:50060
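Since each broker picks its own ephemeral port, there's also no need to submit three times: the same submit description can queue several at once. A sketch, reusing qpidd.sh and the standard $(Cluster)/$(Process) macros to keep the log files apart:

```
cmd = qpidd.sh
error = qpidd.$(Cluster).$(Process).log

kill_sig = SIGTERM

# Want chirp functionality
+WantIOProxy = TRUE

queue 3
```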
