Posts Tagged ‘Tutorial’

Hadoop on OpenStack with a CLI: Creating a cluster

January 29, 2014

OpenStack Savanna can already help you create a Hadoop cluster or run a Hadoop workload all through the Horizon dashboard. What it could not do until now is let you do that through a command-line interface.

Part of the Savanna work for Icehouse is to create a savanna CLI. It extends Savanna's functionality and gives us an opportunity to review the existing v1.0 and v1.1 REST APIs in preparation for a stable v2 API.

A first pass of the CLI is now done and functional for at least the v1.0 REST API. And here’s how you can use it.

Zeroth, get your hands on the Savanna client. Two places to get it are RDO and the OpenStack tarballs.

First, know that the Savanna architecture includes a plugin mechanism to allow for Hadoop vendors to plug in their own management tools. This is a key aspect of Savanna’s vendor appeal. So you need to pick a plugin to use.

$ savanna plugin-list
+---------+----------+---------------------------+
| name    | versions | title                     |
+---------+----------+---------------------------+
| vanilla | 1.2.1    | Vanilla Apache Hadoop     |
| hdp     | 1.3.2    | Hortonworks Data Platform |
+---------+----------+---------------------------+

I chose to try the Vanilla plugin, version 1.2.1. It’s the reference implementation,

export PLUGIN_NAME=vanilla
export PLUGIN_VERSION=1.2.1

Second, you need to make some decisions about the Hadoop cluster you want to start. I decided to have a master node using the m1.medium flavor and three worker nodes also using m1.medium.

export MASTER_FLAVOR=m1.medium
export WORKER_FLAVOR=m1.medium
export WORKER_COUNT=3

Third, I decided to use Neutron networking in my OpenStack deployment; it's what everyone is doing these days. As a result, I need a network to start the cluster on.

$ neutron net-list
+---------------+------+-----------------------------+
| id            | name | subnets                     |
+---------------+------+-----------------------------+
| 25783...f078b | net0 | 18d12...5f903 10.10.10.0/24 |
+---------------+------+-----------------------------+
export MANAGEMENT_NETWORK=net0

The cluster will be significantly more useful if I have a way to access it, so I need to pick a keypair for access.

$ nova keypair-list
+-----------+-------------------------------------------------+
| Name      | Fingerprint                                     |
+-----------+-------------------------------------------------+
| mykeypair | ac:ad:1d:f7:97:24:bd:6e:d7:98:50:a2:3d:7d:6c:45 |
+-----------+-------------------------------------------------+
export KEYPAIR=mykeypair

And I need an image to use for each of the nodes. I chose a Fedora image that was created using the Savanna DIB elements. You can pick one from the Savanna Quickstart guide,

$ glance image-list
+---------------+----------------+-------------+------------------+------------+--------+
| ID            | Name           | Disk Format | Container Format | Size       | Status |
+---------------+----------------+-------------+------------------+------------+--------+
| 1939b...f05c2 | fedora_savanna | qcow2       | bare             | 1093453824 | active |
+---------------+----------------+-------------+------------------+------------+--------+
export IMAGE_ID=1939bad7-11fe-4cab-b1b9-02b01d9f05c2

then register it with Savanna,

savanna image-register --id $IMAGE_ID --username fedora
savanna image-add-tag --id $IMAGE_ID --tag $PLUGIN_NAME
savanna image-add-tag --id $IMAGE_ID --tag $PLUGIN_VERSION
$ savanna image-list
+----------------+---------------+----------+----------------+-------------+
| name           | id            | username | tags           | description |
+----------------+---------------+----------+----------------+-------------+
| fedora_savanna | 1939b...f05c2 | fedora   | vanilla, 1.2.1 | None        |
+----------------+---------------+----------+----------------+-------------+

FYI, --username fedora tells Savanna which account on the instance it can access that has sudo privileges. Adding the tags tells Savanna which plugin and version the image works with.

That’s all the input you need to provide. From here on, creating the cluster is just a little more cutting and pasting of a few commands.

First, a few commands to find IDs for the named values chosen above,

export MASTER_FLAVOR_ID=$(nova flavor-show $MASTER_FLAVOR | grep ' id ' | awk '{print $4}')
export WORKER_FLAVOR_ID=$(nova flavor-show $WORKER_FLAVOR | grep ' id ' | awk '{print $4}')
export MANAGEMENT_NETWORK_ID=$(neutron net-show net0 | grep ' id ' | awk '{print $4}')

Next, create some node group templates for the master and worker nodes. The CLI currently takes a JSON representation of the template. It also provides a JSON representation when showing template details to facilitate export & import.

export MASTER_TEMPLATE_ID=$(echo "{\"plugin_name\": \"$PLUGIN_NAME\", \"node_processes\": [\"namenode\", \"secondarynamenode\", \"oozie\", \"jobtracker\"], \"flavor_id\": \"$MASTER_FLAVOR_ID\", \"hadoop_version\": \"$PLUGIN_VERSION\", \"name\": \"master\"}" | savanna node-group-template-create | grep ' id ' | awk '{print $4}')

export WORKER_TEMPLATE_ID=$(echo "{\"plugin_name\": \"$PLUGIN_NAME\", \"node_processes\": [\"datanode\", \"tasktracker\"], \"flavor_id\": \"$WORKER_FLAVOR_ID\", \"hadoop_version\": \"$PLUGIN_VERSION\", \"name\": \"worker\"}" | savanna node-group-template-create | grep ' id ' | awk '{print $4}')

Now put those two node group templates together into a cluster template,

export CLUSTER_TEMPLATE_ID=$(echo "{\"plugin_name\": \"$PLUGIN_NAME\", \"node_groups\": [{\"count\": 1, \"name\": \"master\", \"node_group_template_id\": \"$MASTER_TEMPLATE_ID\"}, {\"count\": $WORKER_COUNT, \"name\": \"worker\", \"node_group_template_id\": \"$WORKER_TEMPLATE_ID\"}], \"hadoop_version\": \"$PLUGIN_VERSION\", \"name\": \"cluster\"}" | savanna cluster-template-create | grep ' id ' | awk '{print $4}')

Creating the node group and cluster templates only has to happen once; the final step, starting up the cluster, can be done multiple times.

echo "{\"cluster_template_id\": \"$CLUSTER_TEMPLATE_ID\", \"default_image_id\": \"$IMAGE_ID\", \"hadoop_version\": \"$PLUGIN_VERSION\", \"name\": \"cluster-instance-$(date +%s)\", \"plugin_name\": \"$PLUGIN_NAME\", \"user_keypair_id\": \"$KEYPAIR\", \"neutron_management_network\": \"$MANAGEMENT_NETWORK_ID\"}" | savanna cluster-create 

That’s it. You can nova list and ssh into the master instance, assuming you are on the Neutron node and use ip netns exec, or you can log in through the master node’s VNC console.
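By way of illustration, here is roughly what that access path looks like from the Neutron node. The router namespace ID, the instance’s fixed IP, and the private key file name are placeholders; use whatever nova list, ip netns, and your keypair actually give you,

$ nova list | grep cluster-instance
$ sudo ip netns
$ sudo ip netns exec qrouter-<router-id> ssh -i mykeypair.pem fedora@10.10.10.5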

Hello Fedora with docker in 3 steps

December 10, 2013

It really is this simple,

1. sudo yum install -y docker-io

2. sudo systemctl start docker

3. sudo docker run mattdm/fedora cat /etc/system-release

Bonus, for when you want to go deeper -

If you don’t want to use sudo all the time, and you shouldn’t, add yourself to the docker group,

$ sudo usermod -a -G docker $USER

If you don’t want to log out and back in, make your new group effective immediately,

$ su - $USER
$ groups | grep -q docker && echo Good job || echo Try again

If you want to run a known image, search for it on https://index.docker.io or on the command line,

$ docker search fedora

Try out a shell with,

$ docker run -i -t mattdm/fedora /bin/bash

Concurrency Limits: Group defaults

January 21, 2013

Concurrency limits protect resources by providing a way to cap how many jobs requiring a specific resource can run at one time.

For instance, limit licenses and filer access at four regional data centers.

CONCURRENCY_LIMIT_DEFAULT = 15
license.north_LIMIT = 30
license.south_LIMIT = 30
license.east_LIMIT = 30
license.west_LIMIT = 45
filer.north_LIMIT = 75
filer.south_LIMIT = 150
filer.east_LIMIT = 75
filer.west_LIMIT = 75

Notice the repetition.

In addition to the repetition, every license.* and filer.* must be known and recorded in configuration. The set may be small in this example, but imagine imposing a limit on each user or each submission. The set of users is broad, dynamic and may differ by region. The set of submissions is a more extreme version of the users case, yet it is still realistic.

To simplify configuration management for groups of limits, a new feature providing group defaults for limits was added in the Condor 7.8 series.

With this feature, only the exceptions to a rule need to be called out explicitly in configuration. For instance, license.west and filer.south are the exceptions in the configuration above. The simplified configuration available in 7.8,

CONCURRENCY_LIMIT_DEFAULT = 15
CONCURRENCY_LIMIT_DEFAULT_license = 30
CONCURRENCY_LIMIT_DEFAULT_filer = 75
license.west_LIMIT = 45
filer.south_LIMIT = 150

In action,

$ for limit in license.north license.south license.east license.west filer.north filer.south filer.east filer.west; do echo queue 1000 | condor_submit -a cmd=/bin/sleep -a args=1d -a concurrency_limits=$limit; done

$ condor_q -format '%s\n' ConcurrencyLimits -const 'JobStatus == 2' | sort | uniq -c | sort -n
     30 license.east
     30 license.north
     30 license.south
     45 license.west
     75 filer.east
     75 filer.north
     75 filer.west
    150 filer.south
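The same pattern covers the per-user case mentioned above. A sketch, assuming jobs are submitted with limits of the form user.<username>, where only the unusual user needs an explicit entry,

CONCURRENCY_LIMIT_DEFAULT_user = 10
user.admin_LIMIT = 100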

Extensible machine resources

November 19, 2012

Physical machines are home to many types of resources these days. The traditional cores, memory and disk now share space with GPUs, co-processors, or even protein sequence analysis accelerators.

To facilitate use and management of these resources, a new feature is available in HTCondor for extending machine resources. Analogous to concurrency limits, which operate on a pool / global level, machine resources operate on a machine / local level.

The feature allows a machine to advertise that it has specific types of resources available. Jobs can then specify that they require those specific types of resources. And the matchmaker will take into account the new resource types.

By example, a machine may have some GPU resources, an RS232 connected to your favorite telescope, and a number of physical spinning hard disk drives. The configuration for this would be,

MACHINE_RESOURCE_NAMES = GPU, RS232, SPINDLE
MACHINE_RESOURCE_GPU = 2
MACHINE_RESOURCE_RS232 = 1
MACHINE_RESOURCE_SPINDLE = 4

SLOT_TYPE_1 = cpus=100%,auto
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1

Aside – cpus=100%,auto instead of just auto because of GT3327. Also, the configuration for SLOT_TYPE_1 will likely go away in the future when all slots are partitionable by default.

Once a machine with this configuration is running,

$ condor_status -long | grep -i MachineResources
MachineResources = "cpus memory disk swap gpu rs232 spindle"

$ condor_status -long | grep -i -e TotalCpus -e TotalMemory -e TotalGpu -e TotalRs232 -e TotalSpindle
TotalCpus = 24
TotalMemory = 49152
TotalGpu = 2
TotalRs232 = 1
TotalSpindle = 4

$ condor_status -long | grep -i -e ^Cpus -e ^Memory -e ^Gpu -e ^Rs232 -e ^Spindle
Cpus = 24
Memory = 49152
Gpu = 2
Rs232 = 1
Spindle = 4

As you can see, the machine is reporting the different types of resources, how many of each it has and how many are currently available.

A job can take advantage of these new types of resources using a syntax already familiar for requesting resources from partitionable slots.

To consume one of the GPUs,

cmd = luxmark.sh

request_gpu = 1

queue

Or for a disk intensive workload,

cmd = hadoop_datanode.sh

request_spindle = 1

queue

With these jobs submitted and running,

$ condor_status
Name            OpSys      Arch   State     Activity LoadAv Mem ActvtyTime

slot1@eeyore    LINUX      X86_64 Unclaimed Idle      0.400 48896 0+00:00:28
slot1_1@eeyore  LINUX      X86_64 Claimed   Busy      0.000  128 0+00:00:04
slot1_2@eeyore  LINUX      X86_64 Claimed   Busy      0.000  128 0+00:00:04
                     Machines Owner Claimed Unclaimed Matched Preempting
        X86_64/LINUX        3     0       2         1       0          0
               Total        3     0       2         1       0          0

$ condor_status -l slot1@eeyore | grep -i -e ^Cpus -e ^Memory -e ^Gpu -e ^Rs232 -e ^Spindle
Cpus = 22
Memory = 48896
Gpu = 1
Rs232 = 1
Spindle = 3

That’s 22 cores, 1 gpu and 3 spindles still available.

Submit four more of the spindle consuming jobs and you’ll find the fourth does not run, because the available number of spindles is 0.

$ condor_status -l slot1@eeyore | grep -i -e ^Cpus -e ^Memory -e ^Gpu -e ^Rs232 -e ^Spindle
Cpus = 19
Memory = 48512
Gpu = 1
Rs232 = 1
Spindle = 0

Since these custom resources are available as attributes in various ClassAds, the same way Cpus, Memory and Disk are, all the policy, management and reporting capabilities you would expect are available.
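For example, the new attributes can be used in queries and expressions just like the built-ins; the constraint below is illustrative rather than prescribed,

$ condor_status -constraint 'Gpu > 0 && Spindle > 0' -format "%s " Name -format "Gpu=%d " Gpu -format "Spindle=%d\n" Spindle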

Manage inventory with Wallaby

January 16, 2012

Wallaby will manage your configuration, as well as an inventory of your machines. It can differentiate between machines that are expected to be present and those that opportunistically appear.

Build the roster with wallaby add-node -

$ wallaby add-node node0.local node1.local node2.local
Adding the following node: node0.local
Console Connection Established...
Adding the following node: node1.local
Adding the following node: node2.local
$ for i in $(seq 3 10); do wallaby add-node node$i.local; done
Adding the following node: node3.local
Console Connection Established...
Adding the following node: node4.local
Console Connection Established...
...

List expected nodes (provisioned) -

$ wallaby inventory
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node0.local Wed Jan 11 07:32:33 -0500 20
+      node1.local Thu Jan 05 12:15:00 -0500 20
+     node10.local Wed Jan 11 07:31:56 -0500 20
+      node2.local Wed Jan 11 07:31:56 -0500 20
+      node3.local Wed Jan 11 07:15:21 -0500 20
+      node4.local Wed Jan 11 07:31:42 -0500 20
+      node5.local Wed Jan 11 07:16:47 -0500 20
+      node6.local                        never
+      node7.local Wed Jan 11 07:32:33 -0500 20
+      node8.local Wed Jan 11 07:32:33 -0500 20
+      node9.local Wed Jan 11 07:30:47 -0500 20
-      robin.local Thu Dec 15 14:11:35 -0500 20
-      woods.local Tue Jan 10 20:33:47 -0500 20

List opportunistic, bonus nodes (unprovisioned) -

$ wallaby inventory -o unprovisioned
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
-      robin.local Thu Dec 15 14:11:35 -0500 20
-      woods.local Tue Jan 10 20:33:47 -0500 20

Provisioned nodes that have never checked in, maybe setup failed -

$ wallaby inventory -c 'last_checkin == 0 && provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node6.local                        never

Provisioned nodes that have not checked in for the past 4 hours, maybe the machine is down -

$ wallaby inventory -c 'last_checkin > 0 && last_checkin < 4.hours_ago && provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node1.local Thu Jan 05 12:15:00 -0500 20

Unprovisioned nodes that have not checked in for 48 hours, candidates for wallaby remove-node -

$ wallaby inventory -c 'last_checkin < 48.hours_ago && !provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
-      robin.local Thu Dec 15 14:11:35 -0500 20 
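To prune a stale, unprovisioned node such as robin.local, wallaby remove-node (mentioned above) is the tool,

$ wallaby remove-node robin.local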

Enjoy.

Getting started: Condor and EC2 – EC2 execute node

November 10, 2011

We have been over starting and managing instances from Condor, using condor_ec2_q to help, and importing existing instances. Here we will cover extending an existing pool using execute nodes run from EC2 instances. We will start with an existing pool, create an EC2 instance, configure the instance to run condor, authorize the instance to join the existing pool, and run a job.

Let us pretend that the node running your existing pool’s condor_collector and condor_schedd is called condor.condorproject.org.

These instructions require bi-directional connectivity between condor.condorproject.org and your EC2 instance. condor.condorproject.org must be connected to the internet with a publicly routable address, it cannot be behind a NAT or firewall, and ports must be open in its firewall for the Collector and Schedd so the EC2 execute nodes can connect to the condor_collector and condor_schedd. Okay, let’s start.

I am going to use ami-60bd4609, a publicly available Fedora 15 AMI. You can either start the instance via the AWS console, or submit it by following previous instructions.

Once the instance is up and running, log in and sudo yum install condor. Note, until BZ656562 is resolved, you will have to sudo mkdir /var/run/condor; sudo chown condor.condor /var/run/condor before starting condor. Start condor with sudo service condor start to get a personal condor.
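Collected in one place, the setup just described looks roughly like this on the instance,

$ sudo yum install -y condor
$ sudo mkdir /var/run/condor
$ sudo chown condor.condor /var/run/condor
$ sudo service condor start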

Configuring condor on the instance is very similar to creating a multiple node pool. You will need to set the CONDOR_HOST, ALLOW_WRITE, and DAEMON_LIST,

# cat > /etc/condor/config.d/40execute_node.config
CONDOR_HOST = condor.condorproject.org
DAEMON_LIST = MASTER, STARTD
ALLOW_WRITE = $(ALLOW_WRITE), $(CONDOR_HOST)
^D

If you do not give condor.condorproject.org WRITE permissions, the Schedd will fail to start jobs. StartLog will report,

PERMISSION DENIED to unauthenticated@unmapped from host 128.105.291.82 for command 442 (REQUEST_CLAIM), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 128.105.291.82,condor.condorproject.org, hostname size = 1, original ip address = 128.105.291.82

Now remember, we need bi-directional connectivity. So condor.condorproject.org must be able to connect to the EC2 instance’s Startd. The condor_startd will listen on an ephemeral port by default. You could restrict it to a port range or use condor_shared_port. For simplicity, just force a non-ephemeral port of 3131,

# echo "STARTD_ARGS = -p 3131" >> /etc/condor/config.d/40execute_node.config

You can now open TCP port 3131 in the instance’s iptables firewall. If you are using the Fedora 15 AMI, the firewall is off by default and needs no adjustment. Additionally, the security group on the instance needs to have TCP port 3131 authorized. Use the AWS Console or ec2-authorize GROUP -p 3131.
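A sketch of both steps, using the default security group as a stand-in for whatever group your instance actually uses,

# iptables -I INPUT -p tcp -m tcp --dport 3131 -j ACCEPT
$ ec2-authorize default -p 3131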

If you miss either of these steps, the Schedd will fail to start jobs on the instance, likely with a message similar to,

Failed to send REQUEST_CLAIM to startd ec2-174-129-47-20.compute-1.amazonaws.com <174.129.47.20:3131>#1220911452#1#... for matt: SECMAN:2003:TCP connection to startd ec2-174-129-47-20.compute-1.amazonaws.com <174.129.47.20:3131>#1220911452#1#... for matt failed.

A quick service condor restart on the instance, and a condor_status on condor.condorproject.org would hopefully show the instance joined the pool. Except the instance has not been authorized yet. In fact, the CollectorLog will probably report,

PERMISSION DENIED to unauthenticated@unmapped from host 174.129.47.20 for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD: reason: ADVERTISE_STARTD authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 174.129.47.20,ec2-174-129-47-20.compute-1.amazonaws.com
PERMISSION DENIED to unauthenticated@unmapped from host 174.129.47.20 for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER: reason: ADVERTISE_MASTER authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 174.129.47.20,ec2-174-129-47-20.compute-1.amazonaws.com

The instance needs to be authorized to advertise itself into the Collector. A good way to do that is to add,

ALLOW_ADVERTISE_MASTER = $(ALLOW_WRITE), ec2-174-129-47-20.compute-1.amazonaws.com
ALLOW_ADVERTISE_STARTD = $(ALLOW_WRITE), ec2-174-129-47-20.compute-1.amazonaws.com

to condor.condorproject.org’s configuration and reconfig with condor_reconfig. A note here, ALLOW_WRITE is added in because I am assuming you are following previous instructions. If you have ALLOW_ADVERTISE_MASTER/STARTD already configured, you should append to them instead. Also, appending for each new instance will get tedious. You could be very trusting and allow *.amazonaws.com, but it is better to use SSL or PASSWORD authentication. I will describe that some other time.

After the reconfig, the instance will eventually show up in a condor_status listing.

$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
localhost.localdom LINUX      INTEL  Unclaimed Benchmar 0.430  1666  0+00:00:04

The name is not very helpful, but also not a problem.

It is time to submit a job.

$ condor_submit
Submitting job(s)
cmd = /bin/sleep
args = 1d
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
queue
.^D
1 job(s) submitted to cluster 31.

$ condor_q
-- Submitter: condor.condorproject.org : <128.105.291.82:36900> : condor.condorproject.org
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  31.0   matt           11/11 11:11   0+00:00:00 I  0   0.0  sleep 1d
1 jobs; 1 idle, 0 running, 0 held

The job will stay idle forever, which is no good. The problem can be found in the SchedLog,

Enqueued contactStartd startd=<10.72.55.105:3131>
In checkContactQueue(), args = 0x9705798, host=<10.72.55.105:3131>
Requesting claim localhost.localdomain <10.72.55.105:3131>#1220743035#1#... for matt 31.0
attempt to connect to <10.72.55.105:3131> failed: Connection timed out (connect errno = 110).  Will keep trying for 45 total seconds (24 to go).

The root cause is that the instance has two internet addresses. It is advertising a private one, which is not routable from condor.condorproject.org,

$ condor_status -format "%s, " Name -format "%s\n" MyAddress
localhost.localdomain, <10.72.55.105:3131>

And a public one, which can be found from within the instance,

$ curl -f http://169.254.169.254/latest/meta-data/public-ipv4
174.129.47.20

Condor has a way to handle this. The TCP_FORWARDING_HOST configuration parameter can be set to the public address for the instance.

# echo "TCP_FORWARDING_HOST = $(curl -f http://169.254.169.254/latest/meta-data/public-ipv4)" >> /etc/condor/config.d/40execute_node.config

A condor_reconfig will apply the change, but a restart will clear out the old entry first. Oops. Note, you cannot set TCP_FORWARDING_HOST to the public-hostname of the instance, because the public hostname resolves within the instance to the instance’s internal, private address.

When setting TCP_FORWARDING_HOST, also set PRIVATE_NETWORK_INTERFACE to let the host talk to itself over its private address.

# echo "PRIVATE_NETWORK_INTERFACE = $(curl -f http://169.254.169.254/latest/meta-data/local-ipv4)" >> /etc/condor/config.d/40execute_node.config

Doing so will prevent the condor_startd from using its public address to send DC_CHILDALIVE messages to the condor_master, which might fail because of a firewall or security group setting,

attempt to connect to <174.129.47.20:34550> failed: Connection timed out (connect errno = 110).  Will keep trying for 390 total seconds (200 to go).
attempt to connect to <174.129.47.20:34550> failed: Connection timed out (connect errno = 110).
ChildAliveMsg: failed to send DC_CHILDALIVE to parent daemon at <174.129.47.20:34550> (try 1 of 3): CEDAR:6001:Failed to connect to <174.129.47.20:34550>

Or simply because the master does not trust the public address,

PERMISSION DENIED to unauthenticated@unmapped from host 174.129.47.20 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 174.129.47.20,ec2-174-129-47-20.compute-1.amazonaws.com, hostname size = 1, original ip address = 174.129.47.20

Now run that service condor restart and the public, routable address will be advertised,

$ condor_status -format "%s, " Name -format "%s\n" MyAddress
localhost.localdomain, <174.129.47.20:3131?noUDP>

The job will be started on the instance automatically,

$ condor_q -run
-- Submitter: condor.condorproject.org : <128.105.291.82:36900> : condor.condorproject.org
 ID      OWNER            SUBMITTED     RUN_TIME HOST(S)
  31.0   matt           11/11 11:11   0+00:00:11 localhost.localdomain

If you want to clean up the localhost.localdomain, set the instance’s hostname and restart condor,

$ sudo hostname $(curl -f http://169.254.169.254/latest/meta-data/public-hostname)
$ sudo service condor restart
(wait for the startd to advertise)
$ condor_status -format "%s, " Name -format "%s\n" MyAddress
ec2-174-129-47-20.compute-1.amazonaws.com, <174.129.47.20:3131?noUDP>

In summary,

Configuration changes on condor.condorproject.org,

ALLOW_ADVERTISE_MASTER = $(ALLOW_WRITE), ec2-174-129-47-20.compute-1.amazonaws.com
ALLOW_ADVERTISE_STARTD = $(ALLOW_WRITE), ec2-174-129-47-20.compute-1.amazonaws.com

Setup on the instance,

# cat > /etc/condor/config.d/40execute_node.config
CONDOR_HOST = condor.condorproject.org
DAEMON_LIST = MASTER, STARTD
ALLOW_WRITE = $(ALLOW_WRITE), $(CONDOR_HOST)
STARTD_ARGS = -p 3131
^D
# echo "TCP_FORWARDING_HOST = $(curl -f http://169.254.169.254/latest/meta-data/public-ipv4)" >> /etc/condor/config.d/40execute_node.config
# echo "PRIVATE_NETWORK_INTERFACE = $(curl -f http://169.254.169.254/latest/meta-data/local-ipv4)" >> /etc/condor/config.d/40execute_node.config
# hostname $(curl -f http://169.254.169.254/latest/meta-data/public-hostname)

Getting started: Condor and EC2 – Starting and managing instances

October 31, 2011

Condor has the ability to start and manage the lifecycle of instances in EC2. The integration was released in early 2008 with version 7.1.

The integration started with users being able to upload AMIs to S3 and manage instances using the EC2 and S3 SOAP APIs. At the time, mid-2007, creating a useful AMI required so much user interaction that the complexity of supporting S3 AMI upload was not justified. The implementation settled on pure instance lifecycle management, a very powerful base and a core Condor strength.

A point of innovation during the integration was how to transactionally start instances. The instance’s security group (originally) and ssh keypair (finally) were used as a tracking key. This innovation turned into an RFE and eventually resulted in idempotent instance creation, a feature all Cloud APIs should support. In fact, all distributed resource management APIs should support it, more on this sometime.

Today, in Condor 7.7 and MRG 2, Condor uses the EC2 Query API via the ec2_gahp, and that’s our starting point. We’ll build a submit file, start an instance, get key metadata about the instance, and show how to control the instance’s lifecycle just like any other job’s.

First, the submit file,

universe = grid
grid_resource = ec2 https://ec2.amazonaws.com/

ec2_access_key_id = /home/matt/Documents/AWS/Cert/AccessKeyID
ec2_secret_access_key = /home/matt/Documents/AWS/Cert/SecretAccessKey

ec2_ami_id = ami-60bd4609
ec2_instance_type = m1.small

ec2_user_data = Hello $(executable)!

executable = EC2_Instance-$(ec2_ami_id)

log = $(executable).$(cluster).log

ec2_keypair_file = $(executable).$(cluster).pem

queue

The universe must be grid. The resource string is ec2 https://ec2.amazonaws.com/, and the URL may be changed if a proxy is needed, or possibly for debugging with a redirect.

The ec2_access_key_id and ec2_secret_access_key are full paths to files containing your credentials for accessing EC2. These are needed so Condor can act on your behalf when talking to EC2. They need not and should not be world readable. Take a look at EC2 User Guide: Amazon EC2 Credentials for information on obtaining your credentials.
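For example, to keep them private, restricting the two files from the submit file above is enough,

$ chmod 600 /home/matt/Documents/AWS/Cert/AccessKeyID /home/matt/Documents/AWS/Cert/SecretAccessKey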

The ec2_ami_id and ec2_instance_type are required. They specify the AMI off which to base the instance and the type of instance to create, respectively. ami-60bd4609 is an EBS backed Fedora 15 image supported by the Fedora Cloud SIG. A list of instance types can be found in EC2 User Guide: Instance Families and Types. I picked m1.small because the AMI is 32-bit.

ec2_user_data is optional, but when provided gives the instance some extra data to act on when starting up. It is described in EC2 User Guide: Using Instance Metadata. This is an incredibly powerful feature, allowing parameterization of AMIs.

The executable field is simply a label here. It should really be called label or name and integrate with the AWS Console.

The log is our old friend the structured log of lifecycle events.

The ec2_keypair_file is the file where Condor will put the ssh keypair used for accessing the instance. This is a file instead of a keypair name because Condor generates a new keypair for each instance as part of tracking the instances. Eventually Condor should use EC2's idempotent RunInstances.

Second, let’s submit the job,

$ condor_submit f15-ec2.sub            
Submitting job(s).
1 job(s) submitted to cluster 1710.

$ condor_q
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
1710.0   matt           10/30 20:58   0+00:00:00 I  0   0.0  EC2_Instance-ami-6
1 jobs; 1 idle, 0 running, 0 held

Condor is starting up a condor_gridmanager, which is in turn starting up an ec2_gahp to communicate with EC2.

$ pstree | grep condor 
     |-condor_master-+-aviary_query_se
     |               |-condor_collecto---4*[{condor_collect}]
     |               |-condor_negotiat---4*[{condor_negotia}]
     |               |-condor_schedd-+-condor_gridmana---ec2_gahp---2*[{ec2_gahp}]
     |               |               |-condor_procd
     |               |               `-4*[{condor_schedd}]
     |               |-condor_startd-+-condor_procd
     |               |               `-4*[{condor_startd}]
     |               `-4*[{condor_master}]

Third, when the job is running the instance will also be started in EC2. Take a look at the log file, EC2_Instance-ami-60bd4609.1710.log, for some information. Also, the instance name and hostname will be available on the job ad,

$ condor_q -format "Instance name: %s\n" EC2InstanceName -format "Instance hostname: %s\n" EC2RemoteVirtualMachineName -format "Keypair: %s\n" EC2KeyPairFile
Instance name: i-7f37e31c
Instance hostname: ec2-184-72-158-77.compute-1.amazonaws.com
Keypair: /home/matt/Documents/AWS/EC2_Instance-ami-60bd4609.1710.pem

The instance name can be used with the AWS Console or ec2-describe-instances,

$ ec2-describe-instances i-7f37e31c
RESERVATION	r-f6592498	821108636519	default
INSTANCE	i-7f37e31c	ami-60bd4609	ec2-184-72-158-77.compute-1.amazonaws.com	ip-10-118-37-239.ec2.internal	running	SSH_eeyore.local_eeyore.local#1710.0#1320022728	0		m1.small	2011-10-31T00:59:01+0000	us-east-1c	aki-407d9529			monitoring-disabled	184.72.158.77	10.118.37.239			ebs					paravirtual	xen		sg-e5a18c8c	default
BLOCKDEVICE	/dev/sda1	vol-fe4aaf93	2011-10-31T00:59:24.000Z

The instance hostname along with the ec2_keypair_file will let us access the instance,

$ ssh -i /home/matt/Documents/AWS/EC2_Instance-ami-60bd4609.1710.pem ec2-user@ec2-184-72-158-77.compute-1.amazonaws.com
The authenticity of host 'ec2-184-72-158-77.compute-1.amazonaws.com (184.72.158.77)' can't be established.
RSA key fingerprint is f2:6e:da:bb:53:47:34:b6:2e:fe:63:62:a5:c8:a5:2e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-184-72-158-77.compute-1.amazonaws.com,184.72.158.77' (RSA) to the list of known hosts.

Appliance:	Fedora-15 appliance 1.1
Hostname:	localhost.localdomain
IP Address:	10.118.37.239

[ec2-user@localhost ~]$ 

Notice that the Fedora instances use a default account of ec2-user, not root.

Also, the user data is available in the instance. Any program could read and act on it.

[ec2-user@localhost ~]$ curl http://169.254.169.254/latest/user-data
Hello EC2_Instance-ami-60bd4609!

Finally, to control the instance’s lifecycle, simply issue condor_hold or condor_rm and the instance will be terminated. You can also run shutdown -H now in the instance. Here I’ll run sudo shutdown -H now.

[ec2-user@localhost ~]$ sudo shutdown -H now
Broadcast message from ec2-user@localhost.localdomain on pts/0 (Mon, 31 Oct 2011 01:11:55 -0400):
The system is going down for system halt NOW!
[ec2-user@localhost ~]$
Connection to ec2-184-72-158-77.compute-1.amazonaws.com closed by remote host.
Connection to ec2-184-72-158-77.compute-1.amazonaws.com closed.

You will notice that condor_q does not immediately reflect that the instance is terminated, even though ec2-describe-instances will. This is because Condor only polls EC2 for status changes every 5 minutes by default. The GRIDMANAGER_JOB_PROBE_INTERVAL configuration parameter is the control.
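If five minutes is too slow for your taste, the interval can be lowered; a sketch, assuming a value in seconds and a condor_reconfig on the submit machine afterwards,

GRIDMANAGER_JOB_PROBE_INTERVAL = 60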

In this case, the instance was shut down at Sun Oct 30 21:12:52 EDT 2011 and Condor noticed at 21:14:40,

$ tail -n11 EC2_Instance-ami-60bd4609.1710.log
005 (1710.000.000) 10/30 21:14:40 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...

Bonus, use periodic_hold or periodic_remove to cap how long an instance can run. Add periodic_hold = (time() - ShadowBday) >= 60 to the submit file and your instance will be terminated, by Condor, after 60 seconds.

$ tail -n6 EC2_Instance-ami-60bd4609.1713.log
001 (1713.000.000) 10/30 21:33:39 Job executing on host: ec2 https://ec2.amazonaws.com/
...
012 (1713.000.000) 10/30 21:37:54 Job was held.
	The job attribute PeriodicHold expression '( time() - ShadowBday ) >= 60' evaluated to TRUE
	Code 3 Subcode 0
...

The instance was not terminated at exactly 60 seconds because the PERIODIC_EXPR_INTERVAL configuration defaults to 300 seconds, just like the GRIDMANAGER_JOB_PROBE_INTERVAL.

Imagine keeping your EC2 instance inventory in Condor. Condor’s policy engine and extensible metadata for jobs automatically extend to instances running in EC2.

Getting Started: Submitting jobs to Condor

July 4, 2011

Read the condor_submit manual page for the full gory details.

Submitting jobs to Condor primarily happens through a file being passed to condor_submit. The file is properly called a submit description file. The submit file language is a simplified scripting language. The language outputs a ClassAd, called a job ad, that fully specifies the job — its inputs, requirements, policy, how it runs, its data, how its data is handled, credentials required, execution time, etc.

The submit file language consists of a set of commands, macro substitutions and implicit iteration. Nearly all commands in the language can be omitted, and mostly sane defaults will be provided. Conditionals can either be handled while writing the submit file (by a human or script), or deferred for evaluation during the scheduling, matching and execution of the job. condor_submit does not evaluate conditionals during ClassAd generation.

The most basic submit file is,

executable = myprogram
queue

It consists of just two commands: executable, which defines and performs some basic checks on the program to run; and, queue, which states that a job should be queued, based off the current job ad.

Based off the current job ad? Yes, a submit file is read top to bottom, each command helps to form the job ad that will be queued. For instance,

executable = myprogram
queue
arguments = a b c
queue

This submit file will queue two jobs, one that executes “myprogram” and one that executes “myprogram a b c”. It is pretty simple.

The basic commands you will care about 98% of the time are: executable, arguments, output, error, input, log, and, unfortunately, should_transfer_files and when_to_transfer_output.

executable you already know: it defines the executable to be run as part of the job. arguments is already clear: it defines the arguments given to the executable. output specifies the file where the executable’s stdout is written. error specifies the file where the executable’s stderr is written. input specifies the file from which the executable’s stdin is read. log specifies a file where condor writes information about the execution of the job, i.e. where/when it was submitted, where/when it ran, how it exited, etc. log is optional, like perl’s -w.

Sane values for should_transfer_files and when_to_transfer_output, though not actually the defaults in 7.6, are IF_NEEDED and ON_EXIT, respectively.
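By way of example, a sketch of a submit file using those basic commands; myprogram and the file names are placeholders,

executable = myprogram
arguments = a b c
input = myprogram.in
output = myprogram.out
error = myprogram.err
log = myprogram.log
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
queue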

Macro substitutions are also processed top to bottom and must be defined before the first queue command where they are used. Macro definitions look like commands, and commands can be treated as macros. Macro substitution happens with $().

mybaseargs = base
executable = myprogram
arguments = $(mybaseargs) a
queue
arguments = $(mybaseargs) $(executable) b
queue

This will submit two jobs executing “myprogram base a” and “myprogram base myprogram b”. “mybaseargs = base” defines the macro “mybaseargs” and allows for the substitution with “$(mybaseargs)”. executable is the familiar command, but also allows the substitution “$(executable)”.

executable = myprogram
arguments = $(mybaseargs) a
mybaseargs = base
queue

Also works, but the following will not.

executable = myprogram
arguments = $(mybaseargs) a
queue
mybaseargs = base

The macro must be defined before the queue command where it is used.

Implicit iteration is driven by the queue command. The queue command actually takes an argument, a number. This is the number of times the current job ad will be queued.

executable = myprogram
queue 2

This creates two jobs, both running “myprogram”. Boring.

Many copies of the same command would not be very useful were there no way to make each queued job slightly different. condor_submit maintains two special macros, think of them as loop indices, and queue performs the iteration. $(Process) and $(Cluster) are the indices. $(Process) is zero-based. It starts at 0 and proceeds to one minus the argument to queue. $(Cluster) is the next identifier available from the Schedd. It could be any 32 bit number, and may be incremented during the submission (mostly if you have multiple executable commands). $(Process) is generally the more helpful of the two. $(Cluster) is useful if you plan to submit the same file multiple times.

executable = myprogram
arguments = $(Process)
queue 2

This creates two jobs, “myprogram 0” and “myprogram 1”. Pulling it all together,

$ cat > example.sub
command = echo
executable = /bin/$(command)
arguments = $(Process)
log = $(command).log
output = $(command).$(Cluster).$(Process).out
error = $(command).$(Cluster).$(Process).out
queue 2
^D

$ condor_submit example.sub
Submitting job(s)..
2 job(s) submitted to cluster 88.

$ cat echo.88.0.out
0
$ cat echo.88.1.out
1
$ cat echo.log
000 (088.000.000) 07/04 08:00:80 Job submitted from host: 
...
000 (088.001.000) 07/04 08:00:80 Job submitted from host: 
...
001 (088.000.000) 07/04 08:00:80 Job executing on host: 
...
001 (088.001.000) 07/04 08:00:80 Job executing on host: 
...
005 (088.000.000) 07/04 08:00:80 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...
005 (088.001.000) 07/04 08:00:80 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	0  -  Run Bytes Sent By Job
	0  -  Run Bytes Received By Job
	0  -  Total Bytes Sent By Job
	0  -  Total Bytes Received By Job
...

One more advanced feature is the ability to insert attributes directly into the job ad. Remember, each command performs some action and incrementally constructs the job ad. The executable command does some file checks and then inserts the Cmd attribute into the job ad.

$ condor_q -format "Cmd = %s\n" Cmd 88.0
Cmd = /bin/echo

The direct insertion works with a +. It starts to demonstrate the descriptive power underlying Condor, and also requires some knowledge of the underlying ClassAd language. For instance, jobs can be arbitrarily attributed, even with attributes Condor will never use directly. The types of the values are important; most notably, strings must be quoted.

executable = myprogram
+Submission = "Submission A"
queue
+Submission = "Submission B"
queue 2

This submits three jobs, all with an attribute named Submission, two with a value “Submission B” and one with “Submission A”.

$ condor_q -format "%d " ProcId -format "%s\n" Submission
0 Submission A
1 Submission B
2 Submission B

$ condor_q -constraint 'Submission == "Submission A"'
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  89.0   matt            7/04 08:08   0+00:00:00 I  0   0.0  myprogram

Notice, the $(Process) macro actually corresponds to the ProcId attribute on the job ad.

Concurrency Limits: Protecting shared resources

June 27, 2011

Concurrency Limits, sometimes called resource limits, are Condor's way of giving administrators and users a tool to protect limited resources.

A popular resource to protect is a software license. Take for example jobs that run Matlab. Matlab uses flexlm, and users often have a limited number of licenses available, effectively limiting how many jobs they can run concurrently. Condor does not, and does not need to, integrate with flexlm here. Condor lets a user specify concurrency_limits = matlab with their job and lets administrators add MATLAB_LIMIT = 64 to configuration.
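Concretely, that splits into one line in the job’s submit file and one line in the pool’s (negotiator’s) configuration; a sketch using the values above,

# negotiator configuration
MATLAB_LIMIT = 64

# job submit file
concurrency_limits = matlab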

Other uses include limiting the number of jobs connecting to a network filesystem filer, limiting the number of jobs a user can be running, limiting the number of jobs running in a submission, and really anything else that can be managed at a global pool level. I have also heard of people using them to limit database connections and implement a global pool load share.

The global aspect of these resources is important. Concurrency limits are not local to nodes, e.g. for GPU management. Limits are managed by the Negotiator. They work because jobs contain a list of their limits and slot advertisements contain a list of active limits. During the negotiation cycle, the negotiator can sum up the active limits and compare with the configured maximum and what a job is requesting.

Also, limits are not considered in preemption decisions. Changes to limits on a running job, via qedit, will not impact the job until it stops. This means a job cannot give up a limit it no longer needs when it exits a certain phase of execution – consider DAGs here. And, lowering a limit via configuration will not result in job preemption.

By example,

First the configuration needs to be on the Negotiator, e.g.

$ condor_config_val -dump | grep LIMIT
CONCURRENCY_LIMIT_DEFAULT = 3
SLEEP_LIMIT = 1
AWAKE_LIMIT = 2

This says that there can be a maximum of 1 job using the SLEEP resource at a time. This is across all users and all accounting groups.

$ cat > limits.sub
cmd = /bin/sleep
args = 1d
concurrency_limits = sleep
queue
^D

$ condor_submit -a 'queue 4' limits.sub
Submitting job(s)....
4 job(s) submitted to cluster 41.

$ condor_q
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  41.0   matt            7/04 12:21   0+00:55:55 R  0   4.2  sleep 1d
  41.1   matt            7/04 12:21   0+00:00:00 I  0   0.0  sleep 1d
  41.2   matt            7/04 12:21   0+00:00:00 I  0   0.0  sleep 1d
  41.3   matt            7/04 12:21   0+00:00:00 I  0   0.0  sleep 1d
4 jobs; 3 idle, 1 running, 0 held

(A) $ condor_q -format "%s " GlobalJobId -format "%s " ConcurrencyLimits -format "%s" LastRejMatchReason -format "\n" None
eeyore.local#41.0 sleep
eeyore.local#41.1 sleep concurrency limit reached
eeyore.local#41.2 sleep
eeyore.local#41.3 sleep

(B) $ condor_status -format "%s " Name -format "%s " GlobalJobId -format "%s" ConcurrencyLimits -format "\n" None
slot1@eeyore.local eeyore.local#41.0 sleep
slot2@eeyore.local
slot3@eeyore.local
slot4@eeyore.local

(A) shows each job wants to use the sleep limit. It also shows that job 41.1 did not match because its concurrency limits were reached. (B) shows that only 41.0 got to run, on slot1. Notice, the limit is present on the slot’s ad.

The Negotiator can also be asked about active limits directly,

$ condor_userprio -l | grep ConcurrencyLimit
ConcurrencyLimit_sleep = 1.000000

That’s well and good, but there are three more things to know about: 0) the default maximum, 1) multiple limits, 2) duplicate limits.

First, the default maximum, CONCURRENCY_LIMIT_DEFAULT, applies to any limit that is not explicitly named in configuration, as SLEEP was.

$ condor_submit -a 'concurrency_limits = biff' -a 'queue 4' limits.sub
Submitting job(s)....
4 job(s) submitted to cluster 42.

$ condor_rm 41
Cluster 41 has been marked for removal.

$ condor_q
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  42.0   matt            7/04 12:34   0+00:00:22 R  0   0.0  sleep 1d
  42.1   matt            7/04 12:34   0+00:00:22 R  0   0.0  sleep 1d
  42.2   matt            7/04 12:34   0+00:00:22 R  0   0.0  sleep 1d
  42.3   matt            7/04 12:34   0+00:00:00 I  0   0.0  sleep 1d
8 jobs; 4 idle, 4 running, 0 held

$ condor_q -format "%s " GlobalJobId -format "%s " ConcurrencyLimits -format "%s" LastRejMatchReason -format "\n" None
eeyore.local#42.0 biff
eeyore.local#42.1 biff
eeyore.local#42.2 biff
eeyore.local#42.3 biff concurrency limit reached

Second, a job can require multiple limits at the same time. The job will need to consume each limit to run, and the most restricted limit will dictate if the job runs.

$ condor_rm -a
All jobs marked for removal.

$ condor_submit -a 'concurrency_limits = sleep,awake' -a 'queue 4' limits.sub
Submitting job(s)....
4 job(s) submitted to cluster 43.

$ condor_q
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  43.0   matt            7/04 13:07   0+00:00:13 R  0   0.0  sleep 1d
  43.1   matt            7/04 13:07   0+00:00:00 I  0   0.0  sleep 1d
  43.2   matt            7/04 13:07   0+00:00:00 I  0   0.0  sleep 1d
  43.3   matt            7/04 13:07   0+00:00:00 I  0   0.0  sleep 1d
4 jobs; 3 idle, 1 running, 0 held

$ condor_q -format "%s " GlobalJobId -format "%s " ConcurrencyLimits -format "%s" LastRejMatchReason -format "\n" None
eeyore.local#43.0 awake,sleep
eeyore.local#43.1 awake,sleep concurrency limit reached
eeyore.local#43.2 awake,sleep
eeyore.local#43.3 awake,sleep

$ condor_status -format "%s " Name -format "%s " GlobalJobId -format "%s" ConcurrencyLimits -format "\n" None
slot1@eeyore.local eeyore.local#43.0 awake,sleep
slot2@eeyore.local
slot3@eeyore.local
slot4@eeyore.local

Only one job gets to run because even though there are two awake limits available, there is only one sleep available.

Finally, a job can require more than one of the same limit. In fact, the requirement can be fractional.

$ condor_rm -a
All jobs marked for removal.

$ condor_submit -a 'concurrency_limits = sleep:2.0' -a 'queue 4' limits.sub
Submitting job(s)....
4 job(s) submitted to cluster 44.

$ condor_submit -a 'concurrency_limits = awake:2.0' -a 'queue 4' limits.sub
Submitting job(s)....
4 job(s) submitted to cluster 45.

$ condor_q
-- Submitter: eeyore.local :  : eeyore.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  44.0   matt            7/04 13:11   0+00:00:00 I  0   0.0  sleep 1d
  44.1   matt            7/04 13:11   0+00:00:00 I  0   0.0  sleep 1d
  44.2   matt            7/04 13:11   0+00:00:00 I  0   0.0  sleep 1d
  44.3   matt            7/04 13:11   0+00:00:00 I  0   0.0  sleep 1d
  45.0   matt            7/04 13:13   0+00:00:24 R  0   0.0  sleep 1d
  45.1   matt            7/04 13:13   0+00:00:00 I  0   0.0  sleep 1d
  45.2   matt            7/04 13:13   0+00:00:00 I  0   0.0  sleep 1d
  45.3   matt            7/04 13:13   0+00:00:00 I  0   0.0  sleep 1d
8 jobs; 7 idle, 1 running, 0 held


$ condor_q -format "%s " GlobalJobId -format "%s " ConcurrencyLimits -format "%s" LastRejMatchReason -format "\n" None
eeyore.local#44.0 sleep:2.0 concurrency limit reached
eeyore.local#44.1 sleep:2.0
eeyore.local#44.2 sleep:2.0
eeyore.local#44.3 sleep:2.0
eeyore.local#45.0 awake:2.0
eeyore.local#45.1 awake:2.0 concurrency limit reached
eeyore.local#45.2 awake:2.0
eeyore.local#45.3 awake:2.0

$ condor_userprio -l | grep Limit
ConcurrencyLimit_awake = 2.000000
ConcurrencyLimit_sleep = 0.0

Here none of the jobs in cluster 44 will run; they each need more SLEEP than is available. Also, only one of the jobs in cluster 45 can run at a time, because each one uses up all the AWAKE when it runs.

Getting Started: Multiple node Condor pool with firewalls

June 21, 2011

Creating a Condor pool with no firewalls up is quite a simple task. Before the condor_shared_port daemon, doing the same with firewalls was a bit painful.

Condor uses dynamic ports for everything except the Collector. The Collector endpoint is the bootstrap. This means a Schedd might start up on a random ephemeral port, and each of its shadows might as well. This causes headaches for firewalls, as large ranges of ports need to be opened for communication. There are ways to control the ephemeral range used. Unfortunately, doing so just reduces the port range somewhat, does not guarantee Condor is on the ports, and can limit scale.
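For reference, the ephemeral range controls are the LOWPORT and HIGHPORT knobs; a sketch with placeholder values, and again this only narrows the range rather than solving the problem,

LOWPORT = 9600
HIGHPORT = 9700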

The condor_shared_port daemon allows Condor to use a single inbound port on a machine.

Again, using Fedora 15. I had no luck with firewalld and firewall-cmd. Instead I fell back to using straight iptables.

The first thing to do is pick a port for Condor to use on your machines. The simplest choice is 9618, the port typically known as the Collector’s port.

On all machines where Condor is going to run, you want to -

# lokkit --enabled

# service iptables start
Starting iptables (via systemctl):  [  OK  ]

# service iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination
1    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
2    ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
3    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
4    REJECT     all  --  0.0.0.0/0            0.0.0.0/0 reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination
1    REJECT     all  --  0.0.0.0/0            0.0.0.0/0 reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination

If you want to ssh to the machine again, be sure to insert rules above the “REJECT ALL — …” -

# iptables -I INPUT 4 -p tcp -m tcp --dport 22 -j ACCEPT

And open a port, both TCP and UDP, for the shared port daemon -

# iptables -I INPUT 5 -p tcp -m tcp --dport condor -j ACCEPT
# iptables -I INPUT 6 -p udp -m udp --dport condor -j ACCEPT

Next you want to configure Condor to use the shared port daemon, with port 9618 -

# cat > /etc/condor/config.d/41shared_port.config
SHARED_PORT_ARGS = -p 9618
DAEMON_LIST = $(DAEMON_LIST), SHARED_PORT
COLLECTOR_HOST = $(CONDOR_HOST)?sock=collector
USE_SHARED_PORT = TRUE
^D

In order, SHARED_PORT_ARGS tells the shared port daemon to listen on port 9618, DAEMON_LIST tells the master to start the shared port daemon, COLLECTOR_HOST specifies that the collector will be on the sock named “collector”, and finally USE_SHARED_PORT tells all daemons to register and use the shared port daemon.

After you put that configuration on all your systems, run service condor restart, and go.

You will have the shared port daemon listening on 9618 (condor), and all communication between machines will be routed through it.

# lsof -i | grep $(pidof condor_shared_port)
condor_sh 31040  condor    8u  IPv4  74105      0t0  TCP *:condor (LISTEN)
condor_sh 31040  condor    9u  IPv4  74106      0t0  UDP *:condor

That’s right, you have a condor pool with firewalls and a single port opened for communication on each node.

