Getting started: Condor and EC2 – Importing instances with condor_ec2_link

Starting and managing instances describes Condor's powerful ability to start and manage EC2 instances, but what if you are already using something other than Condor, such as the AWS Management Console, to start your instances?

Importing instances turns out to be straightforward once you know how instances are started. In a nutshell, the condor_gridmanager executes a state machine and records its current state in a job attribute named GridJobId. To import an instance, submit a job that is already in the state where an instance id has been assigned. Take a submit file and add +GridJobId = "ec2 https://ec2.amazonaws.com/ BOGUS INSTANCE-ID", where INSTANCE-ID is the actual identifier of the instance you want to import. For instance,

...
ec2_access_key_id = ...
ec2_secret_access_key = ...
...
+GridJobId = "ec2 https://ec2.amazonaws.com/ BOGUS i-319c3652"
queue
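
If you import instances often, the GridJobId line can be generated rather than typed. A minimal sketch follows; make_grid_job_id is a hypothetical helper, not part of Condor, and the default endpoint is simply the one used in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Condor): print the +GridJobId
# submit-file line for an instance that is already running.
make_grid_job_id() {
    instance_id=$1
    # Default endpoint copied from the example above; pass a second
    # argument if your service URL differs.
    service_url=${2:-https://ec2.amazonaws.com/}
    printf '+GridJobId = "ec2 %s BOGUS %s"\n' "$service_url" "$instance_id"
}
```

Calling make_grid_job_id i-319c3652 prints the same +GridJobId line shown in the submit file above, ready to be appended before the queue statement.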

It is important to get the ec2_access_key_id and ec2_secret_access_key correct. Without them Condor will not be able to communicate with EC2 and EC2_GAHP_LOG will report,

$ tail -n2 $(condor_config_val EC2_GAHP_LOG)
11/11/11 11:11:11 Failure response text was '
AuthFailureAWS was not able to validate the provided access credentialsab50f005-6d77-4653-9cec-298b2d475f6e'.

This error is not reported back to the job, which would put it on hold. Instead, the gridmanager will think EC2 is down for the job. Oops.

$ grep down $(condor_config_val GRIDMANAGER_LOG)
11/11/11 11:11:11 [10697] resource https://ec2.amazonaws.com is now down
11/11/11 11:14:22 [10697] resource https://ec2.amazonaws.com is still down
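
Since the job itself never shows the credential problem, it can help to check the GAHP log directly when a resource is reported as down. The sketch below assumes the failure always contains the string AuthFailure, as in the log excerpt above; has_auth_failure is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given log file contains
# an AuthFailure response from EC2, fail otherwise.
has_auth_failure() {
    grep -q 'AuthFailure' "$1"
}

# Usage, assuming condor_config_val is on your PATH:
#   has_auth_failure "$(condor_config_val EC2_GAHP_LOG)" &&
#       echo "EC2 rejected the credentials; check the key files"
```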

To simplify the import, here is a script that will use ec2-describe-instances to get useful metadata about the instance and populate a submit file for you,

condor_ec2_link

#!/bin/sh

# Provide three arguments:
#  . instance id to link
#  . path to file with access key id
#  . path to file with secret access key

# TODO:
#  . Get EC2UserData (ec2-describe-instance-attribute --user-data)

ec2-describe-instances --show-empty-fields "$1" | \
   awk '/^INSTANCE/ {id=$2; ami=$3; keypair=$7; type=$10; zone=$12; ip=$17; group=$29}
        /^TAG/ {name=$5}
        END {print "universe = grid\n",
                   "grid_resource = ec2 https://ec2.amazonaws.com\n",
                   "executable =", ami"-"name, "\n",
                   "log = $(executable).$(cluster).log\n",
                   "ec2_ami_id =", ami, "\n",
                   "ec2_instance_type =", type, "\n",
                   "ec2_keypair_file = name-"keypair, "\n",
                   "ec2_security_groups =", group, "\n",
                   "ec2_availability_zone =", zone, "\n",
                   "ec2_elastic_ip =", ip, "\n",
                   "+EC2InstanceName = \""id"\"\n",
                   "+GridJobId = \"$(grid_resource) BOGUS", id, "\"\n",
                   "queue\n"}' | \
      condor_submit -a "ec2_access_key_id = $2" \
                    -a "ec2_secret_access_key = $3"
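
The script above trusts its arguments, and a wrong key path leads straight to the silent "resource is down" behavior described earlier. A small check could run first. This is a sketch only: check_link_args is a hypothetical function, and the i- prefix test is an assumption based on the instance id used in this post.

```shell
#!/bin/sh
# Hypothetical pre-flight check for condor_ec2_link's three arguments.
# Returns 0 when they look usable, non-zero otherwise.
check_link_args() {
    if [ $# -ne 3 ]; then
        echo "usage: condor_ec2_link INSTANCE-ID ACCESS-KEY-FILE SECRET-KEY-FILE" >&2
        return 2
    fi
    # Assumption: instance ids start with "i-", as in i-319c3652.
    case $1 in
        i-?*) ;;
        *) echo "not an instance id: $1" >&2; return 1 ;;
    esac
    # Both key arguments are paths to files, so they must be readable.
    for f in "$2" "$3"; do
        [ -r "$f" ] || { echo "cannot read key file: $f" >&2; return 1; }
    done
    return 0
}
```

Placed near the top of the script, it could be called as check_link_args "$@" || exit 1 before anything is submitted.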

In action,

$ ./condor_ec2_link i-319c3652 /home/matt/Documents/AWS/Cert/AccessKeyID /home/matt/Documents/AWS/Cert/SecretAccessKey
Submitting job(s).
1 job(s) submitted to cluster 1739.

$ ./condor_ec2_q 1739
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
1739.0   matt           11/11 11:11   0+00:00:00 I  0   0.0 ami-e1f53a88-TheNa
  Instance name: i-319c3652
  Groups: sg-4f706226
  Keypair file: /home/matt/Documents/AWS/name-TheKeyPair
  AMI id: ami-e1f53a88
  Instance type: t1.micro
1 jobs; 1 idle, 0 running, 0 held

(20 seconds later)

$ ./condor_ec2_q 1739
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
1739.0   matt           11/11 11:11   0+00:00:01 R  0   0.0 ami-e1f53a88-TheNa
  Instance name: i-319c3652
  Hostname: ec2-50-17-104-50.compute-1.amazonaws.com
  Groups: sg-4f706226
  Keypair file: /home/matt/Documents/AWS/name-TheKeyPair
  AMI id: ami-e1f53a88
  Instance type: t1.micro
1 jobs; 0 idle, 1 running, 0 held

There are a few things that can be improved here, the most notable of which is the RUN_TIME. The gridmanager gets status data from EC2 periodically, which is how the EC2RemoteVirtualMachineName (Hostname) gets populated on the job. The instance's launch time is also available, yet RUN_TIME starts counting when the job enters the running state rather than from the launch. Oops.
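
To see what RUN_TIME would look like if it were measured from the launch, the arithmetic can be sketched in a few lines. This assumes you already have the launch time as seconds since the epoch (how you obtain it is left out here); format_run_time and run_time_since are hypothetical names, and the D+HH:MM:SS layout mirrors the RUN_TIME column above.

```shell
#!/bin/sh
# Hypothetical helper: format a duration in seconds the way the RUN_TIME
# column does, i.e. days+HH:MM:SS.
format_run_time() {
    secs=$1
    printf '%d+%02d:%02d:%02d\n' \
        $((secs / 86400)) \
        $((secs % 86400 / 3600)) \
        $((secs % 3600 / 60)) \
        $((secs % 60))
}

# Hypothetical helper: run time between a launch time and "now",
# both given as seconds since the epoch.
run_time_since() {
    format_run_time $(( $2 - $1 ))
}
```

For example, run_time_since "$launch_epoch" "$(date +%s)" would give the time the instance has actually been up, where launch_epoch is a variable you set yourself.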
