Starting and managing instances describes Condor's powerful ability to start and manage EC2 instances, but what if you are already using something other than Condor to start your instances, such as the AWS Management Console?
Importing instances turns out to be straightforward, if you know how instances are started. In a nutshell, the condor_gridmanager executes a state machine and records its current state in a job attribute named GridJobId. To import an instance, submit a job that is already in the state where an instance id has been assigned: take a submit file and add +GridJobId = "ec2 https://ec2.amazonaws.com/ BOGUS INSTANCE-ID", where INSTANCE-ID is the actual identifier of the instance you want to import. For instance,
...
ec2_access_key_id = ...
ec2_secret_access_key = ...
...
+GridJobId = "ec2 https://ec2.amazonaws.com/ BOGUS i-319c3652"
queue
It is important to get the ec2_access_key_id and ec2_secret_access_key correct. Without them, Condor will not be able to communicate with EC2, and the EC2_GAHP_LOG will report:
$ tail -n2 $(condor_config_val EC2_GAHP_LOG)
11/11/11 11:11:11 Failure response text was '
AuthFailure
AWS was not able to validate the provided access credentials
ab50f005-6d77-4653-9cec-298b2d475f6e'.
This error is not reported back into the job, which would put it on hold; instead, the gridmanager decides that EC2 is down for the job. Oops.
$ grep down $(condor_config_val GRIDMANAGER_LOG)
11/11/11 11:11:11 [10697] resource https://ec2.amazonaws.com is now down
11/11/11 11:14:22 [10697] resource https://ec2.amazonaws.com is still down
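Because a bad credential only surfaces in the gridmanager's log, a small pre-flight check before running condor_submit can save a debugging round trip. The sketch below is hypothetical (check_cred_file is not part of Condor): it merely verifies that a credential file exists and is non-empty, since Condor does no such validation at submit time.

```shell
#!/bin/sh
# check_cred_file FILE
# A hypothetical pre-flight helper: fail unless FILE exists and is
# non-empty. Without a check like this, a missing or empty key file
# only shows up later as "resource ... is now down" in GRIDMANAGER_LOG.
check_cred_file() {
    [ -s "$1" ] || { echo "missing or empty credential file: $1" >&2; return 1; }
}

# Example use before submitting:
# check_cred_file /home/matt/Documents/AWS/Cert/AccessKeyID || exit 1
# check_cred_file /home/matt/Documents/AWS/Cert/SecretAccessKey || exit 1
```

Note this only catches missing or empty files; a present-but-wrong key still fails with the AuthFailure shown above.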
To simplify the import, here is a script that uses ec2-describe-instances to gather useful metadata about the instance and populate a submit file for you:
condor_ec2_link
#!/bin/sh
# Provide three arguments:
# . instance id to link
# . path to file with access key id
# . path to file with secret access key
# TODO:
# . Get EC2UserData (ec2-describe-instance-attribute --user-data)
if [ $# -ne 3 ]; then
    echo "usage: $0 instance-id access-key-id-file secret-access-key-file" >&2
    exit 1
fi
ec2-describe-instances --show-empty-fields "$1" | \
awk '/^INSTANCE/ {id=$2; ami=$3; keypair=$7; type=$10; zone=$12; ip=$17; group=$29}
/^TAG/ {name=$5}
END {print "universe = grid\n",
"grid_resource = ec2 https://ec2.amazonaws.com\n",
"executable =", ami"-"name, "\n",
"log = $(executable).$(cluster).log\n",
"ec2_ami_id =", ami, "\n",
"ec2_instance_type =", type, "\n",
"ec2_keypair_file = name-"keypair, "\n",
"ec2_security_groups =", group, "\n",
"ec2_availability_zone =", zone, "\n",
"ec2_elastic_ip =", ip, "\n",
"+EC2InstanceName = \""id"\"\n",
"+GridJobId = \"$(grid_resource) BOGUS", id, "\"\n",
"queue\n"}' | \
condor_submit -a "ec2_access_key_id = $2" \
-a "ec2_secret_access_key = $3"
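To see what the awk stage generates without touching EC2 or condor_submit, you can feed it fabricated records. This sketch pads a tab-separated INSTANCE line out to the field positions the script reads and prints a trimmed version of the END block's output; the values come from the example session in this post, except the availability zone, which is a made-up placeholder.

```shell
#!/bin/sh
# Dry run of the awk extraction stage: fabricate INSTANCE and TAG
# records shaped like `ec2-describe-instances --show-empty-fields`
# output, then print a few generated submit-file lines instead of
# submitting. Values match the example session in this post, except
# us-east-1c, which is a made-up placeholder zone.
gen_submit() {
  {
    printf 'INSTANCE\ti-319c3652\tami-e1f53a88'
    # fields 4-17: keypair is field 7, type field 10, zone 12, ip 17
    printf '\t%s' x x x TheKeyPair x x t1.micro x us-east-1c x x x x 50.17.104.50
    # fields 18-29: security group is field 29
    printf '\t%s' x x x x x x x x x x x sg-4f706226
    printf '\n'
    printf 'TAG\tinstance\ti-319c3652\tName\tTheName\n'
  } | awk '/^INSTANCE/ {id=$2; ami=$3; keypair=$7; type=$10; zone=$12; ip=$17; group=$29}
           /^TAG/ {name=$5}
           END {print "ec2_ami_id =", ami
                print "ec2_instance_type =", type
                print "+EC2InstanceName = \""id"\""}'
}

gen_submit
```

Running it prints ec2_ami_id = ami-e1f53a88, ec2_instance_type = t1.micro, and +EC2InstanceName = "i-319c3652", confirming the field positions before you trust the script with real credentials.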
In action,
$ ./condor_ec2_link i-319c3652 /home/matt/Documents/AWS/Cert/AccessKeyID /home/matt/Documents/AWS/Cert/SecretAccessKey
Submitting job(s).
1 job(s) submitted to cluster 1739.
$ ./condor_ec2_q 1739
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
1739.0 matt 11/11 11:11 0+00:00:00 I 0 0.0 ami-e1f53a88-TheNa
Instance name: i-319c3652
Groups: sg-4f706226
Keypair file: /home/matt/Documents/AWS/name-TheKeyPair
AMI id: ami-e1f53a88
Instance type: t1.micro
1 jobs; 1 idle, 0 running, 0 held
(20 seconds later)
$ ./condor_ec2_q 1739
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
1739.0 matt 11/11 11:11 0+00:00:01 R 0 0.0 ami-e1f53a88-TheNa
Instance name: i-319c3652
Hostname: ec2-50-17-104-50.compute-1.amazonaws.com
Groups: sg-4f706226
Keypair file: /home/matt/Documents/AWS/name-TheKeyPair
AMI id: ami-e1f53a88
Instance type: t1.micro
1 jobs; 0 idle, 1 running, 0 held
There are a few things that can be improved here, the most notable being the RUN_TIME. The gridmanager periodically retrieves status data from EC2; that is how EC2RemoteVirtualMachineName (the Hostname above) gets populated on the job. The instance's launch time is also available, but it is not used. Oops.