Archive for September, 2012

Advanced scheduling: Execute in the future with job deferral

September 24, 2012

One advanced scheduling feature of Condor is the ability to set a time, in the future, when a job should be run. This is called a deferral time.

Using the deferral_time command, you simply specify a time, in seconds since EPOCH, when your job should run:

executable = /bin/date
log = deferral.log
output = deferral.out
error = deferral.err

deferral_time = 1357016400

queue

Use date(1) to generate the deferral_time.

$ date -d @1357016400
Tue Jan  1 00:00:00 EST 2013
$ date +%s -d "2013-01-01 00:00:00"
1357016400

After submitting the job and waiting until 1 Jan 2013, you can see the result by looking in deferral.log and deferral.out.

$ grep ^00 deferral.log
000 (001.000.000) 08/15 22:33:00 Job submitted from host: <127.0.0.1:56006>
001 (001.000.000) 01/01 00:00:00 Job executing on host: <127.0.0.2:57590>
006 (001.000.000) 01/01 00:00:00 Image size of job updated: 75
005 (001.000.000) 01/01 00:00:00 Job terminated.

$ cat deferral.out
Tue Jan  1 00:00:00 EST 2013

Of course there is no guarantee that a resource will be available at a precise time in the future. A job that does not run at its deferral_time will be put on Hold for manual intervention.

To reduce the likelihood of missing the deferral_time and needing manual intervention, the deferral_prep_time and deferral_window commands are available. Respectively, they specify the amount of time before the deferral_time that the job can be matched with a resource and how long after the deferral_time execution is acceptable.

executable = /bin/date
log = deferral.log
output = deferral.out
error = deferral.err

deferral_time = 1357016400

# 1 day = 24 hour * 60 min * 60 sec = 86,400 seconds
# 1/2 day = 86,400 sec / 2 = 43,200 seconds
deferral_prep_time = 86400
deferral_window = 43200

queue

In the example above, the job may be matched to a resource, where it will keep the resource Claimed/Busy for up to a day (deferral_prep_time) in advance of its actual run. This will make it more likely that the job will run at precisely the deferral_time. It also means that for accounting purposes, you will be charged for using the resource, though the job has not yet run.

Additionally, if the job is not matched or otherwise does not start at precisely deferral_time, it has half a day (deferral_window) to run before it is put on hold for manual intervention.

That’s it.

MATLAB jobs and Condor

September 17, 2012

The question of how to run MATLAB jobs with Condor comes up from time to time and there is no central location for the knowledge.

If you want to use MATLAB with Condor,

Read http://sysadminhelp.cms.caltech.edu/AOK/HTC/, it provides an informative overview and detailed example, including how to use mcc to create a standalone executable.

Or read http://www.liv.ac.uk/csd/escience/condor/matlab/

Or read http://www.lehigh.edu/computing/hpc/running/condor_matlab.html

Possibly read https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToRunMatlab

Likely just enter “MATLAB condor” into your favorite search engine and you will find useful results.

The Owner state

September 11, 2012

What is the Owner state? Why are my slots in the Owner state? Why do jobs not start immediately after I restart Condor? How do I keep slots from going into the Owner state?

$ condor_status
Name               OpSys      Arch   *State*  Activity LoadAv Mem   ActvtyTime
eeyore.local       LINUX      X86_64 *Owner*  Idle     0.480  7783  0+00:00:04
                     Machines Owner Claimed Unclaimed Matched Preempting
        X86_64/LINUX        1     1       0         0       0          0
               Total        1     1       0         0       0          0

All these questions require understanding a bit of Condor policy. Specifically, the START policy on execute nodes.

In Condor, the decision to run a job on a resource, a.k.a. a slot, is made by both the job and the resource. The job specifies the requirements it has for a resource, e.g. Memory > 1024. And, the resource specifies the requirements it has for jobs, e.g. MemoryUsage < Memory. If both do not agree, the job will not be run on the resource.

First, the Owner state is an indication that the resource is not available to run jobs. The resource is in use by its Owner. Historically, this has been someone at the resource’s console. Generally, think of Owner meaning Unavailable.

Second, the resource specifies its requirements using the START configuration parameter.

$ condor_config_val -v START
START: ( (KeyboardIdle > 15 * 60) && ( ((LoadAvg - CondorLoadAvg) <= 0.3) || (State != "Unclaimed" && State != "Owner")) )
  Defined in '/etc/condor/condor_config', line 754.

When the START policy evaluates to False, i.e. starting jobs is not allowed, the resource will appear in the Owner state.

Note that the default START policy does not care about aspects of the jobs that might run. It cares entirely about aspects of the resource itself. It is designed to protect a user at the console (KeyboardIdle) and other processes already running on the machine (LoadAvg – CondorLoadAvg).

Third, jobs may not start immediately on restart because the KeyboardIdle timer has been reset to 0. This means waiting 15 minutes for the START policy, specifically KeyboardIdle > 15 * 60, to evaluate to True and the resource to become available.

Finally, on dedicated resources, which are very common, the KeyboardIdle component of the START policy can be removed. In fact, it is quite common to simply set START = TRUE.

Maintaining tight accounting group quota usage with preemption

September 4, 2012

Many metrics in a distributed system should be viewed over time. However, desires are often for instantaneous views. Take for example usage data.

Usage data in Condor is commonly viewed through accounting groups. Accounting groups can be given quotas that specify what percent of a pool each can use.

Instead of viewing accounting group usage summed over a week or a day or even hours, it is common to want any snapshot of usage to match configured quotas. At the same time, it is common to want accounting groups to exceed their quota if resources are available. These two desires are at odds.

Erik Erlandson has written about how to use preemption to maintain tight accounting group quota usage.

Instantaneous views should not be relied on, but with preemption it is possible to reduce the window you sum over.

Now, if preemption is not an option, the desire to see usage snapshots matching configured quotas must be abandoned.


%d bloggers like this: