Archive for the ‘Scheduling’ Category

Social scheduling

November 26, 2012

As a thought experiment.

There are always multiple users and limited resources. Users have work, which takes time and resources to complete.

The top resource users are visible to all.

A user can relinquish resources she is using.

A resource that is relinquished, whether because work completed or by user action, is reassigned randomly.

How would this not work?

How would you refine it?
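One way to poke at these questions is a toy simulation of the rules above. This is only a sketch under assumptions of my own: the `SocialPool` class and its methods are hypothetical names, every user is assumed to always have pending work, and "randomly" is taken to mean uniformly over all users, including the one who just let go.

```python
import random
from collections import Counter


class SocialPool:
    """Toy model: a fixed set of resource units, each held by exactly one user."""

    def __init__(self, units, users, seed=None):
        self.users = list(users)
        self.rng = random.Random(seed)
        # Assumption: every unit starts out randomly assigned.
        self.holder = {unit: self.rng.choice(self.users) for unit in range(units)}

    def top_users(self, n=3):
        """The top resource users, visible to all, as (user, units held) pairs."""
        return Counter(self.holder.values()).most_common(n)

    def relinquish(self, user, count=1):
        """Release up to `count` units held by `user`; each is reassigned randomly."""
        held = [unit for unit, owner in self.holder.items() if owner == user]
        released = held[:count]
        for unit in released:
            # Assumption: uniform choice over all users, the releaser included.
            self.holder[unit] = self.rng.choice(self.users)
        return len(released)


pool = SocialPool(units=10, users=["ann", "bob", "cat"], seed=1)
print(pool.top_users())
top_user, _ = pool.top_users(1)[0]
pool.relinquish(top_user, count=2)
print(pool.top_users())
```

Even this much makes one gap visible: nothing in the rules pushes the top user to relinquish anything, so any refinement probably starts with what visibility is supposed to cause.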

No longer thinking in slots, thinking in aggregate resources and consumption policies

November 13, 2012

The slot model would have been natural when a machine housed a single core. Yet the slot model did not exist when a machine housed a single core.

When machines were single core the model was a machine, represented as a MachineAd. A MachineAd had an associated CPU, some nominal amount of RAM and some chunk of disk space. Running a job meant consuming a machine.

When machines grew multiple cores the machine model was split. A single machine became several independent MachineAds, called virtual machines. However, the name didn’t stick, because virtual machine became a popular term in hardware virtualization. So a machine became independent MachineAds, called slots. The unifying entity, the machine itself, was lost. Running a job still meant consuming a slot.

Most recently, slots split into two classes: static and partitionable. Static slots are the slots formerly known as virtual machines. Partitionable slots are a representation of the physical machine itself, and are carved up on demand to service jobs. Both types are still MachineAds, but the consumption of partitionable slots is dynamic.
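To make the carving concrete, here is a minimal sketch of a partitionable slot. It is an illustration only, not how any scheduler implements it: the `PartitionableSlot` class, its field names, and the `carve` and `release` methods are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionableSlot:
    """Hypothetical stand-in for a whole machine that is carved up on demand."""
    cpus: int
    memory_mb: int
    disk_mb: int
    dynamic: list = field(default_factory=list)  # slots carved out so far

    def carve(self, cpus, memory_mb, disk_mb):
        """Carve a slot out of what remains, or return None if it does not fit."""
        if cpus > self.cpus or memory_mb > self.memory_mb or disk_mb > self.disk_mb:
            return None
        self.cpus -= cpus
        self.memory_mb -= memory_mb
        self.disk_mb -= disk_mb
        slot = {"cpus": cpus, "memory_mb": memory_mb, "disk_mb": disk_mb}
        self.dynamic.append(slot)
        return slot

    def release(self, slot):
        """Return a carved slot's resources to the partitionable slot."""
        self.dynamic.remove(slot)
        self.cpus += slot["cpus"]
        self.memory_mb += slot["memory_mb"]
        self.disk_mb += slot["disk_mb"]


machine = PartitionableSlot(cpus=8, memory_mb=16384, disk_mb=100000)
job_slot = machine.carve(cpus=2, memory_mb=4096, disk_mb=1000)
print(machine.cpus, machine.memory_mb)  # what is left for other jobs
machine.release(job_slot)
```

A static slot, by contrast, would be the same record with its sizes fixed up front and no `carve` step.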

The slot model has demonstrated great utility but has been stretched.

In this time, workloads have also changed. They have become more memory bound, disk IO bound, and network bound. They have started relying on specialized hardware and even application-level services. They have started both spanning and packing into cores. They have grown complex data dependencies, become very short running, and become infrastructure-level long running.

Machines have also grown to include scores of cores, hundreds of gigabytes of RAM, dozens of terabytes of disk, specialized hardware such as GPUs, co-processors, entropy keys, high speed interconnects and a bevy of other attached devices.

Machines are lumpy. Heterogeneous now means more than operating system and CPU architecture.

Furthermore, if it still existed, the machine model itself would fail to cleanly describe available resources. Classes of resources exist that house entire clusters, grids, or life-cycle manageable application services. Some resources share addressable memory across operating system instances, some are custom architectures spanning whole data centers, and some don’t even provide an outline of their capacity. Resources may grow and shrink while in use.

Consumption of these resources is not necessarily straightforward or uniform.

It’s time to stop thinking in slots. It’s time to start thinking in aggregate resources and their consumption policies.
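As a rough sketch of what that could look like, treat an aggregate resource as a bag of named quantities and a consumption policy as a function that decides what a request actually costs. Everything here is an assumption made for illustration: the resource names, the three example policies, and the `consume` helper are hypothetical.

```python
def exact(request, available):
    """Policy: a job consumes exactly what it asks for."""
    return dict(request)


def whole_machine(request, available):
    """Policy: any job consumes everything, like the old one-job-per-machine model."""
    return dict(available)


def memory_quantized(quantum_mb):
    """Policy factory: memory is consumed in whole quanta of `quantum_mb`."""
    def policy(request, available):
        cost = dict(request)
        quanta = -(-request.get("memory_mb", 0) // quantum_mb)  # ceiling division
        cost["memory_mb"] = quanta * quantum_mb
        return cost
    return policy


def consume(available, request, policy):
    """Return the resources left after applying `policy`, or None if it does not fit."""
    cost = policy(request, available)
    if any(amount > available.get(name, 0) for name, amount in cost.items()):
        return None
    return {name: amount - cost.get(name, 0) for name, amount in available.items()}


aggregate = {"cpus": 8, "memory_mb": 16384, "gpus": 1}
job = {"cpus": 1, "memory_mb": 1000}
print(consume(aggregate, job, exact))
print(consume(aggregate, job, memory_quantized(1024)))
print(consume(aggregate, job, whole_machine))
```

The same request costs something different under each policy, which is the point: the aggregate describes what exists, and the policy describes how it is used up.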

