Subsystem and Daemon confusion

Condor has a notion of a Subsystem, used to customize configuration per type of daemon. It is often conflated with the notion of a Daemon.

Condor’s Master runs programs, which we’ll call Daemons, specified by DAEMON_LIST, e.g. DAEMON_LIST = MASTER, STARTD. Condor’s tools let you manipulate Daemons, e.g. condor_on -subsystem STARTD or condor_restart -subsystem STARTD. Wait, manipulate Daemons with a -subsystem argument?

Take the SHADOW_STARTD example: DAEMON_LIST = MASTER, STARTD, SHADOW_STARTD. We know STARTD and SHADOW_STARTD are part of the same subsystem, STARTD, because they are both the condor_startd executable. It is perfectly reasonable to think that running “condor_restart -subsystem STARTD” will restart both the STARTD and the SHADOW_STARTD. After all, they are both part of the STARTD subsystem.

That’s not what will happen. The SHADOW_STARTD will not be restarted.

Historically, the names of daemons in the DAEMON_LIST have mapped one-to-one to subsystems. The code does not, and should not, enforce this. The -subsystem argument is just misleading. It should be -daemon.

condor_restart -daemon STARTD,SHADOW_STARTD

could restart both daemons.

Right now (7.4), it takes two commands,

condor_restart -subsystem STARTD
condor_restart -subsystem SHADOW_STARTD

to restart both.
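
To make the mismatch concrete, here is a toy model in Python. It is not condor_master's actual code: the DAEMONS table, the SUBSYSTEM_OF mapping, and both function names are made up for illustration. It contrasts matching the argument against daemon names, which is the behavior described above, with matching against subsystems, which is what the flag name suggests.

```python
# Toy model only; none of these names come from the Condor source.
# Maps each DAEMON_LIST entry (the daemon name) to the executable it runs.
DAEMONS = {
    "MASTER": "condor_master",
    "STARTD": "condor_startd",
    "SHADOW_STARTD": "condor_startd",  # second instance of the same executable
}

# Hypothetical mapping from executable to subsystem, for illustration.
SUBSYSTEM_OF = {"condor_master": "MASTER", "condor_startd": "STARTD"}


def match_by_name(arg):
    """Behavior described in the post: the argument is compared to daemon names."""
    return [name for name in DAEMONS if name == arg]


def match_by_subsystem(arg):
    """What the flag name suggests: every daemon that belongs to the subsystem."""
    return [name for name, exe in DAEMONS.items() if SUBSYSTEM_OF[exe] == arg]


print(match_by_name("STARTD"))       # ['STARTD']
print(match_by_subsystem("STARTD"))  # ['STARTD', 'SHADOW_STARTD']
```

The two functions only agree as long as daemon names and subsystems stay one-to-one, which is exactly the assumption SHADOW_STARTD breaks.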

Now, the expected behavior of -subsystem may actually be a desirable feature, but probably not under the name -subsystem. The feature that is really desirable is a grouping of daemons. The information needed to do it properly is not easily accessible to the condor_master, though. Providing daemon-centric configuration could make it possible, e.g.

STARTD_GROUP = STARTDS
SHADOW_STARTD_GROUP = STARTDS

And then,

condor_restart -group STARTDS
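
As a rough sketch of how such a lookup might work, here is some Python that resolves a group name to daemon names. It assumes the proposed <DAEMON>_GROUP settings above, which are only an idea in this post, and a deliberately simplified config parser with no macro expansion. The function names parse_config and daemons_in_group are invented for the example.

```python
def parse_config(text):
    """Parse simple NAME = VALUE lines into a dict (no macro expansion)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        config[name.strip()] = value.strip()
    return config


def daemons_in_group(config, group):
    """Return DAEMON_LIST entries whose hypothetical <NAME>_GROUP equals group."""
    names = [d.strip() for d in config.get("DAEMON_LIST", "").split(",")]
    return [d for d in names if d and config.get(d + "_GROUP") == group]


# The _GROUP settings are the proposal from this post, not real Condor knobs.
CONFIG = """
DAEMON_LIST = MASTER, STARTD, SHADOW_STARTD
STARTD_GROUP = STARTDS
SHADOW_STARTD_GROUP = STARTDS
"""

print(daemons_in_group(parse_config(CONFIG), "STARTDS"))
# ['STARTD', 'SHADOW_STARTD']
```

Because the grouping is keyed on daemon names, the condor_master would not need to know which subsystem each executable belongs to.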

Side note: why does MASTER have to be in the DAEMON_LIST? The condor_master will bail if it is missing. Probably to avoid special-case code paths, since condor_on/off/restart -subsystem MASTER works too.
