The coordinator module allows you to use Juju’s leadership feature to coordinate operations between units of a service.
Behavior is defined in subclasses of coordinator.BaseCoordinator. One implementation is provided (coordinator.Serial), which allows an operation to be run on a single unit at a time, on a first come, first served basis. You can trivially define more complex behavior by subclassing BaseCoordinator or Serial.
Author: Stuart Bishop <firstname.lastname@example.org>
Services Framework Usage
Ensure a peers relation is defined in metadata.yaml. Instantiate a BaseCoordinator subclass before invoking ServiceManager.manage(). Ensure that ServiceManager.manage() is wired up to the leader-elected, leader-settings-changed, peers relation-changed and peers relation-departed hooks in addition to any other hooks you need, or your service will deadlock.
Ensure calls to acquire() are guarded, so that locks are only requested when they are really needed (and thus hooks only triggered when necessary). Failing to do this and calling acquire() unconditionally will put your unit into a hook loop. Calls to granted() do not need to be guarded.
    from charmhelpers.core import host, services
    from charmhelpers import coordinator

    # needs_restart() is a charm-specific helper and is not shown here.

    def maybe_restart(servicename):
        serial = coordinator.Serial()
        if needs_restart():  # Guard: only request the lock when needed.
            serial.acquire('restart')
        if serial.granted('restart'):
            host.service_restart(servicename)

    # Named so that it does not shadow the imported services module.
    service_definitions = [dict(service='servicename',
                                data_ready=[maybe_restart])]

    if __name__ == '__main__':
        _ = coordinator.Serial()  # Must instantiate before manager.manage()
        manager = services.ServiceManager(service_definitions)
        manager.manage()
You can implement a similar pattern using a decorator. If the lock has not been granted, an attempt to acquire() it will be made if the guard function returns True. If the lock has been granted, the decorated function is run as normal:
    from charmhelpers.core import host, services
    from charmhelpers import coordinator

    serial = coordinator.Serial()  # Global, instantiated on module import.

    def needs_restart():
        [ ... Introspect state. Return True if restart is needed ... ]

    @serial.require('restart', needs_restart)
    def maybe_restart(servicename):
        host.service_restart(servicename)

    # Named so that it does not shadow the imported services module.
    service_definitions = [dict(service='servicename',
                                data_ready=[maybe_restart])]

    if __name__ == '__main__':
        manager = services.ServiceManager(service_definitions)
        manager.manage()
For charms that do not use the Services Framework, ensure a peers relation is defined in metadata.yaml.
If you are using charmhelpers.core.hookenv.Hooks, ensure that a BaseCoordinator subclass is instantiated before calling Hooks.execute.
If you are not using charmhelpers.core.hookenv.Hooks, ensure that a BaseCoordinator subclass is instantiated and its handle() method called at the start of all your hooks.
    import sys
    from charmhelpers.core import hookenv, host
    from charmhelpers import coordinator

    hooks = hookenv.Hooks()

    # update_config() and needs_restart() are charm-specific helpers,
    # not shown here.

    def maybe_restart():
        serial = coordinator.Serial()
        if serial.granted('restart'):
            host.service_restart('myservice')

    @hooks.hook('config-changed')
    def config_changed():
        update_config()
        serial = coordinator.Serial()
        if needs_restart():
            serial.acquire('restart')
            maybe_restart()

    # Cluster hooks must be wired up.
    @hooks.hook('cluster-relation-changed', 'cluster-relation-departed')
    def cluster_relation_changed():
        maybe_restart()

    # Leader hooks must be wired up.
    @hooks.hook('leader-elected', 'leader-settings-changed')
    def leader_settings_changed():
        maybe_restart()

    [ ... repeat for *all* other hooks you are using ... ]

    if __name__ == '__main__':
        _ = coordinator.Serial()  # Must instantiate before execute()
        hooks.execute(sys.argv)
You can also use the require decorator. If the lock has not been granted, an attempt to acquire() it will be made if the guard function returns True. If the lock has been granted, the decorated function is run as normal:
    import sys
    from charmhelpers.core import hookenv, host
    from charmhelpers import coordinator

    hooks = hookenv.Hooks()
    serial = coordinator.Serial()  # Must instantiate before execute()

    # needs_restart() is a charm-specific guard function, not shown here.

    @serial.require('restart', needs_restart)
    def maybe_restart():
        host.service_restart('myservice')

    @hooks.hook('install', 'config-changed', 'upgrade-charm',
                # Peers and leader hooks must be wired up.
                'cluster-relation-changed', 'cluster-relation-departed',
                'leader-elected', 'leader-settings-changed')
    def default_hook():
        [...]
        maybe_restart()

    if __name__ == '__main__':
        hooks.execute(sys.argv)
A simple API is provided, similar to traditional locking APIs. A lock may be requested using the acquire() method, and the granted() method may be used to check whether a lock previously requested by acquire() has been granted. It doesn’t matter how many times acquire() is called in a hook.
Locks are released at the end of the hook they are acquired in. This may be the current hook if the unit is leader and the lock is free. It is more likely a future hook (probably leader-settings-changed, possibly the peers relation-changed or departed hook, potentially any hook).
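The lifecycle described above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyLeader` and its methods `request`, `run_leader_hook` and `end_of_hook` are hypothetical names invented for illustration, and the real module keeps its state in leader settings and the peers relation rather than in a Python object.

```python
class ToyLeader:
    """Toy model of the leader's bookkeeping for a single lock.

    Hypothetical illustration only; not the charmhelpers implementation.
    """

    def __init__(self):
        self.queue = []      # units waiting, in order of request
        self.holder = None   # unit currently granted the lock, if any

    def request(self, unit):
        # A unit asks for the lock; repeated requests change nothing.
        if unit != self.holder and unit not in self.queue:
            self.queue.append(unit)

    def run_leader_hook(self):
        # When the leader next runs a hook, it grants a free lock
        # to the earliest queued request.
        if self.holder is None and self.queue:
            self.holder = self.queue.pop(0)

    def end_of_hook(self, unit):
        # The holder releases the lock when its hook finishes.
        if self.holder == unit:
            self.holder = None


leader = ToyLeader()
leader.request('app/1')
leader.request('app/2')
leader.request('app/1')          # duplicate request is ignored

leader.run_leader_hook()
first_holder = leader.holder     # the earliest request wins

leader.run_leader_hook()         # lock is busy, so the next unit waits
still_waiting = list(leader.queue)

leader.end_of_hook('app/1')      # the holder's hook ends, lock released
leader.run_leader_hook()
second_holder = leader.holder
```

In this model a unit that requested the lock in one hook typically finds it granted only in a later hook, which is why the guarded action has to be reachable from every hook.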
Whenever a charm needs to perform a coordinated action it will acquire() the lock and perform the action immediately if acquisition is successful. It will also need to perform the same action in every other hook if the lock has been granted.
Why do you need to be able to perform the same action in every hook? If the unit is the leader, then it may be able to grant its own lock and perform the action immediately in the source hook. If the unit is the leader and cannot immediately grant the lock, then its only guaranteed chance of acquiring the lock is in the peers relation-joined, relation-changed or peers relation-departed hooks when another unit has released it (the only channel to communicate to the leader is the peers relation). If the unit is not the leader, then it is unlikely the lock is granted in the source hook (a previous hook must have also made the request for this to happen). A non-leader is notified about the lock via leader settings. These changes may be visible in any hook, even before the leader-settings-changed hook has been invoked. Or the requesting unit may be promoted to leader after making a request, in which case the lock may be granted in leader-elected or in a future peers relation-changed or relation-departed hook.
This could be simpler if leader-settings-changed were invoked on the leader. We could then grant locks only in leader-settings-changed hooks, giving one place for the operation to be performed. Unfortunately this is not the case with Juju 1.23 leadership.
But of course, this doesn’t really matter to most people as most people seem to prefer the Services Framework or similar reset-the-world approaches, rather than the twisty maze of attempting to deduce what should be done based on what hook happens to be running (which always seems to evolve into reset-the-world anyway when the charm grows beyond the trivial).
I chose not to implement a callback model, where a callback was passed to acquire to be executed when the lock is granted, because the callback may become invalid between making the request and the lock being granted due to an upgrade-charm being run in the interim. And it would create restrictions, such as no lambdas, callbacks defined at the top level of a module, etc. Still, we could implement it on top of what is here, e.g. by adding a defer decorator that stores a pickle of itself to disk and having BaseCoordinator unpickle and execute them when the locks are granted.
acquire(lock)
Acquire the named lock, non-blocking.
The lock may be granted immediately, or in a future hook.
Returns True if the lock has been granted. The lock will be automatically released at the end of the hook in which it is granted.
Do not mindlessly call this method, as it triggers a cascade of hooks. For example, if you call acquire() every time in your peers relation-changed hook you will end up with an infinite loop of hooks. It should almost always be guarded by some condition.
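The difference between a guarded and an unguarded request can be sketched with a stand-in object. Everything here is an assumption made for illustration: `FakeCoordinator`, `relation_changed` and `config_is_stale` are hypothetical names, and the fake only counts requests instead of talking to Juju.

```python
class FakeCoordinator:
    """Stand-in that counts lock requests; not the charmhelpers class."""

    def __init__(self):
        self.request_count = 0
        self.granted_locks = set()

    def acquire(self, lock):
        # Count requests; in a real charm a request leads to further hooks.
        if lock not in self.granted_locks:
            self.request_count += 1
        return lock in self.granted_locks

    def granted(self, lock):
        # Read-only check, safe to call unconditionally.
        return lock in self.granted_locks


def relation_changed(coordinator, config_is_stale):
    """Hypothetical hook body showing a guarded acquire()."""
    if config_is_stale:                  # guard: request only when needed
        coordinator.acquire('restart')
    return coordinator.granted('restart')


coordinator = FakeCoordinator()
for _ in range(3):                       # three hooks with nothing to do
    relation_changed(coordinator, config_is_stale=False)
idle_requests = coordinator.request_count

relation_changed(coordinator, config_is_stale=True)
needed_requests = coordinator.request_count
```

With the guard in place, hooks that have nothing to do make no requests, so they cannot feed the hook cascade described above.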
grant(lock, unit)
Maybe grant the lock to a unit.
The decision to grant the lock or not is made for $lock by a corresponding method grant_$lock, which you may define in a subclass. If no such method is defined, the default_grant method is used. See Serial.default_grant() for details.
granted(lock)
Return True if a previously requested lock has been granted.
msg(msg)
Emit a message. Override to customize log spam.
released(unit, lock, timestamp)
Called on the leader when it has released a lock.
By default, does nothing but log messages. Override if you need to perform additional housekeeping when a lock is released, for example recording timestamps.
request_timestamp(lock)
Return the timestamp of our outstanding request for lock, or None.
Returns a datetime.datetime() UTC timestamp, with no tzinfo attribute.
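A naive UTC timestamp cannot be mixed with timezone-aware datetimes (subtracting one from the other raises TypeError), so callers need to compare like with like. The following is a minimal standard-library sketch; `seconds_waiting` is a hypothetical helper and not part of the module.

```python
from datetime import datetime, timezone

# A naive UTC timestamp: the value is in UTC but tzinfo is None.
naive_utc = datetime.now(timezone.utc).replace(tzinfo=None)

# To compare it with an aware datetime, attach UTC explicitly first.
aware_utc = naive_utc.replace(tzinfo=timezone.utc)


def seconds_waiting(request_time, now=None):
    """Return how long a request made at naive-UTC request_time has waited.

    Hypothetical helper for illustration only.
    """
    if now is None:
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    return (now - request_time).total_seconds()
```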
requested(lock)
Return True if we are in the queue for the lock.
require(lock, guard_func, *guard_args, **guard_kw)
Decorate a function to be run only when a lock is acquired.
The lock is requested if the guard function returns True.
The decorated function is called if the lock has been granted.
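One way such a decorator could be built on top of acquire() and granted() is sketched below. It is an illustration of the behaviour described here, not the module's source: `ToyCoordinator` is a hypothetical stand-in that keeps its state in memory.

```python
from functools import wraps


class ToyCoordinator:
    """Minimal stand-in with acquire()/granted(); not charmhelpers code."""

    def __init__(self):
        self.requested_locks = set()
        self.granted_locks = set()

    def acquire(self, lock):
        if lock not in self.granted_locks:
            self.requested_locks.add(lock)
        return lock in self.granted_locks

    def granted(self, lock):
        return lock in self.granted_locks

    def require(self, lock, guard_func, *guard_args, **guard_kw):
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kw):
                # Already granted: run the decorated function as normal.
                if self.granted(lock):
                    return func(*args, **kw)
                # Not granted yet: request only if the guard says so.
                if guard_func(*guard_args, **guard_kw) and self.acquire(lock):
                    return func(*args, **kw)
                return None
            return wrapper
        return decorator


serial = ToyCoordinator()
calls = []


@serial.require('restart', lambda: True)
def maybe_restart(name):
    calls.append(name)
    return 'restarted'
```

Calling `maybe_restart('svc')` before the lock is granted only queues a request and returns None; once `'restart'` is in `serial.granted_locks`, the same call runs the function body.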
default_grant(lock, unit, granted, queue)
Default logic to grant a lock to a unit. Unless overridden, only one unit may hold the lock and it will be granted to the earliest queued request.
To define custom logic for $lock, create a subclass and define a grant_$lock method.
unit is the unit name making the request.
granted is the set of units already granted the lock. It will never include unit. It may be empty.
queue is the list of units waiting for the lock, ordered by time of request. It will always include unit, but unit is not necessarily first.
Returns True if the lock should be granted to unit.
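The default rule and a per-lock override can be sketched as follows. This is a self-contained toy and makes several assumptions: `ToySerial`, its `decide` method, the `TwoAtOnce` subclass and the `'repair'` lock name are all hypothetical, and name-based lookup with `getattr` is just one plausible way to find a `grant_$lock` method.

```python
class ToySerial:
    """Toy model of grant dispatch; hypothetical, not the charmhelpers class."""

    def default_grant(self, lock, unit, granted, queue):
        # One holder at a time, and only the earliest request may proceed.
        return not granted and queue[0] == unit

    def decide(self, lock, unit, granted, queue):
        # Look for a lock-specific method, falling back to default_grant.
        method = getattr(self, 'grant_{}'.format(lock), self.default_grant)
        return method(lock, unit, granted, queue)


class TwoAtOnce(ToySerial):
    def grant_repair(self, lock, unit, granted, queue):
        # Hypothetical policy for a lock named 'repair': allow up to two
        # holders at once, still honouring request order.
        free_slots = max(2 - len(granted), 0)
        return unit in queue[:free_slots]


policy = TwoAtOnce()
queue = ['app/0', 'app/1', 'app/2']

# 'restart' has no grant_restart method, so the default rule applies.
restart_first = policy.decide('restart', 'app/0', set(), queue)
restart_second = policy.decide('restart', 'app/1', set(), queue)

# 'repair' uses the custom rule: the first two queued units may proceed.
repair_second = policy.decide('repair', 'app/1', set(), queue)
repair_third = policy.decide('repair', 'app/2', set(), queue)
```

Keeping the decision in a pure method of `lock`, `unit`, `granted` and `queue` makes a custom policy easy to test without a running Juju environment.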