This is part of a series on design patterns in building orchestration systems. The focus is on orchestrators found in clouds, data-centers, networking systems, etc, but the principles should be broadly applicable.
In a previous post we touched on an important issue that is a side effect of performing orchestration over a network: idempotent operations. The communication between the controller and the subsystems is sometimes called “southbound”, while the API offered by the controller is called “northbound”. Of course the northbound API could be the southbound API for a uber-controller and the southbound API could be the northbound API of a smaller system.
Specifying the contract between the controller and the south subsystem produces a tension between good enough and perfect. A REST API is sometimes used to write the specification since it implies specific requirements on the various verbs. For example the PUT and GET operations have to be idempotent while a POST need not be. However the system architect may have to posit idempotent properties for the POST operations as well as described previously. The REST API endpoint for the subsystem can also provide monitoring and other operational data useful to the system operator.
A drawback of specifying a REST API for a subsystem is that it tends to make the orchestrator hierarchical. The subsystem REST API implementation itself becomes a mini controller that needs its own locus of operations. A REST API can also be rigid, making it hard to evolve – this can be problematic especially in the early phases of system design. Finally, what could be a single hop between the controller and the subsystem, becomes two hops – this increases the operational burden.
Famously, the OpenStack project specifies REST APIs for various subsystems (Cinder – block device, Neutron – network subsystem). Subsystem vendors implement drivers/plugins that implement the southbound API. The driver implementation could in turn call vendor REST APIs or other southbound APIs (e.g., OpenFlow, NetConf, SNMP) to various devices. Driver implementations are often mini-controllers that maintain the desired state of the subsystem in a persistent/durable store.
The driver / plugin is responsible for translating the API call into the specific subsystem API call. For example hypervisor plugins for each hypervisor (XenServer, KVM, VMWare) in Apache CloudStack use the respective hypervisor APIs (XAPI, libvirt, vSphere API) to implement the hypervisor plugin API. Plugins in this case can use the persistent store of the main controller. A drawback of this approach is that the plugin has to be written in the language of the controller. Installing/upgrading a new plugin in a running production system may also produce some downtime. Last but not least, the system architect must be vigilant that the driver / plugin code not call back into the controller or directly use/modify the controller’s state store.
Adding support for a new vendor is easier in the the REST southbound API model – the vendor just has to provide an implementation translating the southbound REST API to the vendor’s API in a language of their choice. However, the addition of layers complicates operations, troubleshooting and upgrades.
The in-process plugin model of Apache CloudStack, OpenDaylight, etc., makes it easier to install and get a system operational. A single locus of operations also makes it easier to test, operate, troubleshoot and and upgrade. Developing a plugin, however, is more complicated since it requires a knowledge of the developer tooling used to develop the controller.