7 good reasons to open source your code

Open source software — software that is free to the user and whose source code can be modified and distributed freely — is ubiquitous and offers solutions to a wide range of needs. Open source as a business strategy is a complex mix of product, marketing, politics, and sales strategies. And, as a personal endeavor, open source can achieve some goals that are unavailable in a corporate environment.


There are good, not-so-good and bad reasons to offer your code as open source. The good reasons arise out of deeply considering the desired outcomes. Here are some of the good ones:

  1. Scratch your itch: You’ve looked far and wide for a solution to your problem and you’ve waited forever for someone to step forward to solve it. Whether it is home automation or a photo library manager or some neat hack that makes you more efficient at work, you have to figure that there are other people with the same problem. Make your solution open source and like-minded people may flock to your code and make it better. A note of warning: expect only a tiny percentage of users to contribute back. Your return on investment is usually non-monetary: goodwill and a feeling of having made the world a better place.
  2. Get distribution: Chances are that the problem your code is solving is widespread and there are a good number of strong solutions out there already. Open source software can circumvent the incumbent’s stranglehold on the market — the convenience of obtaining open source software and the legal freedoms it affords are powerful incentives for users to adopt your (even if inferior) solution. If you have identified underserved consumers, you can target them with a good-enough open source solution and gain a beachhead.
  3. Attack the competition’s value chain: Your software is likely part of a software stack used by your users to get a job done. For example, your software may rely on the underlying operating system APIs. By offering lower layers of the stack as free and open source you seek to do a few things: (a) offer a more convenient software stack, (b) make money on (proprietary, or as-a-service) higher layers, (c) commoditize the lower layers of the stack, and (d) take away some of the pricing power, barriers to entry, and lock-in wielded by the competitor. This strategy requires deep pockets and a lot of patience. Alternatively, you might seek to establish a new abstraction layer or software standard on top of the competition’s API with your open source software.
  4. Attract talent to your company / show off your talent: Developers are scarce and top developers are rarer than hen’s teeth. Developers care about open source software for many reasons. Developers want to work with other top developers — whose work is only visible when open sourced. A developer’s contribution to open source code lives beyond her career at any one company. Open source code has associated communities that are valuable for finding new jobs and opportunities. By significantly contributing to existing open source communities or by open sourcing significant pieces of software, a company can burnish its reputation amongst developers of an open source bent.
  5. Make your culture more open: Programming is a social activity. Developers need to communicate with other developers about their code. When the code is owned by a small team, coding styles become idiosyncratic, documentation becomes opaque and testing becomes a lower priority, since team members develop their own unspoken understanding of each other’s code style. When code has to be made public to thousands of eyes, developers suddenly have to care that their code is complete, readable, easy to change, and well documented. This is usually a good thing, and this culture should carry over to the proprietary parts of the code as well.
  6. Your competition is open source: Competitors will adopt an open source strategy when they are jockeying for leadership in an emerging space. You may have no choice but to follow suit.
  7. The market expects open source software: The market may be conditioned to expect easy-to-consume software. This is true, for example, in the infrastructure space. That usually means open source code.

This article explored some of the ‘why’ of open source strategy. Once you decide that open source is the right strategy, it takes a great deal of knowledge (and patience) to do it right. There are innumerable ways to get it wrong. Few successful strategies are repeatable. And if you are hoping to make money off of your open source code, it can be a long and winding road with lots of forks to dead-ends. That is a topic for a different (and longer) article.

The Elephant in the Giraffe House



I listen to podcasts on my commute, and the a16z podcast never disappoints. On the episode “On Mentors and Mentees”, the guest Ken Coleman narrated an interesting fable that got me thinking. To paraphrase:

It was raining heavily but the Giraffes in the Giraffe House were snug and dry. Outside, the Elephant was caught in the driving rain. The Giraffes felt sorry for the Elephant and invited her into their house. The Elephant was grateful and was soon warm and dry.

After some time, the Elephant said — “I want to look outside, but the windows are too high”.

“Well, you just need to grow your neck” replied the Giraffes. “I heard there are stretches you could do for that”.

Then the Elephant complained — “It is such a tight squeeze in here, can’t we expand the building?”

“You really should lose some weight — have you thought about dieting?”

The fable (borrowed from “Building a House for Diversity”) is about Diversity and Inclusion. When we pay lip service to diversity, we invite others in, but we continue doing things the old way instead of accommodating their needs.

My first reaction was to scoff at the Giraffes. But, was the Elephant fair in asking for accommodations? Was the response of the Giraffes appropriate? Do I feel like an Elephant in a Giraffe’s house sometimes? Do you do things at work that are alien to your nature just to fit in? Do you ask for a better environment at your workplace? Let me know in the comments.

The Half Life of Programming Knowledge

Programming is hard. I relearned this lesson recently.

From time to time, I go on a decluttering spree at home, throwing / donating / recycling things I haven’t used in a while. Last week, the bottom row of my bookshelf caught my eye — because it had a number of programming books. I do write code as part of my job — though it isn’t the main part of my job description. But I hadn’t touched these books in years. I decided to recycle these:


While these books served as a reminder of past jobs and adventures (I have been an embedded systems programmer, built networking systems, designed large scale log processing and analytics and helped build a non-profit), they were quite obsolete. After a few minutes of consideration, I decided to get rid of these as well:


My library of programming books somewhat reflects my career. My masters thesis was on object oriented databases (written in C++). My career started in an organization that was in the midst of a massive software re-architecture. The architects there had been bitten by the object-orientation bug and it infected me as well. We used Smalltalk and Design Patterns. Later, I became a C++ aficionado, dabbling in template meta-programming (don’t ask!) but haven’t used C++ in over a decade. I learned just enough Perl to fix some obscure scripts that I only half-understood. Ruby was thrilling for a while — but like the rest of the industry, it has faded in interest for me. I’m not sure why I have the compiler book from my undergraduate studies — maybe I wanted to write a Domain Specific Language (DSL)?

I have retained a few books:


The C++ books are just there for nostalgia. I still work a lot in networking, although I don’t use these books. The pink book is ‘The Little Schemer’ — after Object Oriented Programming’s demise(?), perhaps only functional programming can save us (I also have a book on Clojure in my Kindle library). I remember trying to learn Scala, CoffeeScript and Lua and a few other languages.

My current programming languages are Python and Go (and bash!). For both, online references suffice.

The decluttering exercise reminded me of how hard it is to learn programming, how much I have forgotten and how much I didn’t actually learn. I also realized that I have not bought any books on programming recently. There are many reasons for this:

  • Great free content. Thanks to open source, I can now pick up a piece of code in a language new to me and somewhat understand it. When I started my career, the only code one could read was the code written by colleagues. Thanks to communities like StackOverflow, I can learn from other programmers faster than I can learn from a book.
  • Software is disposable. Although I like to think that I write great software, the truth is that a good portion of it is never used or under-used. Admittedly, it is a pleasure to write and read idiomatic code, and I cringe when Python code is not PEP8 compliant, but in truth, it only matters some of the time. Going deep into a particular programming language doesn’t have much payoff.
  • Programming languages go through fads. It was always interesting to debate the pros-and-cons of C++ vs Java vs .Net vs Javascript, but in the end it rarely mattered (counter-point). Does it matter if your code is highly optimized if it never reaches the market?
  • As a generalist, I have to switch languages every day. It is hard to get deep into one language when the other languages in my toolbox have different APIs and frameworks to do the same thing. As a result, a quick reference (Google!) tool gets the job done better than a book.

Knowledge decays very quickly in the software field. If I were to advise my younger self, I’d urge younger-me to pay less attention to language wars, focus less on software elegance, and spend less time on shiny new languages (Groovy, Clojure, Scala, CoffeeScript…). That doesn’t mean one should learn only ONE language. Rather, when faced with a new language, one should learn how the old patterns apply, figure out what is new, and see if it can quickly be put to productive use.

WORA! WORA! WORA! — the Kubernetes stealth attack

Tora!Tora!Tora! was the code word used to signify complete surprise had been achieved in the attack on Pearl Harbor. Photo Credit: Ebdon on Wikimedia

WORA — Write Once, Run Anywhere was the term coined by Sun Microsystems in the mid-1990s to signify that Java apps were portable. Byte code generated by the Java compiler could run on any standards-compliant Java Virtual Machine on any device, independent of the operating system. The byte-code idea wasn’t new; the strategy, however, was to make the OS (specifically, Microsoft Windows) and the hardware it ran on irrelevant. (Sun made all its profits from hardware, so this was a little strange).

The Java WORA strategy yielded a great deal of drama, including lawsuits and counter-suits and was part of the DOJ’s investigations into Microsoft’s anti-trust activities. In other words, it was a big deal. It was also successful in the sense that it prevented Microsoft from dominating the software stack for enterprise applications.

When Kubernetes was open-sourced by Google, it was clear to me that this was a similar strategy — aimed against the dominant cloud player Amazon Web Services. The business of a business is to make money, so it is unlikely that Google gave away a great piece of software out of the goodness of its heart. I tweeted as much:

The Kubernetes API tries to make the underlying (proprietary) IaaS APIs irrelevant. Write to the Kubernetes API and get automatic portability. Technically, this is different from past efforts to abstract away the cloud APIs (e.g., Jclouds, or RightScale) with another API. Kubernetes brings a level of automation and cloud-nativity that makes even on-premises datacenters cloud-native.

Google, at its just-concluded Cloud Next ’19 conference, made this explicit — the phrase “write once, run anywhere” was used in the keynotes and other sessions. The Anthos platform makes this even easier by taking the toil out of running Kubernetes on-premises.

Microsoft’s counter-strategy towards Java was “Embrace, extend, and extinguish”: embrace Java, extend it with proprietary capabilities and then kill it by killing its portability. Sun sued Microsoft over this strategy, and following the anti-trust settlements with the DoJ, the Java standard prevailed. So far, the competition (Pivotal, RedHat, AWS, Microsoft Azure) has embraced Kubernetes. By donating the software to the Cloud Native Computing Foundation, which certifies Kubernetes distributions, Google is betting that competitors such as AWS will not be able to kill the portability of Kubernetes with proprietary extensions.

At the same time, Google wants to make money with its Cloud business — unlike Sun Microsystems, which disappeared into obscurity despite the popularity of Java. There is an official way to extend Kubernetes — by extending its API using Custom Resource Definitions (CRDs). The Istio and Knative projects from Google leverage CRDs to offer value on top of standard Kubernetes. While these are open source, they are not (yet, maybe never?) under the control of the CNCF. Istio and Knative are the differentiation that Google can use to provide a superior Kubernetes solution. This is what Anthos is — a platform that simplifies Kubernetes while adding value with Traffic Director (Istio) and Cloud Run (Knative).

Despite its propensity to kill popular products, Google is also capable of playing the long game. Consider the Chrome Browser — it started with a minuscule market share 11 years ago, and is now the dominant browser. GSuite is another successful monetisation of a freemium product. Whether Anthos will survive as long and be as successful as Chrome is to be seen.

Japanese Admiral Hara Tadaichi summed up the Pearl Harbor attack by saying, “We won a great tactical victory at Pearl Harbor and thereby lost the war.” We don’t know how the current battle between the cloud giants will pan out, but it is sure to be entertaining without the destruction¹ and bloodshed of the Pacific Theatre.


[1] Kubernetes has left a swath of destruction among startups that had offerings similar to Kubernetes.

Design patterns in orchestrators: transfer of desired state (part 3/N)

Most datacenter automation tools operate on the basis of desired state. Desired state describes what should be the end state but not how to get there. To simplify a great deal, if the thing being automated is the speed of a car, the desired state may be “60mph”. How to get there (braking, accelerator, gear changes, turbo) isn’t specified. Something (an “agent”) promises to maintain that desired speed.


The desired state and changes to the desired state are sent from the orchestrator to various agents in a datacenter. For example, the desired state may be “two apache containers running on host X”. An agent on host X will ensure that the two containers are running. If one or more containers die, then the agent on host X will start enough containers to bring the count up to two. When the orchestrator changes the desired state to “3 apache containers running on host X”, then the agent on host X will create another container to match the desired state.
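To make the pattern concrete, here is a minimal sketch of such an agent in Go. The type and function names (DesiredState, countRunning, startContainer, stopContainer) are invented for illustration and are not taken from any real orchestrator; the point is only that the agent drives actual state toward desired state without caring how the drift happened.

```go
// Minimal sketch of an agent reconciling actual state toward desired state.
// All names are illustrative placeholders.
package main

import (
	"log"
	"time"
)

// DesiredState is what the orchestrator asks for: "N apache containers on this host".
type DesiredState struct {
	Image    string
	Replicas int
}

func countRunning(image string) int     { return 0 /* placeholder: query the container runtime */ }
func startContainer(image string) error { log.Println("starting", image); return nil }
func stopContainer(image string) error  { log.Println("stopping", image); return nil }

// reconcile drives the actual state toward the desired state, whatever the
// current state happens to be (crashes, manual changes, and so on).
func reconcile(desired DesiredState) {
	actual := countRunning(desired.Image)
	for actual < desired.Replicas {
		if err := startContainer(desired.Image); err != nil {
			return // try again on the next pass
		}
		actual++
	}
	for actual > desired.Replicas {
		if err := stopContainer(desired.Image); err != nil {
			return
		}
		actual--
	}
}

func main() {
	desired := DesiredState{Image: "apache", Replicas: 2}
	for {
		reconcile(desired) // the agent repeats this forever
		time.Sleep(10 * time.Second)
	}
}
```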

Transfer of desired state is another way to achieve idempotence (a problem described here).

We can see that there are two sources of changes that the agent has to react to:

  1. changes to desired state sent from the orchestrator and
  2. drift in the actual state due to independent / random events.

Let’s examine #1 in greater detail. There are a few ways to communicate the change in desired state:

  1. Send the new desired state to the agent (a “command” pattern). This approach works most of the time, except when the size of the state is very large. For instance, consider an agent responsible for storing a million objects. Deleting a single object would involve sending the whole desired state (999999 items). Another problem is that the command may not reach the agent (“the network is not reliable”). Finally, the agent may not be able to keep up with rate of change of desired state and start to drop some commands.  To fix this issue, the system designer might be tempted to run more instances of the agent; however, this usually leads to race conditions and out-of-order execution problems.
  2. Send just the delta from the previous desired state. This is fraught with problems. This assumes that the controller knows for sure that the previous desired state was successfully communicated to the agent, and that the agent has successfully implemented the previous desired state. For example, if the first desired state was “2 running apache containers” and the delta that was sent was “+1 apache container”, then the final actual state may or may not be “3 running apache containers”. Again, network reliability is a problem here. The rate of change is an even bigger potential problem here: if the agent is unable to keep up with the rate of change, it may drop intermediate delta requests. The final actual state of the system may be quite different from the desired state, but the agent may not realize it! Idempotence in the delta commands helps in this case.
  3. Send just an indication of change (“interrupt”). The agent has to perform the additional step of fetching the desired state from the controller. The agent can compute the delta and change the actual state to match the delta. This has the advantage that the agent is able to combine the effects of multiple changes (“interrupt debounce”). By coalescing the interrupts, the agent is able to limit the rate of change. Of course the network could cause some of these interrupts to get “lost” as well. Lost interrupts can cause the actual state to diverge from the desired state for long periods of time. Finally, if the desired state is very large, the agent and the orchestrator have to coordinate to efficiently determine the change to the desired state.
  4. The agent could poll the controller for the desired state. There is no problem of lost interrupts; the next polling cycle will always fetch the latest desired state. The polling rate is critical here: if it is too fast, it risks overwhelming the orchestrator even when there are no changes to the desired state; if too slow, it will not converge the actual state to the desired state quickly enough. (A sketch of an agent loop combining interrupts with polling follows this list.)
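As a rough illustration of approaches 3 and 4 (and the interrupt-plus-poll combination discussed further below), here is a sketch of an agent loop that coalesces change notifications and also resyncs periodically to recover from lost interrupts. The channel, fetchDesiredState and reconcile names are invented for the example.

```go
// Sketch of an agent that reacts to change notifications ("interrupts")
// but also polls periodically in case a notification is lost.
package main

import "time"

type DesiredState struct{ Replicas int }

func fetchDesiredState() DesiredState { return DesiredState{Replicas: 2} } // placeholder
func reconcile(d DesiredState)        { /* drive actual state toward d */ }

func runAgent(notify <-chan struct{}) {
	resync := time.NewTicker(5 * time.Minute) // periodic full resync recovers lost interrupts
	defer resync.Stop()
	for {
		select {
		case <-notify:
			// Coalesce ("debounce"): drain queued notifications, since
			// only the latest desired state matters.
			drained := false
			for !drained {
				select {
				case <-notify:
				default:
					drained = true
				}
			}
		case <-resync.C:
			// Nothing to do here; fall through to the fetch below.
		}
		// In either case, fetch the full desired state and reconcile.
		reconcile(fetchDesiredState())
	}
}

func main() {
	notify := make(chan struct{}, 16)
	go runAgent(notify)
	notify <- struct{}{} // simulate one change notification from the orchestrator
	time.Sleep(time.Second)
}
```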

To summarize the potential issues:

  1. The network is not reliable. Commands or interrupts can be lost or agents can restart / disconnect: there has to be some way for the agent to recover the desired state.
  2. The desired state can be prohibitively large. There needs to be some way to efficiently but accurately communicate the delta to the agent.
  3. The rate of change of the desired state can strain the orchestrator, the network and the agent. To preserve the stability of the system, the agent and orchestrator need to coordinate to limit the rate of change and the polling rate, and to execute the changes in the proper linear order.
  4. Only the latest desired state matters. There has to be some way for the agent to discard all the intermediate (“stale”) commands and interrupts that it has not been able to process.
  5. Delta computation (the difference between two consecutive sets of desired state) can sometimes be more efficiently performed at the orchestrator, in which case the agent is sent the delta. Loss of the delta message or reordering of execution can lead to irrecoverable problems.

A persistent message queue can solve some of these problems. The orchestrator sends its commands or interrupts to the queue and the agent reads from the queue. The message queue buffers commands or interrupts while the agent is busy processing a desired state request.  The agent and the orchestrator are nicely decoupled: they don’t need to discover each other’s location (IP/FQDN). Message framing and transport are taken care of (no more choosing between Thrift or text or HTTP or gRPC etc).


There are tradeoffs however:

  1. With the command pattern, if the desired state is large, then the message queue could reach its storage limits quickly. If the agent ends up discarding most commands, this can be quite inefficient.
  2. With the interrupt pattern, a message queue is not adding much value since the agent will talk directly to the orchestrator anyway.
  3. It is not trivial to operate / manage / monitor a persistent queue. Messages may need to be aggressively expired / purged, and the promise of persistence may not actually be realized. Depending on the scale of the automation, this overhead may not be worth the effort.
  4. With “at most once” delivery, the message queue can still lose messages. With “at least once” semantics, the message queue could deliver multiple copies of the same message: the agent has to be able to determine if it is a duplicate. The orchestrator and agent still have to solve some of the end-to-end reliability problems.
  5. Delta computation is not solved by the message queue.

OpenStack (using RabbitMQ) and CloudFoundry (using NATS) have adopted message queues to communicate desired state from the orchestrator to the agent.  Apache CloudStack doesn’t have any explicit message queues, although if one digs deeply, there are command-based message queues simulated in the database and in memory.

Others solve the problem with a combination of interrupts and polling – interrupt to execute the change quickly, poll to recover from lost interrupts.

Kubernetes is one such framework. There are no message queues, and it uses an explicit interrupt-driven mechanism to communicate desired state from the orchestrator (the “API Server”) to its agents (called “controllers”).


(Image courtesy: https://blog.heptio.com/core-kubernetes-jazz-improv-over-orchestration-a7903ea92ca)

Developers can use (but are not forced to use) a controller framework to write new controllers. An instance of a controller embeds an “Informer” whose responsibility is to watch for changes in the desired state and execute a controller function when there is a change. The Informer takes care of caching the desired state locally and computing the delta state when there are changes. The Informer leverages the “watch” mechanism in the Kubernetes API Server (an interrupt-like system that delivers a network notification when there is a change to a stored key or value). The deltas to the desired state are queued internally in the Informer’s memory. The Informer ensures the changes are executed in the correct order.
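For a flavor of what this looks like to a controller author, here is a minimal sketch built on client-go’s SharedInformerFactory. This is not the author’s Network Policy Controller; the kubeconfig path is illustrative, and details such as the AddEventHandler return values vary across client-go releases.

```go
// Minimal sketch of an Informer-based watcher using client-go.
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a kubeconfig (path is illustrative).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// The factory resyncs the full desired state every 10 minutes,
	// which recovers from any lost watch events ("interrupts").
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			fmt.Println("pod added:", obj.(*corev1.Pod).Name)
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			// The Informer hands us the delta: old and new desired state.
			fmt.Println("pod updated:", newObj.(*corev1.Pod).Name)
		},
		DeleteFunc: func(obj interface{}) {
			fmt.Println("pod deleted")
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)                                 // begin the watches
	cache.WaitForCacheSync(stop, podInformer.HasSynced) // wait for the local cache
	<-stop                                              // run until the process is killed
}
```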

  • Desired states are versioned, so it is easier to decide to compute a delta, or to discard an interrupt.
  • The Informer can be configured to do a periodic full resync from the orchestrator (“API Server”) – this should take care of the problem of lost interrupts.
  • Apparently, there is no problem of the desired state being too large, so Kubernetes does not explicitly handle this issue.
  • It is not clear if the Informer attempts to rate-limit itself when there are excessive watches being triggered.
  • It is also not clear if at some point the Informer “fast-forwards” through its queue of changes.
  • The watches in the API Server use Etcd watches in turn. The watch server in the API server only maintains a limited set of watches received from Etcd and discards the oldest ones.
  • Etcd itself is a distributed data store that is more complex to operate than, say, an SQL database. It appears that the API server hides the Etcd server from the rest of the system, and therefore Etcd could be replaced with some other store.

I wrote a Network Policy Controller for Kubernetes using this framework and it was the easiest integration I’ve written.

It is clear that the Kubernetes creators put some thought into the architecture, based on their experiences at Google. The Kubernetes design should inspire other orchestrator-writers, or perhaps, should be re-used for other datacenter automation purposes. A few issues to consider:

  • The agents (“controllers”) need direct network reachability to the API Server. This may not be possible in all scenarios, needing another level of indirection.
  • The API server is not strictly an orchestrator; it is better described as a choreographer. I hope to describe this difference in a later blog post, but note that the API server never explicitly carries out a step-by-step flow of operations.

Design patterns in Orchestrators (part 2 of N) – southbound APIs

This is part of a series on design patterns in building orchestration systems. The focus is on orchestrators found in clouds, data-centers, networking systems, etc., but the principles should be broadly applicable.

In a previous post we touched on an important issue that is a side effect of performing orchestration over a network: idempotent operations. The communication between the controller and the subsystems is sometimes called “southbound”, while the API offered by the controller is called “northbound”. Of course the northbound API could be the southbound API for an uber-controller, and the southbound API could be the northbound API of a smaller system.

Specifying the contract between the controller and the southbound subsystem produces a tension between good enough and perfect. A REST API is sometimes used to write the specification since it implies specific requirements on the various verbs. For example, the PUT and GET operations have to be idempotent while a POST need not be. However, the system architect may have to posit idempotent properties for the POST operations as well, as described previously. The REST API endpoint for the subsystem can also provide monitoring and other operational data useful to the system operator.

A drawback of specifying a REST API for a subsystem is that it tends to make the orchestrator hierarchical. The subsystem REST API implementation itself becomes a mini controller that needs its own locus of operations. A REST API can also be rigid, making it hard to evolve – this can be problematic especially in the early phases of system design. Finally, what could be a single hop between the controller and the subsystem becomes two hops – this increases the operational burden.


Famously, the OpenStack project specifies REST APIs for various subsystems (Cinder – block device, Neutron – network subsystem). Subsystem vendors write drivers/plugins that implement the southbound API. The driver implementation could in turn call vendor REST APIs or other southbound APIs (e.g., OpenFlow, NetConf, SNMP) to various devices. Driver implementations are often mini-controllers that maintain the desired state of the subsystem in a persistent/durable store.

 

An alternate model is to specify the southbound API in the programming language of the controller, for example, as a Java interface (e.g., Apache CloudStack), or an OSGI plugin (OpenDaylight, ONOS).


The driver / plugin is responsible for translating the API call into the specific subsystem API call. For example, hypervisor plugins for each hypervisor (XenServer, KVM, VMWare) in Apache CloudStack use the respective hypervisor APIs (XAPI, libvirt, vSphere API) to implement the hypervisor plugin API. Plugins in this case can use the persistent store of the main controller. A drawback of this approach is that the plugin has to be written in the language of the controller. Installing/upgrading a new plugin in a running production system may also produce some downtime. Last but not least, the system architect must be vigilant that the driver / plugin code does not call back into the controller or directly use/modify the controller’s state store.
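To illustrate the in-process model, here is a sketch of what such a southbound contract might look like expressed as a Go interface. The interface and types are invented for the example and are not the actual CloudStack or OpenDaylight plugin contracts.

```go
// Sketch of an in-process southbound plugin API expressed as a Go interface.
package main

import "fmt"

// VMSpec is the desired state handed to a hypervisor plugin.
type VMSpec struct {
	Name  string
	CPUs  int
	MemMB int
}

// HypervisorPlugin is the southbound contract every driver must implement.
type HypervisorPlugin interface {
	CreateVM(spec VMSpec) error
	DeleteVM(name string) error
}

// kvmPlugin would translate the calls into libvirt API calls.
type kvmPlugin struct{}

func (k *kvmPlugin) CreateVM(spec VMSpec) error {
	fmt.Println("libvirt: defining domain", spec.Name)
	return nil
}

func (k *kvmPlugin) DeleteVM(name string) error {
	fmt.Println("libvirt: undefining domain", name)
	return nil
}

func main() {
	// The controller picks the plugin for the target host and calls through
	// the interface; it never sees libvirt/XAPI/vSphere directly.
	var plugin HypervisorPlugin = &kvmPlugin{}
	_ = plugin.CreateVM(VMSpec{Name: "web-01", CPUs: 2, MemMB: 4096})
}
```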

Adding support for a new vendor is easier in the REST southbound API model – the vendor just has to provide an implementation translating the southbound REST API to the vendor’s API in a language of their choice. However, the addition of layers complicates operations, troubleshooting and upgrades.

The in-process plugin model of Apache CloudStack, OpenDaylight, etc., makes it easier to install and get a system operational. A single locus of operations also makes it easier to test, operate, troubleshoot and upgrade. Developing a plugin, however, is more complicated since it requires knowledge of the developer tooling used to develop the controller.

Design patterns in orchestrators (part 1 of n) – idempotent operations

Orchestration is a somewhat overloaded term in the context of automation. Generally, it implies a central controller that tries to bring a complicated system to a desired state. There are usually a large number of subsystems that the controller manages. Changing the state of the system involves communicating with the subsystems in order to get them to change their state. The communication usually happens over a network.

As a simple example, consider a home automation controller that is trying to get the home ready to receive its occupants by:

  1. Setting the indoor temperature by setting the thermostats
  2. Opening the garage door
  3. Turning on lights
  4. Turning on the tea kettle

The network however is unreliable. There are several failure modes to consider:

  1. The message from the controller may never reach the subsystem. Usually the subsystem acknowledges the control messages from the controller. The controller may implement a timeout so that if the subsystem never gets the message, the controller times out waiting for the acknowledgement and executes some kind of recovery.
  2. The message may reach the subsystem but the subsystem is not ready or not in a state to process it. The controller will get a negative acknowledgement in this case and needs to execute another kind of fault recovery procedure.
  3. The message reaches the subsystem and the subsystem executes the requested control, but fails to complete the requested task. For example, it may request a downstream subsystem to execute a task, but that downstream subsystem fails (again, perhaps due to the network). The controller may or may not get a different negative acknowledgement in this case. The subsystem may even fail midway through the task.
  4. The subsystem gets the message, executes the task perfectly, but the acknowledgement never reaches the controller. The controller usually times out and executes some kind of fault recovery procedure.

Distinguishing between these kinds of failures at the controller is a little hard. If there is a timeout, it can’t determine if the subsystem performed the requested task or not. A common recovery procedure is to re-try the command to the subsystem. Within this recovery mode, the controller has to decide:

  • how many retries
  • how long to retry
  • when to alert a human

Depending on the semantics of the task, there are different answers. Consider an orchestration flow where the controller has to set up a virtual machine. The tasks involved could be to allocate storage, program network elements such as switches, routers and DHCP servers, choose hypervisor hosts and so on. Any of these tasks could fail. Retrying indefinitely to allocate storage when there is not enough storage available doesn’t make sense. Retrying because there was a timeout might make sense. Alerting a human when there are hundreds or thousands of subsystems being modified doesn’t scale – it is better to design recoverability into the system.
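A minimal sketch of one reasonable answer (bounded retries with backoff on timeouts, give up immediately on definitive failures, then escalate). The sendCommand and alertOperator functions are placeholders invented for the example.

```go
// Sketch of a bounded retry policy for controller-to-subsystem commands.
package main

import (
	"errors"
	"log"
	"time"
)

var errTimeout = errors.New("timed out waiting for acknowledgement")

func sendCommand(cmd string) error { return errTimeout } // placeholder for the real call

func alertOperator(cmd string, err error) { log.Println("ALERT:", cmd, err) }

func executeWithRetry(cmd string, maxAttempts int) error {
	backoff := time.Second
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = sendCommand(cmd)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errTimeout) {
			break // definitive failure (e.g. "not enough storage"): retrying won't help
		}
		time.Sleep(backoff)
		backoff *= 2 // exponential backoff between retries
	}
	alertOperator(cmd, err) // or mark the resource for later reconciliation
	return err
}

func main() {
	_ = executeWithRetry("allocate-storage vm-42 20GB", 3)
}
```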

When the controller re-tries the command to the subsystem, it is possible to have an unexpected effect. Let’s say the storage subsystem in the virtual machine example did allocate the storage as requested the first time, but the controller didn’t receive the acknowledgement. The controller retries the command, resulting in double allocation at the storage system.

The solution in this case is to ensure that the commands from the controller to the subsystem are idempotent. That is, executing the same command multiple times produces the same result. The trick is to uniquely identify the change that is being requested. The subsystem stores/remembers the identifier so that if the change is re-requested, it doesn’t re-do the change. The identifier can be opaque (i.e., the structure or contents of the id have no semantics, like a uuid) or be derived from the state description sent to the subsystem (e.g., a file name). Opaque identifiers help avoid leaky abstractions between the controller and the subsystem. In many cases the subsystem cannot be modified to be idempotent (e.g., proprietary systems, different admin space), so a non-opaque identifier has to be used. Examples include fully-qualified domain names, filesystem paths and IP addresses.
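On the subsystem side, the bookkeeping can be as simple as remembering which request identifiers have already been applied. A sketch with invented types (in a real system the applied-ID set would live in durable storage):

```go
// Sketch of an idempotent subsystem: a command carries a unique request ID,
// and the subsystem skips commands it has already applied, so a retried
// command is acknowledged without being re-executed.
package main

import (
	"fmt"
	"sync"
)

type AllocateCommand struct {
	RequestID string // opaque unique id chosen by the controller (e.g. a uuid)
	VolumeGB  int
}

type StorageSubsystem struct {
	mu      sync.Mutex
	applied map[string]bool // would be persisted in a real subsystem
}

func (s *StorageSubsystem) Allocate(cmd AllocateCommand) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.applied[cmd.RequestID] {
		fmt.Println("duplicate request", cmd.RequestID, "- acknowledging without re-allocating")
		return
	}
	fmt.Println("allocating", cmd.VolumeGB, "GB")
	s.applied[cmd.RequestID] = true
}

func main() {
	s := &StorageSubsystem{applied: map[string]bool{}}
	cmd := AllocateCommand{RequestID: "req-001", VolumeGB: 20}
	s.Allocate(cmd) // first attempt: allocates
	s.Allocate(cmd) // controller's retry after a lost ack: no double allocation
}
```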

The idempotency trick helps in another corner case: where the subsystem reboots / re-initializes or gets recreated due to a failure: it may not know the last command / desired state sent by the controller. For example, consider the case of the home automation system where a defective thermostat is replaced. The new thermostat contacts the home automation controller. The controller re-sends the last control command. Since the new thermostat doesn’t have a record of the unique identifier in the command, it applies the change requested by the command.

A complex system with many subsystems and resources is constantly changing state independent of the controller. For example, hosts reboot, network switches go down, disks fail, and so on. The controller has to detect when the system has drifted from the desired state and then execute compensating commands to the subsystems to bring them back to the desired state. Having idempotent commands with unique identifiers is crucial to this recovery.

Architects of orchestration controllers often discover the need for idempotent operations well after implementation is in production. Since the controller usually in turn offers an API, the system architect has to ensure that this “northbound” API also supports idempotent commands / operations. Even Amazon Web Services (AWS) introduced idempotent run-instances quite late in the game (2010).

A (round)trip with Java and Go

Can you generate Go code from a Java binary jar? This need arose from writing a prototype Ingress Controller  for Kubernetes that uses a NetScaler to provide the Ingress function. Another need was a Terraform driver for NetScaler. (While the NetScaler Ingress controller didn’t need a Golang client, it is customary to write Kubernetes integrations using Go).

NITRO is the REST API to program the Citrix NetScaler load balancer. The API is easy to use with well-defined usage patterns. Most commonly, JSON is used to create/update/read/delete configuration on the NetScaler.  There are Java and Python clients, (as well as PowerShell and Perl), but I needed a Golang client to NITRO.

There were a few roadblocks to producing the Go client. One problem is that the NITRO API is vast (over 1000 config objects with corresponding JSON definitions). Second, the JSON documentation is in HTML docs, or one has to resort to tools like Postman to reverse-engineer the JSON schema.

After studying the NITRO Java SDK source, it occurred to me that each config object in the REST API had a corresponding Java class. For example, com.citrix.netscaler.nitro.resource.config.lb.lbvserver had fields that represented the possible fields in the JSON config object used to configure an lbvserver.

The task was to generate a JSON schema (draft v4) from the NITRO Java SDK. This was relatively easy after finding JJSchema, a project that generates a JSON schema from Java classes. There were a few changes required to make JJSchema work with the NITRO jar: JJSchema assumed that all fields had an accessor of the form getFoo(); in the case of NITRO, it was get_foo(). JJSchema also relies on field annotations (@Attributes) to figure out metadata such as enums and read-only. For enums, the Java classes in the NITRO package had inner classes with static member constants, and figuring out the read-only attribute was a matter of finding fields that didn’t have set_foo() methods. The resulting changes are in a fork: https://github.com/chiradeep/JJSchema. The code to invoke JJSchema is in https://github.com/chiradeep/json-nitro. This involves determining all the subclasses of com.citrix.netscaler.nitro.resource.config and invoking the forked JJSchema on those classes.

Getting from JSON schema to Go involves another open source project (Generate). As the blurb says, Generate generates Go structs from JSON schema. The generated structs are used in a (somewhat incomplete) Go client to the NITRO API (https://github.com/chiradeep/go-nitro).
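For a flavor of the output, a generated struct might look roughly like the following. This is heavily abridged and the field set shown is illustrative; the actual generated code in go-nitro is much larger.

```go
// Roughly what a struct generated from the lbvserver JSON schema could look
// like (abridged, illustrative field set).
package lb

type Lbvserver struct {
	Name        string `json:"name,omitempty"`
	Servicetype string `json:"servicetype,omitempty"`
	Ipv46       string `json:"ipv46,omitempty"`
	Port        int    `json:"port,omitempty"`
	// ... many more fields generated from the schema
}
```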

Producing the JSON schema has the nice side-effect that it should be easier to write new clients (Ruby/Javascript anyone?) to the NITRO API. Who knows which language will catch the fancy of systems geeks a year from now (Rust?).


Consul-template and Citrix Netscaler

Consul-template is a tool that can drive reconfiguration of applications and infrastructure in response to changes in the keys/values stored in Consul. Usually it is used to populate changes into a local file in the filesystem. Following the change, the application or infrastructure software is usually restarted.

Previously I’d written up integrations between container managers such as Kubernetes and Netscaler. It was a relatively simple matter to include support for consul-template with a slight tweak. Netscaler only supports a REST-based configuration API (“Nitro”), so populating a config file on Netscaler was not going to do the job. The solution was simple: write a JSON file using consul-template and then ask consul-template to execute a python script to convert the JSON to Nitro API calls.

So:

consul-template -consul $CONSUL_IP:8500 -template consul_single_svc.ctmpl:cfg.json:"python main.py --cfg-file cfg.json"

Here cfg.json is the intermediate JSON file produced by consul-template.

You can get the code here.

Apple’s iCloud is a multi-cloud beast

Apple device users have probably taken and stored 100 billion photos:

  • In early 2013, the number was 9 billion
  • There are 100 million iPhones in active use in 2015. If each iPhone takes 1000 pictures per year, that’s 100 billion photos in 2015 alone.
  • Photos have been automatically backed up to iCloud since iOS 5

Update: This article estimates the number of active iPhones at 700 Million (March 2017). During WWDC 2017, Apple revealed that iPhone users take 1 trillion photos a year.

I’d assumed that iCloud is a massive compute and storage cloud, operated like the datacenters of Google and Amazon.

Turns out that, at least for photo storage, iCloud is actually composed of Amazon’s S3 storage service and Google’s Cloud Storage service. I serendipitously discovered this while copying some photos from my camera’s SD card to my Macbook using the native Photos app. I’d recently installed ‘Little Snitch’ to see why the camera light on my Macbook turns on for no reason. Little Snitch immediately alerted me that Photos was trying to connect to Amazon’s S3 and Google’s Cloud Storage:

So it looks like Apple is outsourcing iCloud storage to two different clouds. At first glance this is strange: AWS S3 promises durability of 99.999999999%, so backing up to Google gains very little reliability for a doubling of cost.

It turns out that AWS S3 and Google Storage are used differently:

For the approximately 200 hi-res photos that I was copying from my camera’s SD card, AWS S3 stores a LOT (1.58 GB), while Google stores a measly 50 MB. So Apple is probably using Google for something else. Speculation:

AWS S3 has an SLA of 99.99%. For the cases where it is unavailable (but photos are still safe), Google can be used to store / fetch low-res versions of the Photo stream.

The Google location could also be used to store an erasure code, although from the size, it seems unlikely.

Apple charges me $2.99 per month (reduced from $3.99 per month last fall) for 200GB of iCloud storage. Apple should be paying (according to the published pricing) between $2.50 and $5.50 per month to Amazon AWS for this. Add in a few pennies for Google’s storage, and they are probably break-even or slightly behind. If they were to operate their own S3-like storage, they would probably make a small-to-medium profit instead. I’ve calculated some numbers based on 2 MB per iPhone image.
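The arithmetic behind the table below is a simple back-of-the-envelope calculation; the per-TB-month figures are assumed profit/loss margins, not Apple’s actual costs.

```go
// Back-of-the-envelope numbers behind the table: photos at ~2 MB each,
// multiplied by an assumed profit/loss per TB per month.
package main

import "fmt"

func main() {
	const mbPerPhoto = 2.0
	photoCounts := []float64{1e9, 1e10, 1e11, 1e12} // 1 billion .. 1 trillion photos
	marginsPerTBMonth := []float64{-5, -10, 10, 20} // assumed $/TB/month

	for _, photos := range photoCounts {
		tb := photos * mbPerPhoto / 1e6 // MB -> TB (decimal units)
		fmt.Printf("%.0e photos = %.0f PB:", photos, tb/1000)
		for _, m := range marginsPerTBMonth {
			fmt.Printf("  $%.0f/month", tb*m)
		}
		fmt.Println()
	}
}
```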

Profit/Loss per TB/month   2 PB (1 billion photos)   20 PB (10 billion photos)   200 PB (100 billion photos)   2000 PB (1 trillion photos)
-$5                        -$10,000                  -$100,000                   -$1,000,000                   -$10,000,000
-$10                       -$20,000                  -$200,000                   -$2,000,000                   -$20,000,000
$10                        $20,000                   $200,000                    $2,000,000                    $20,000,000
$20                        $40,000                   $400,000                    $4,000,000                    $40,000,000

Given Apple’s huge profits of nearly $70 billion per year, paying Amazon about a quarter of a billion dollars for worry-free, infinitely scalable storage seems worth it.

I haven’t included the cost of accessing the data from S3, which can be quite prohibitive, but I suspect that Apple uses a content delivery network (CDN) for delivering the photos to your photo stream.


Multi-cloud is clearly not a mythical beast. It is here and big companies like Apple are already taking advantage of it.