We have been running a Couchbase server as the back-end for a memcached ticket registry in our Jasig CAS deployment for some time. It started as an easy way to run a memcached server on Windows, but we also realized it could be a nice way to implement a redundant cluster for the authentication service.
Jasig CAS (www.jasig.org/cas), the Central Authentication Service, is a widespread enterprise single-sign-on service and the one we use at KTH. Students and staff have known it for many years as login.kth.se.
Couchbase (www.couchbase.com) is an open-source, high-availability NoSQL database built on Erlang/OTP and its bundled mnesia database. Couchbase provides a memcached protocol interface and can be used as a drop-in memcached replacement. With a cluster-aware Couchbase client, you get the full benefit of the cluster's redundancy, as well as features such as indexing that are not available with plain memcached.
Given my previous experience with Erlang/OTP for high availability in large-scale systems, I was fairly confident that a redundancy solution based on Couchbase would actually work in real life, so building on this technology for our authentication server seemed agreeable to me.
Couchbase module for CAS
In order to use the cluster functionality of Couchbase, we had to create our own, Couchbase-aware ticket registry. While we were at it, we also created a service registry storage back-end for service verification in CAS. It is released as a separate module, available on GitHub, which can be deployed with CAS using the recommended Maven-overlay method of CAS customization.
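As a sketch of how such a module plugs in, CAS lets you swap the ticket registry implementation in its Spring configuration (typically ticketRegistry.xml in the overlay). The class and property names below are illustrative assumptions, not the module's actual API; check the module's documentation on GitHub for the real ones.

```xml
<!-- Illustrative only: class and property names are assumptions,
     not the actual module API. -->
<bean id="ticketRegistry"
      class="se.kth.infosys.cas.couchbase.CouchbaseTicketRegistry">
  <!-- One or more cluster nodes; a cluster-aware client discovers the rest -->
  <property name="nodes"
            value="http://cb1.example.org:8091/pools,http://cb2.example.org:8091/pools"/>
  <property name="bucket" value="cas"/>
  <property name="password" value=""/>
</bean>
```

The point of the Maven-overlay method is that a bean definition like this, plus a dependency on the module, is all that changes in the stock CAS distribution.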
There are essentially two ways to use replication in Couchbase: replicated vBuckets or Cross Datacenter Replication (XDCR). They can loosely be compared to RAID 5 versus RAID 1 in a storage sense. Which to choose is not yet clear, and we need to test this further.
Using XDCR seems clean from an architectural point of view, but it introduces the possibility of replicas being out of sync, so that session ticket validation from web applications fails because a web application may talk to a different server than the one the user's browser was directed to. One way of solving this would be a hot-standby configuration, where only one server is active and in use at a time, and traffic is redirected to the standby server only if the master fails.
Using vBucket replication would probably be considered the "normal" way of doing things with Couchbase. It involves the clients knowing more about the cluster, and which node is the master of which data, through a hashing algorithm that is updated if a server node fails. This is handled transparently by the Couchbase client and means that all clients have a consistent view of the data. In this case, the Couchbase cluster should be seen as a cluster separate from the CAS servers. This is most likely the configuration we will look at primarily.
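To make the hashing idea concrete, here is a simplified, self-contained sketch of how a Couchbase-style client maps a key to a vBucket and then to a node. The node names and the three-node layout are hypothetical; real clients fetch the vBucket-to-node map from the cluster and update it transparently on failover, which is what gives all clients the same view of the data.

```java
import java.util.zip.CRC32;

public class VBucketSketch {
    // Couchbase buckets are split into a fixed number of vBuckets (1024 by default).
    static final int NUM_VBUCKETS = 1024;

    // Simplified key-to-vBucket hash: CRC32 of the key, folded down and
    // reduced to the vBucket range. The real client then looks the vBucket
    // up in a map distributed by the cluster.
    static int vbucketFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes());
        return (int) ((crc.getValue() >> 16) & 0x7fff) % NUM_VBUCKETS;
    }

    public static void main(String[] args) {
        // Hypothetical three-node cluster; in reality the vBucket map,
        // not a simple modulo, decides which node masters which vBucket.
        String[] nodes = {"cb1.example.org", "cb2.example.org", "cb3.example.org"};
        String ticketId = "ST-1-abcdef";
        int vb = vbucketFor(ticketId);
        System.out.println("ticket " + ticketId + " -> vBucket " + vb
                + " -> master " + nodes[vb % nodes.length]);
    }
}
```

When a node fails, the cluster promotes the replica vBuckets on the surviving nodes and pushes an updated map to the clients, so the same hash suddenly resolves to a different, still-correct master.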
It is important to note that Couchbase replication traffic is not encrypted in Couchbase 2.0. The traffic between the nodes must therefore be secured by other means, for example IPsec or some other tunneling mechanism.
The project is still in development, but the functionality is pretty much all there. We are currently doing development testing and will soon start integration testing in our reference environment, aiming at a deployment to our production environment sometime this spring. That deployment will, however, not include clustering.
Later this spring we will start experimenting with a redundant login cluster in the reference environment, for future deployment to production.