While Salvatore and I have discussed some significant changes to various aspects of Sentinel, enough to likely warrant a Sentinel version change, I wanted to capture some of the more incremental proposals we talked about during this year's #redislondon dev day.

Preliminaries

First up I want to mention some changes to Sentinel Salvatore is already working on. Currently each Sentinel opens a connection to every other Sentinel for each "logical master" it monitors. At scale, this becomes less desirable and Salvatore is working on a single connection to each other Sentinel which is multiplexed for the logical masters. This will greatly reduce the connection counts needed at the Sentinel layer, and I am all in favor of it.

Changes Proposed

Propagate Pod Name

In tandem with the metadata part of the config changes there are some changes Sentinel could make to integrate with these and improve things. Specifically, Sentinel should update masters and slaves with the "pod" or "service" name given in Sentinel. My thinking here is to close the loop on discovery in Sentinel.

With this you would be able to check the master or slave you just connected to and see which pod it belongs to, validating that you connected to the one you expected. It also allows you to reverse-discover the pod, which is very useful for operations. The ability of operational teams to fully discover the complete setup is powerful for system auditing as well as being handy for troubleshooting.

Indeed, Salvatore brilliantly noted that this capability would allow Sentinel to do some sanity checking natively. Specifically, Sentinel would be able to query a new master for an existing pod name and, if one is found, cross-check to ensure you are not adding the same master under a different name. Further, when new slaves are added to a master this meta-value could also be checked, thus helping to ensure slaves are not associated with multiple pods - at least in Sentinel.
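
To make the cross-check concrete, here is a minimal Python sketch of the logic, not of any real Sentinel code. The Instance class, its pod_name field, and claim_instance are all hypothetical names for illustration; the only thing taken from the proposal is the rule that an instance already tagged with one pod name should not be added under another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    # Address of a Redis master or slave, e.g. "10.0.0.5:6379".
    addr: str
    # Hypothetical metadata value Sentinel would write to the instance.
    pod_name: Optional[str] = None


class PodNameConflict(Exception):
    """Raised when an instance already belongs to a different pod."""


def claim_instance(instance: Instance, pod_name: str) -> None:
    """Tag an instance with a pod name unless another pod already owns it."""
    if instance.pod_name is not None and instance.pod_name != pod_name:
        raise PodNameConflict(
            f"{instance.addr} already belongs to pod {instance.pod_name!r}, "
            f"refusing to add it as {pod_name!r}"
        )
    # Either untagged or already tagged with the same name: safe to (re)tag.
    instance.pod_name = pod_name
```

The same check would apply to a slave being attached to a master: if its stored pod name differs from the master's, Sentinel could refuse or warn.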

Authentication & Discovery

Last year there was discussion on the mailing list about authentication in Sentinel. As it happens you can set requirepass in the config file for Sentinel. But this is a bad thing to do. Do this and you cannot connect to and use Sentinel, as Sentinel does not accept the auth command. Of course it turns out that adding the command, though trivial to do, doesn't really make it better because naturally Sentinel doesn't send the auth command when connecting to other Sentinels.

However, while it would need to be done, it shouldn't be difficult to add this. But why would you want to? To improve discovery, of course!

If we had the option to require authentication to Sentinel, we could expose a command to get the password associated with a pod, or better yet provide it with the existing connection information. With this combination you could use Sentinel for complete discovery. You could configure the client with the Sentinel address and password, then use the connection information and password returned to connect to the instance. But how is that different?

A related discussion on the mailing list has been around password rotation. If you need to change your instance's password for any reason it means coordinating with client-side changes - which are likely in code not in config files. With this method of password discovery you can change the password in Redis and Sentinel, and simply disconnect clients to trigger a rediscovery of the needed information with no config or code changes needed on the client.

Another way of utilizing this method with more controllable timing is to change the password in a slave, then do a failover in Sentinel to promote it. As this automatically triggers a client disconnect it makes the process a bit simpler and more controllable.

Of course, I would not want to be able to get the password for a pod without needing one to get to Sentinel. Sure, it's plain text but why make it too easy to get that information? So this is perhaps the trickiest bit - making that information available if and only if there is a password requirement on the Sentinel.
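
As a rough illustration of the discovery flow and the gating rule described above, here is a small in-memory Python model. ToySentinel, Pod, and discover are hypothetical names, and nothing here talks to a real Sentinel; it only shows the intended behavior: the pod password is handed out only when Sentinel itself requires a password.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Pod:
    host: str
    port: int
    # Password clients need for the Redis instances in this pod.
    auth_pass: Optional[str] = None


class ToySentinel:
    """In-memory stand-in for a Sentinel; not a real client or server."""

    def __init__(self, requirepass: Optional[str] = None) -> None:
        self.requirepass = requirepass
        self.pods: Dict[str, Pod] = {}

    def discover(
        self, pod_name: str, password: Optional[str] = None
    ) -> Tuple[str, int, Optional[str]]:
        """Return (host, port, pod password or None) for a pod."""
        if self.requirepass is not None and password != self.requirepass:
            raise PermissionError("Sentinel requires authentication")
        pod = self.pods[pod_name]
        # Expose the pod password only when Sentinel is password protected.
        exposed = pod.auth_pass if self.requirepass is not None else None
        return pod.host, pod.port, exposed
```

In this model a password rotation is just a change to the pod's auth_pass: a client that gets disconnected and calls discover again picks up the new value with no change on its side.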

Another possibility is for Sentinel to detect a change in the pod's password and trigger the master and slaves to disconnect clients so they rediscover the correct password.

Sentinel Name OR Metadata?

To go along with the Redis node name and metadata proposals, we should bring this into Sentinel. Sentinel now has a persistent ID - as distinct from the RunID - so this may not affect that aspect. The metadata option would bring the same tagging and classification abilities we would provide to Redis to Sentinel instances.

For example, those who run dedicated Sentinels for each pod they manage could tag the Sentinels with that information. More broadly you could tag Sentinels with location or user information such as datacenter, cluster, availability zone, business group, etc. With this type of functionality Sentinel would be much more manageable in larger deployments and provide much better discovery for Redis and Sentinel management tools.

Sentinel Replication Calls

Currently if you want to remove a pod from Sentinel you have to iterate over each Sentinel which knows of the pod and remove it there. While this seems trivial at lower scale, and easy to code for as a developer, for operations this can become nightmarish when you are dealing with hundreds or thousands of these cases. Thus, there are certain commands which would benefit from a replication type mechanism - implemented by adding an optional keyword to the sentinel remove command: sync.

If you've read the configuration change entry this will look familiar. I'll start with the given example of removing a pod from all Sentinel management.

sentinel remove devcache sync

When calling this command the Sentinel you connected to would remove it locally, then reach out to the other known Sentinels for 'devcache' and instruct them to remove it as well. This would, however, require the receiving Sentinel to detect it was being told by another Sentinel and not then sync it out itself - though if the calling Sentinel didn't pass sync this would work out just fine as is.

Alternatively we could use a new command: sentinel purge <name> which could do this.

In either case, Sentinel would need to flush its output buffers first to try to eliminate message overlap, as well as report back if there were any errors in the process. For example, say the third Sentinel it needed to talk to was not reachable at that time. It would report back which Sentinels had errors. A Sentinel not having the logical master would not be treated as an error.

This also means this command is essentially synchronous in that it has to wait until all known Sentinels have replied. This is another reason to ensure a "client Sentinel" doesn't try to then reach out and potentially cause a deadlock.
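
The fan-out, the "do not re-propagate" rule, and the error report can be sketched in a few lines of Python. This is a toy model under my own assumptions: the Sentinel class, the reachable flag, and the return value (a list of names of unreachable peers) are hypothetical and only stand in for whatever the real implementation would do.

```python
from typing import Dict, List


class Sentinel:
    """Toy model of a Sentinel and the pods it manages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.reachable = True
        # pod name -> the other Sentinels known for that pod
        self.pods: Dict[str, List["Sentinel"]] = {}

    def remove(self, pod: str, sync: bool = False) -> List[str]:
        """Remove a pod locally; with sync, also tell the known Sentinels.

        Returns the names of Sentinels that could not be reached.
        """
        peers = self.pods.pop(pod, None)
        if peers is None:
            # Not having the pod is not treated as an error.
            return []
        errors: List[str] = []
        if sync:
            for peer in peers:
                if not peer.reachable:
                    errors.append(peer.name)
                    continue
                # Forwarded without sync so the peer does not fan out again.
                peer.remove(pod, sync=False)
        return errors
```

Because the caller waits for every peer before returning, the call is synchronous, and forwarding without sync is what keeps two Sentinels from waiting on each other.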

Other per-pod calls should also have such a sync option. Imagine changing parallel-syncs or auth-pass. It is essentially the same problem and same solution. We add sync as an optional parameter to sentinel set <pod> <key> <value>. An argument could be made to make sentinel set always push its changes to known-Sentinels, however. I have not yet thought of a situation where, outside of a remove command, you would not want or need the changes propagated to known-Sentinels.

On a related note, I want to bring back up the notion of sentinel remove slave as a desirable option to remove a slave from Sentinel's known slaves for a pod as opposed to resetting the entire pod. This is a safer way to remove old slaves from a pod.

For handling the case of removing a single Sentinel from the other Sentinels' known-Sentinels list, Sentinel should publish a goodbye message to indicate it is intentionally going away. This message would be published the same way, and to the same location, as the hello message. Any Sentinel upon seeing the goodbye message would remove that Sentinel from that pod's list of known Sentinels. With this in place known-Sentinels would always be accurate.
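
A minimal Python sketch of the receiving side might look like the following. The KnownSentinels class and its method names are hypothetical, and the message format is left out entirely; the point is only that a goodbye undoes what a hello did, per pod.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class KnownSentinels:
    """Toy per-pod registry of known Sentinels, driven by hello/goodbye."""

    def __init__(self) -> None:
        self._by_pod: DefaultDict[str, Set[str]] = defaultdict(set)

    def on_hello(self, pod: str, sentinel_id: str) -> None:
        # A hello announces that sentinel_id monitors this pod.
        self._by_pod[pod].add(sentinel_id)

    def on_goodbye(self, pod: str, sentinel_id: str) -> None:
        # A goodbye means the Sentinel is leaving on purpose; unknown
        # senders are ignored rather than treated as an error.
        self._by_pod[pod].discard(sentinel_id)

    def for_pod(self, pod: str) -> List[str]:
        return sorted(self._by_pod.get(pod, set()))
```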

Summary

These changes are relatively minor ones to Sentinel, but they can have major positive rewards for Sentinel users. They range from Sentinel identification to better quorum calculations to the improved experience the sync-ed calls to Sentinel would bring for operations (and developers), and I think these are quick wins we could accomplish in short order. Further, they would not negatively affect (other than time of course) the other, more drastic, changes we are considering for Sentinel.

From The Deepening Dark

In keeping with the spirit of both "simple and noncontroversial" changes as well as "whoa, that's different!" ones, here is a significant change idea I want to throw out to germinate: optionally storing logical master state/config in a backing store such as Consul. I'll use Consul as my example for a few reasons, not the least of which is my familiarity with it over e.g. etcd.

If Sentinel stored its per-pod information in Consul, say under /sentinel/<id-or-name>/<pod-name>, it would remove, if enabled, the problem of storing this information in the config file - and thus mixing operational daemon config with the state of the systems it manages. From an operational standpoint this would be nice because it would remove the chances a version upgrade or configuration management system 'oopses' your managed pods away.
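
To give a feel for what would be stored, here is a small Python sketch that builds the key and a JSON value for one pod. Everything about it is an assumption for illustration: the function names, the fields in the value (host, port, quorum), and writing the key without the leading slash. It builds strings only and does not talk to Consul.

```python
import json


def pod_key(sentinel_id: str, pod_name: str) -> str:
    """Key for one pod's state, following sentinel/<id-or-name>/<pod-name>."""
    return f"sentinel/{sentinel_id}/{pod_name}"


def pod_value(host: str, port: int, quorum: int) -> str:
    """Hypothetical JSON document describing the pod's current master."""
    return json.dumps(
        {"host": host, "port": port, "quorum": quorum}, sort_keys=True
    )
```

After a failover Sentinel would rewrite the value under the same key, which is what lets other tools watch that key for changes.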

Further, if using Consul, you can integrate with other tools such as using consul-template to automatically detect failovers and update HAProxy or Twemproxy. This would be very handy when you want to isolate the client from needing Sentinel support - perhaps in cases where the client library doesn't support it properly or at all.

Admittedly this would be a significant undertaking, though it would piggyback off of the proposal to add this capability into Redis itself. Naturally it would be an option, but in combination with better interaction with containerized deployments it could really extend the operational aspects of Sentinel. One of the larger open questions would be "If I added or removed a pod directly via the backing store, what would/should Sentinel do?". Still, I believe it to be a good avenue to explore.