Redis Sentinel is a system designed to help manage Redis instances. It performs the following four tasks:
If you are using the redis-sentinel executable (or if you have a symbolic link with that name to the redis-server executable), you can run Sentinel with the following command line:
redis-sentinel /path/to/sentinel.conf
Otherwise you can use the redis-server executable directly, starting it in Sentinel mode:
redis-server /path/to/sentinel.conf --sentinel
By default, Sentinels listen for connections on TCP port 26379, so for Sentinels to work, port 26379 of your servers must be open to receive connections from the IP addresses of the other Sentinel instances. Otherwise Sentinels can't talk to each other and can't agree about what to do, so failover will never be performed.
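The reachability requirement above can be checked with a short script. The following is a minimal sketch, assuming plain TCP connectivity is a good enough test; the function name can_reach and the timeout value are illustrative and not part of Redis.

```python
import socket

SENTINEL_PORT = 26379  # default Sentinel port named in the text


def can_reach(host: str, port: int = SENTINEL_PORT, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling can_reach("192.0.2.10") from each Sentinel host against every other Sentinel host (the address here is a placeholder) gives a quick indication of whether a firewall is blocking port 26379. A successful connection only shows that the port is open, not that a Sentinel is answering on it.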
You only need to specify the masters to monitor, giving each separate master (which may have any number of slaves) a different name. There is no need to specify slaves, which are auto-discovered. Sentinel will update the configuration automatically with additional information about slaves (in order to retain the information in case of restart). The configuration is also rewritten every time a slave is promoted to master during a failover.
The example configuration above basically monitors two sets of Redis instances, each composed of a master and an undefined number of slaves. One set of instances is called mymaster, and the other
For the sake of clarity, let's check line by line what the configuration options mean:
The first line tells Sentinel to monitor a master called mymaster, at address 127.0.0.1 and port 6379, with a level of agreement of 2 Sentinels needed to detect this master as failing (if the agreement is not reached, the automatic failover does not start).
However, note that whatever agreement you specify to detect an instance as not working, a Sentinel requires the votes of the majority of the known Sentinels in the system in order to start a failover and obtain a new configuration epoch to assign to the new configuration after the failover.
In the example the quorum is set to 2, so it takes 2 Sentinels agreeing that a given master is not reachable or in an error condition for a failover to be triggered. However, as you'll see in the next section, triggering a failover is not enough to start a successful failover: authorization is also required.
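To make the fields of the monitor line concrete, here is a small sketch that splits such a line into its parts. The directive values are the ones discussed above; the MonitoredMaster type and the parse_monitor_line function are hypothetical helpers written for illustration, not part of Redis.

```python
from typing import NamedTuple


class MonitoredMaster(NamedTuple):
    name: str    # master name, e.g. "mymaster"
    host: str    # address of the master
    port: int    # port of the master
    quorum: int  # Sentinels that must agree the master is failing


def parse_monitor_line(line: str) -> MonitoredMaster:
    """Parse a 'sentinel monitor <name> <host> <port> <quorum>' directive."""
    parts = line.split()
    if len(parts) != 6 or [p.lower() for p in parts[:2]] != ["sentinel", "monitor"]:
        raise ValueError(f"not a sentinel monitor directive: {line!r}")
    _, _, name, host, port, quorum = parts
    return MonitoredMaster(name, host, int(port), int(quorum))


master = parse_monitor_line("sentinel monitor mymaster 127.0.0.1 6379 2")
print(master)
# MonitoredMaster(name='mymaster', host='127.0.0.1', port=6379, quorum=2)
```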
The other options are almost always in the form:
sentinel <option_name> <master_name> <option_value>
They are used for the following purposes:
down-after-milliseconds is the time in milliseconds an instance must be unreachable (either it does not reply to our PINGs or it replies with an error) before a Sentinel starts to think it is down. After this time has elapsed, the Sentinel marks the instance as subjectively down (also known as SDOWN), which is not enough to start the automatic failover. However, if enough Sentinels think that there is a subjectively down condition, the instance is marked as objectively down. The number of Sentinels that need to agree depends on the configured agreement for this master.
parallel-syncs sets the number of slaves that can be reconfigured at the same time to use the new master after a failover. The lower the number, the more time the failover process takes to complete. However, if the slaves are configured to serve old data, you may not want all of them to resync with the new master at the same time: while the replication process is mostly non-blocking for a slave, there is a moment when it stops to load the bulk data from the master during a resync. You can make sure that only one slave at a time is unreachable by setting this option to 1.
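The relationship between down-after-milliseconds, the subjectively down state and the objectively down state can be sketched as follows. This is a simplified model under stated assumptions, not Sentinel's implementation: the function names and the sample timings are invented for illustration.

```python
from typing import Sequence


def is_subjectively_down(ms_since_valid_reply: int, down_after_ms: int) -> bool:
    """SDOWN: this Sentinel got no valid reply for longer than down_after_ms."""
    return ms_since_valid_reply > down_after_ms


def is_objectively_down(sdown_flags: Sequence[bool], quorum: int) -> bool:
    """ODOWN: at least `quorum` Sentinels consider the instance SDOWN."""
    return sum(sdown_flags) >= quorum


# Hypothetical values: three Sentinels, each with its own view of the master.
down_after_ms = 60000
silence_per_sentinel = [75000, 61000, 12000]  # ms since the last valid reply

flags = [is_subjectively_down(ms, down_after_ms) for ms in silence_per_sentinel]
odown = is_objectively_down(flags, quorum=2)
print(flags, odown)
```

With a quorum of 2, the two Sentinels that have seen more than 60 seconds of silence are enough to mark the master as objectively down, even though the third still considers it reachable.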
Additional options are described in the rest of this document and documented in the example sentinel.conf file shipped with the Redis distribution.
All the configuration parameters can be modified at runtime using the SENTINEL SET command. See the Reconfiguring Sentinel at runtime section for more information.
The previous section showed that every master monitored by Sentinel is associated with a configured quorum. It specifies the number of Sentinel processes that need to agree about the unreachability or error condition of the master in order to trigger a failover.
However, after the failover is triggered, at least a majority of Sentinels must authorize the Sentinel to fail over before the failover is actually performed.
Let's try to make things a bit clearer:
The difference may seem subtle but is actually quite simple to understand and use. For example, if you have 5 Sentinel instances and the quorum is set to 2, a failover will be triggered as soon as 2 Sentinels believe that the master is not reachable. However, one of the two Sentinels will be able to fail over only if it gets authorization from at least 3 Sentinels.
If instead the quorum is configured to 5, all the Sentinels must agree about the master error condition, and the authorization from all Sentinels is required in order to failover.
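The arithmetic behind the two examples above can be written down directly. The sketch below assumes, as those examples suggest, that the number of authorizations required is the larger of the quorum and the majority; the function names are illustrative.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def authorizations_needed(total_sentinels: int, quorum: int) -> int:
    """Authorizations required to actually perform a triggered failover."""
    return max(quorum, majority(total_sentinels))


def failover_outcome(total: int, quorum: int, agree_down: int, authorizations: int):
    """Return (triggered, performed) for the given counts."""
    triggered = agree_down >= quorum
    performed = triggered and authorizations >= authorizations_needed(total, quorum)
    return triggered, performed


print(authorizations_needed(5, 2))  # 3
print(authorizations_needed(5, 5))  # 5
print(failover_outcome(total=5, quorum=2, agree_down=2, authorizations=2))
# (True, False)
```

The last call shows a failover that is triggered but not performed: two Sentinels agree that the master is down, but only two authorizations arrive where three are needed.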
Sentinels need authorization from a majority in order to start a failover for a few important reasons:
When a Sentinel is authorized, it gets a unique configuration epoch for the master it is failing over. This is a number that will be used to version the new configuration after the failover is completed. Because a majority agreed that a given version was assigned to a given Sentinel, no other Sentinel will be able to use it. This means that every configuration of every failover is versioned with a unique version. We'll see why this is so important.
Moreover, Sentinels have a rule: if a Sentinel voted for another Sentinel for the failover of a given master, it will wait some time before trying to fail over the same master again. This delay is the failover-timeout you can configure in sentinel.conf. This means that Sentinels will not try to fail over the same master at the same time: the first to ask to be authorized will try; if it fails, another will try after some time, and so forth.
Redis Sentinel guarantees the liveness property that if a majority of Sentinels are able to talk, eventually one will be authorized to failover if the master is down.
Redis Sentinel also guarantees the safety property that every Sentinel will failover the same master using a different configuration epoch.
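The safety property can be illustrated with a toy voting model. This is a deliberately simplified sketch, assuming each Sentinel grants at most one vote per epoch to the first candidate that asks; the Voter class and try_get_authorized function are hypothetical and do not mirror Sentinel's actual code.

```python
class Voter:
    """A Sentinel in its role of granting failover authorizations."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.last_voted_epoch = 0
        self.voted_for = None

    def request_vote(self, candidate: str, epoch: int) -> bool:
        # Grant at most one vote per epoch, to the first candidate asking.
        if epoch > self.last_voted_epoch:
            self.last_voted_epoch = epoch
            self.voted_for = candidate
            return True
        return False


def try_get_authorized(candidate: str, epoch: int, voters) -> bool:
    """A candidate is authorized if a majority of voters grant the epoch."""
    votes = sum(v.request_vote(candidate, epoch) for v in voters)
    return votes >= len(voters) // 2 + 1


voters = [Voter(f"sentinel-{i}") for i in range(5)]
first = try_get_authorized("sentinel-a", 1, voters)   # wins epoch 1
second = try_get_authorized("sentinel-b", 1, voters)  # epoch 1 already taken
third = try_get_authorized("sentinel-b", 2, voters)   # must use a new epoch
print(first, second, third)  # True False True
```

Because a majority has already granted epoch 1 to the first candidate, the second candidate cannot collect a majority for the same epoch and can only succeed with a different one.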
By default Sentinel runs using TCP port 26379 (note that 6379 is the normal Redis port). Sentinels accept commands using the Redis protocol, so you can use redis-cli or any other unmodified Redis client to talk with Sentinel.
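Because Sentinel speaks the Redis protocol, a request is just a Redis protocol array of bulk strings written to the Sentinel port. The following sketch shows only the encoding step; the function name encode_command is illustrative, and in practice a Redis client library does this for you.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is a bulk string: $<length>\r\n<bytes>\r\n
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


print(encode_command("PING"))
# b'*1\r\n$4\r\nPING\r\n'
```

The resulting bytes can be sent over a TCP connection to port 26379 of a running Sentinel, which is what redis-cli does when started with -p 26379.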
There are two ways to talk with Sentinel: it is possible to directly query it to check what is the state of the monitored Redis instances from its point of view, to see what other Sentinels it knows, and so forth.
An alternative is to use Pub/Sub to receive push style notifications from Sentinels, every time some event happens, like a failover, or an instance entering an error condition, and so forth.
The following is a list of accepted commands:
SENTINEL master <master name>: Show the state and info of the specified master.
SENTINEL slaves <master name>: Show a list of slaves for this master, and their state.
SENTINEL get-master-addr-by-name <master name>: Return the IP and port number of the master with that name. If a failover is in progress or has completed successfully for this master, it returns the address and port of the promoted slave.
SENTINEL reset <pattern>: This command will reset all the masters with a matching name. The pattern argument is a glob-style pattern. The reset process clears any previous state in a master (including a failover in progress), and removes every slave and Sentinel already discovered and associated with the master.
SENTINEL failover <master name>: Force a failover as if the master was not reachable, without asking for agreement from other Sentinels (however, a new version of the configuration will be published so that the other Sentinels will update their configurations).
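To get a feel for which masters a glob-style pattern passed to the reset command would select, the matching can be approximated with Python's fnmatch module. This is only an approximation: fnmatch syntax is close to, but not identical with, the glob syntax Redis uses, and the master names below are made up for the example.

```python
from fnmatch import fnmatchcase


def masters_to_reset(pattern: str, master_names):
    """Return the master names that a glob-style pattern would match."""
    # fnmatchcase is case-sensitive on every platform.
    return [name for name in master_names if fnmatchcase(name, pattern)]


names = ["mymaster", "mymaster-eu", "cache"]  # hypothetical master names
print(masters_to_reset("mymaster*", names))  # ['mymaster', 'mymaster-eu']
print(masters_to_reset("*", names))          # ['mymaster', 'mymaster-eu', 'cache']
```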