High Availability
Rauthy is capable of running in a High Availability Mode (HA).
Some values, like authentication codes for instance, live in the cache only. Because of this, all instances create and share a single HA cache layer, which also means that you cannot just scale up the replicas indefinitely without adjusting the config. The optimal number of replicas for HA mode is 3, or 5 if you need even higher resilience. More replicas should work just fine, but at some point the write throughput will degrade.
The Cache layer uses Hiqlite. It uses the Raft algorithm under the hood to achieve consistency.
Even though everything is authenticated, you should not expose the Hiqlite ports to the public unless you really have to for some reason. You configure these ports with the cluster.nodes config value.
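On Kubernetes, for instance, this usually just means exposing only Rauthy's HTTP port via a regular Service / Ingress and keeping the Hiqlite ports out of it. A minimal sketch, assuming the default HTTP port 8080 and the example Hiqlite ports 8100 / 8200 used further below (all names are illustrative only):

apiVersion: v1
kind: Service
metadata:
  name: rauthy
spec:
  selector:
    app: rauthy
  ports:
    # only the HTTP API is exposed to clients / the Ingress
    - name: http
      port: 8080
      targetPort: 8080
  # the Hiqlite ports 8100 (Raft) and 8200 (API) are deliberately not listed
  # here and should never end up behind an Ingress, NodePort or LoadBalancer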
Some container runtimes will force-kill very quickly. When Rauthy is deployed as an HA cluster, a graceful shutdown will usually take at least 15 seconds. Depending on the config and the current cluster state (maybe there is a leader election ongoing, and so on), it may take up to 25 - 30 seconds. Make sure to adjust your container config accordingly.
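On Kubernetes, for instance, you would make sure the termination grace period is not shorter than that. A minimal sketch for a StatefulSet (excerpt only, names are illustrative):

spec:
  template:
    spec:
      # give Rauthy enough time for a graceful Raft shutdown
      terminationGracePeriodSeconds: 30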
Configuration
Earlier versions of Rauthy used redhac for the HA cache layer. While redhac worked fine, it had a few design issues I wanted to get rid of. Since v0.26.0, Rauthy uses the above-mentioned Hiqlite instead. You only need to configure a few variables.
Even when using Postgres as your DB of choice, you should provide a persistent volume for your Rauthy instances. The reason is that the cache is disk-backed. By default, it will store at least WAL logs and snapshots on disk, while all working data is kept in memory for the fastest access. Having the WAL logs on disk is a huge advantage when it comes to restarts or rolling releases, though. The Raft cluster needs a persistent state, which lives inside the WAL logs. If they are wiped with each restart (no persistent volume), the node has to re-join the Raft cluster each time, which takes time.
However, you can also choose to run the cache fully in-memory. In that scenario, nodes will gracefully leave the Raft cluster on shutdown and cleanly re-join on the next start. Even then, it is very important that the data dir exists. You don't need to care about this inside a container, but you do when running natively, for instance. Even with a fully in-memory cache, Rauthy will use the data dir for temporarily storing WAL snapshots. These are necessary when other nodes join the cluster, and to clean up WAL logs from memory at some point.
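For a Kubernetes StatefulSet, this typically means a small volumeClaimTemplate mounted at the data dir. A minimal sketch, assuming the data dir inside the container is /app/data (adjust the path to your image and config):

spec:
  template:
    spec:
      containers:
        - name: rauthy
          volumeMounts:
            # persists Hiqlite WAL logs and snapshots across restarts
            - name: rauthy-data
              mountPath: /app/data
  volumeClaimTemplates:
    - metadata:
        name: rauthy-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 128Mi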
node_id
The cluster.node_id is mandatory, even for a single replica deployment with only a single node in
cluster.nodes. If you deploy Rauthy as a StatefulSet inside Kubernetes, you can ignore this value
and just set cluster.node_id_from below. If you deploy anywhere else, or you are not using a
StatefulSet, you need to set the node_id to tell Rauthy which node of the Raft cluster it
should be.
[cluster]
# The node id must exist in the nodes and there must always be
# at least one node with ID 1.
# Will be ignored if `node_id_from = k8s`.
#
# At least one of `node_id_from` or `node_id` is required.
#
# default: 0 (invalid)
# overwritten by: HQL_NODE_ID
node_id = 1
node_id_from
If you deploy to Kubernetes as a StatefulSet, you should ignore the cluster.node_id and just set
cluster.node_id_from = "k8s". This will parse the correct NodeID from the Pod hostname, so you
don't have to worry about it.
[cluster]
# Can be set to 'k8s' to try to split off the node id from the hostname
# when Hiqlite is running as a StatefulSet inside Kubernetes.
#
# default: unset
# overwritten by: HQL_NODE_ID_FROM
node_id_from = "k8s"
nodes
This value defines the Raft members. It must be given even if you just deploy a single instance. The description from the reference config should be clear enough:
[cluster]
# All cluster member nodes. For a single instance deployment,
# `"1 localhost:8100 localhost:8200"` will work just fine.
# Each array value must have the following format:
# `id addr_raft addr_api`
#
# default: ["1 localhost:8100 localhost:8200"]
# overwritten by: HQL_NODES
nodes = [
"1 localhost:8100 localhost:8200",
# "2 localhost:8101 localhost:8201",
# "3 localhost:8102 localhost:8202",
]
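As a sketch for a 3-node StatefulSet inside Kubernetes, assuming a StatefulSet named rauthy and a headless Service named rauthy-headless (both names are just an example), this could look like:

[cluster]
nodes = [
    "1 rauthy-0.rauthy-headless:8100 rauthy-0.rauthy-headless:8200",
    "2 rauthy-1.rauthy-headless:8100 rauthy-1.rauthy-headless:8200",
    "3 rauthy-2.rauthy-headless:8100 rauthy-2.rauthy-headless:8200",
]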
secret_raft + secret_api
Since you need both cluster.secret_raft and cluster.secret_api in any case, there is nothing
HA-specific to change here. These define the secrets used internally to authenticate against the Raft or the
API server for Hiqlite. You can generate safe values with any tool you like, for instance:
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c48
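The generated values then go straight into the config. All nodes of the cluster must be configured with the same secrets, for example (placeholder values, of course):

[cluster]
# used internally to authenticate the Raft connections between nodes
secret_raft = "<48-char-random-value>"
# used internally to authenticate against the Hiqlite API server
secret_api = "<another-48-char-random-value>"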
TLS
If you are using a service mesh like linkerd, for instance, which creates mTLS
connections between all pods by default, you can use the HA cache with just plain HTTP, since
linkerd will encapsulate the traffic anyway. In this case, there is nothing to do.
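With linkerd, that typically only means making sure the Rauthy Pods get the sidecar injected, for instance via a namespace annotation (a hedged example, not Rauthy-specific):

apiVersion: v1
kind: Namespace
metadata:
  name: rauthy
  annotations:
    # linkerd injects its proxy and mTLS-encrypts all Pod-to-Pod traffic
    linkerd.io/inject: enabled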
However, if you do not have encryption between pods by default, I would highly recommend that you use TLS.
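The reference config contains TLS options for both the Raft and the API connections in the [cluster] section. As a rough, illustrative sketch only (double-check the exact key names against your reference config):

[cluster]
# TLS for the internal Raft connections
tls_raft_key = "tls/key.pem"
tls_raft_cert = "tls/cert-chain.pem"
# TLS for the Hiqlite API connections
tls_api_key = "tls/key.pem"
tls_api_cert = "tls/cert-chain.pem"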