High Availability

When you turn on High Availability (HA), we spin up at least 3 nodes in your cluster, in 3 physically and geographically separate data centers. Each node can service both reads and writes.

So even if one of these nodes becomes inaccessible for any reason (for example, due to underlying infrastructure issues or maintenance), the other two nodes continue servicing requests with zero downtime.

When you enable HA, you get a special load-balanced hostname. Searches sent to this endpoint are automatically distributed to one of the nodes in your cluster.

Writes can also be sent to this load-balanced endpoint, and your data is automatically replicated to all the nodes in your cluster.
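As a rough illustration of how a client might be pointed at an HA cluster, here is a minimal sketch that builds a configuration dictionary with the load-balanced hostname plus the individual node hostnames as fallbacks. The helper name `build_ha_client_config` and all hostnames are hypothetical placeholders, and the key names (`nearest_node`, `nodes`, `connection_timeout_seconds`) are assumed to match what your Typesense client library expects, so check them against the configuration shown in your cluster dashboard.

```python
def build_ha_client_config(lb_host, node_hosts, api_key, port=443, protocol="https"):
    """Build a client configuration dict for a multi-node HA cluster.

    lb_host:    the load-balanced hostname (placeholder value in this sketch)
    node_hosts: hostnames of the individual nodes, used as fallbacks
    """
    if len(node_hosts) < 3:
        raise ValueError("an HA cluster is expected to have at least 3 nodes")

    def node(host):
        return {"host": host, "port": port, "protocol": protocol}

    return {
        "api_key": api_key,
        # Assumption: the load-balanced endpoint is tried first.
        "nearest_node": node(lb_host),
        # Assumption: individual nodes are listed so the client can fall back to them.
        "nodes": [node(h) for h in node_hosts],
        "connection_timeout_seconds": 2,
    }


# Hypothetical hostnames, for illustration only.
config = build_ha_client_config(
    lb_host="my-cluster.example.net",
    node_hosts=[
        "my-cluster-1.example.net",
        "my-cluster-2.example.net",
        "my-cluster-3.example.net",
    ],
    api_key="YOUR_API_KEY",
)
```

The resulting dictionary can then be passed to the client constructor of whichever Typesense library you use, provided it accepts this shape.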

We highly recommend enabling High Availability when running Typesense in a production environment.

With HA turned on, you can avoid downtime in the following scenarios:

1. Infrastructure issues: in a multi-node HA cluster, even if one node or one data center has underlying hardware issues, the other two nodes in the cluster continue servicing traffic while the problematic node recovers.
2. Capacity changes: when RAM or CPU is changed in a multi-node HA cluster, we upgrade or downgrade one node at a time. Your cluster continues servicing traffic from the other two nodes while the configuration change happens on the third.
3. Typesense version changes: when the Typesense Server version is changed in a multi-node HA cluster, we upgrade or downgrade one node at a time. Your cluster continues servicing traffic from the other two nodes while the version change happens on the third.
4. Maintenance: we typically do maintenance on the underlying OS every 1-2 months. In a multi-node cluster, we service one node at a time, and your cluster continues servicing traffic from the other two nodes while the third is being serviced.
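The one-node-at-a-time approach described above can be sketched as a small simulation. This is only a toy model for illustration, not how Typesense Cloud actually orchestrates changes. The function `rolling_change` and the node names are hypothetical.

```python
def rolling_change(nodes, apply_change):
    """Apply a change to one node at a time.

    Returns the smallest number of nodes that were still serving
    traffic at any point during the rollout.
    """
    serving = set(nodes)
    min_serving = len(serving)
    for name in nodes:
        serving.discard(name)          # take this node out of rotation
        min_serving = min(min_serving, len(serving))
        apply_change(name)             # e.g. resize, upgrade, or patch the node
        serving.add(name)              # node rejoins the cluster
    return min_serving


changed = []
ha_min = rolling_change(["node-1", "node-2", "node-3"], changed.append)
single_min = rolling_change(["node-1"], lambda name: None)

print(ha_min)      # 2: two nodes keep serving throughout
print(single_min)  # 0: a single-node cluster has nothing left to serve traffic
```

In this model a 3-node cluster never drops below two serving nodes, whereas a single-node cluster drops to zero while its only node is being changed.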

With HA turned off, you will experience downtime during the scenarios above, proportional to the size of your dataset. For scenario #1, the downtime could be several hours, depending on infrastructure recovery times. For scenarios #2, #3 and #4, it could be anywhere from 15 minutes to 2 hours while the single non-HA node in your cluster recovers.

Read more about High Availability in Typesense Cloud here.
