ERPNext High Availability

A look into ERPNext deployment architecture

High Availability refers to achieving the following goals in your setup:

  • Availability of service in case of single or multiple node failures.
  • No data loss in case of single or multiple node failures.
  • Availability of good read throughput in case of high traffic.

The Ideal Architecture

Load balancer

The load balancer distributes incoming HTTP requests among the application servers, so the service isn't disrupted when one of them fails. This makes the service "horizontally" scalable in terms of compute load. If you have only one load balancer, though, it is itself a single point of failure.

Normally, you'd run more than one load balancer and balance across them via DNS (which is distributed in the true sense). If you have the budget, time and expertise, you can also set up multiple load balancers with a heartbeat-like system.
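To make the idea concrete, here is a minimal Python sketch of round-robin distribution that skips failed nodes. It is only an illustration: the class name, the backend addresses and the manual `mark_down` call are assumptions, and a real load balancer would detect failures through its own health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy application servers")


# Hypothetical application server addresses.
balancer = RoundRobinBalancer(["app1:8000", "app2:8000", "app3:8000"])
balancer.mark_down("app2:8000")  # simulate a node failure
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

Requests keep flowing to the remaining servers while one is down. The balancer object itself is still a single point of failure, which is why you'd want more than one.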

Application Servers

Application servers process incoming HTTP requests. They query databases/cache if required but do not maintain any state. As long as one of these is up, the service isn't down.

Background Workers

Background workers execute scheduled jobs.

Memcached and Redis Services

ERPNext also depends on Memcached for caching and Redis as a broker for background task workers.

Memcached is distributed by design; adding all available memcached nodes to the application server configuration is the only configuration required. Failure of memcached doesn't cause the service to go down: failure of one or more memcached nodes is handled automatically by the client (the application server).
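A rough Python sketch of what "handled by the client" can look like. This is a toy, not the actual memcached client: the nodes are in-memory dicts, the node addresses are made up, and the key placement uses rendezvous hashing as a stand-in for whatever scheme a real client uses.

```python
import hashlib


class ShardedCache:
    """Toy stand-in for a cache client that spreads keys over several nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.alive = set(self.nodes)
        self.store = {node: {} for node in self.nodes}  # fake in-memory nodes

    def mark_down(self, node):
        self.alive.discard(node)

    def _node_for(self, key):
        # Rendezvous hashing: score every live node for this key and take the
        # highest, so losing a node only remaps the keys that lived on it.
        def score(node):
            return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()

        live = [n for n in self.nodes if n in self.alive]
        return max(live, key=score) if live else None

    def set(self, key, value):
        node = self._node_for(key)
        if node is not None:
            self.store[node][key] = value

    def get(self, key):
        node = self._node_for(key)
        return None if node is None else self.store[node].get(key)


def get_with_fallback(cache, key, load_from_db):
    """A cache miss (or a dead cache) just means going to the database."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value


cache = ShardedCache(["mc1:11211", "mc2:11211"])  # hypothetical nodes
cache.set("user:42", "Alice")
cached = cache.get("user:42")

for node in list(cache.alive):  # simulate every cache node failing
    cache.mark_down(node)
value = get_with_fallback(cache, "user:42", lambda key: "Alice (from DB)")
print(cached, "|", value)
```

Even with every cache node gone, the request is still answered, only slower, because the cache holds nothing the database can't provide.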

Failure of a Redis server would cause the scheduled and background tasks of the service to go down. It's possible to set up multiple Redis servers in master-slave fashion, and software exists to perform automatic failovers.

Database (MariaDB) cluster

The database cluster consists of multiple nodes and exposes itself as a transparent database service to the application servers.

This lets the setup tolerate multi-node failure and ensures "availability" and "no data loss" in case of a node failure. It does, however, increase the complexity of your setup.

Setting up a cluster with automatic failover requires complex monitoring and cluster management software (such as Galera from MariaDB or Percona Cluster Manager). These typically run as a multi-master setup with a few read slaves. We do not have experience running this stack in production.

Also, having automatic failover configured is risky. If something goes wrong while your sysadmin is away, data inconsistency, data loss or availability issues might occur. This happens to the best in the industry too.

What we do

What we offer is the (simple) setup below. Although a single-server setup is a single point of failure, there is less chance of complications during failover.

Crash plan

  • Backup & rsync (if possible)
  • Change slave to master
  • Start services (redis, supervisor, nginx) on slave
  • Switch DNS
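The order of these steps matters, so it can help to keep the plan as a small runbook script. Below is a hedged Python sketch: the step names come from the list above, but the `noop` actions are placeholders and the runner is an illustration, not our actual tooling. You would replace each placeholder with the real command for your environment.

```python
def run_plan(steps):
    """Run steps in order; stop at the first failure and report how far we got."""
    done = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            return done, f"{name}: {exc}"
        done.append(name)
    return done, None


def noop():
    """Placeholder action; swap in the real command for your setup."""


crash_plan = [
    ("backup & rsync (if possible)", noop),
    ("change slave to master", noop),
    ("start services (redis, supervisor, nginx) on slave", noop),
    ("switch DNS", noop),
]

completed, error = run_plan(crash_plan)
print(completed, error)
```

Stopping at the first failed step is deliberate: you don't want DNS pointing at the slave before it has been promoted and its services are running.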

Pratik Vyas

Pratik takes care of Frappecloud and nags everyone about blasphemous engineering practices. He's also responsible for any cryptic responses and texts related to frappe and erpnext that you may find.

Pankaj Wankhede 1 month ago

Hi, I want to transfer live server's public and private folders and database to testing server periodically in ERPNext using rsync. Any help regarding this.

Gil Salazar 1 year ago