High Availability refers to achieving the following goals in your setup:
- Availability of service in case of single or multiple node failures.
- No data loss in case of single or multiple node failures.
- Availability of good read throughput in case of high traffic.
The Ideal Architecture
Load balancer
The load balancer distributes incoming HTTP requests among the application servers, so the service isn't disrupted if one of them fails. This makes the service "horizontally" scalable in terms of compute load. If you have only one load balancer, it is itself a single point of failure.
Normally, you would run more than one load balancer and balance across them via DNS (which is distributed in the true sense). If you have the budget, time and expertise, you can also set up multiple load balancers with a heartbeat-like system.
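As a rough illustration of the behaviour described above, here is a minimal Python sketch of round-robin distribution that skips failed application servers. The class name `RoundRobinBalancer` and the backend addresses are made up for this example; a real deployment would rely on a dedicated load balancer rather than code like this.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        # Called when a health check against this backend fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a previously failed backend passes a health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy application servers")


# Hypothetical application server addresses.
balancer = RoundRobinBalancer(["app1:8000", "app2:8000", "app3:8000"])
balancer.mark_down("app2:8000")
picks = [balancer.next_backend() for _ in range(4)]
print(picks)  # ['app1:8000', 'app3:8000', 'app1:8000', 'app3:8000']
```

Requests keep flowing to the remaining servers while `app2` is down, which is the property that keeps the service available during a node failure.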
Application Servers
Application servers process incoming HTTP requests. They query databases/cache if required but do not maintain any state. As long as one of these is up, the service isn't down.
Background Workers
Background workers execute scheduled jobs.
Memcached and Redis Services
ERPNext also depends on Memcached for caching and Redis as a broker for background task workers.
Memcached is distributed by design; adding all available memcached nodes to the application server configuration is all the configuration that is required. Failure of memcached doesn't cause the service to go down. Failure of one or more memcached nodes is handled automatically by the client (application server).
Failure of a Redis server would cause the scheduled and background tasks of the service to stop. It's possible to set up multiple Redis servers in master-slave fashion, and software exists to perform automatic failover (http://redis.io/topics/sentinel).
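To make the client-side handling concrete, here is a toy Python sketch of how a cache client might spread keys over several nodes and treat a node failure as a plain cache miss. Everything in it (the `ShardedCache` class, the node names, the use of in-memory dicts in place of network connections, and simple modulo hashing) is an assumption made for illustration; real memcached clients usually use more careful schemes such as consistent hashing.

```python
import zlib


class ShardedCache:
    """Toy stand-in for a memcached client that spreads keys over nodes."""

    def __init__(self, nodes):
        # Each node is simulated with a local dict. A real client would
        # hold a network connection to each memcached server instead.
        self.stores = {node: {} for node in nodes}
        self.alive = set(nodes)

    def _node_for(self, key):
        live = sorted(self.alive)
        if not live:
            return None
        # Hash the key and map it onto one of the nodes that are still up.
        return live[zlib.crc32(key.encode("utf-8")) % len(live)]

    def set(self, key, value):
        node = self._node_for(key)
        if node is not None:
            self.stores[node][key] = value

    def get(self, key):
        node = self._node_for(key)
        if node is None:
            return None  # no cache available: behave like a miss
        return self.stores[node].get(key)

    def fail(self, node):
        # Simulate a node crash: it drops out and its contents are lost.
        self.alive.discard(node)
        self.stores[node].clear()


cache = ShardedCache(["cache-a", "cache-b", "cache-c"])
cache.set("user:42", "cached profile")
cache.fail(cache._node_for("user:42"))  # the node holding the key dies
value = cache.get("user:42")            # a miss, not an error
if value is None:
    value = "profile reloaded from database"
    cache.set("user:42", value)         # repopulated on a surviving node
print(cache.get("user:42"))
```

The application only ever sees a hit or a miss, so losing cache nodes costs performance, not availability.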
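The general idea behind automatic failover can be sketched as follows: several monitors watch the master, and once enough of them agree that it is unreachable, one of the slaves is promoted. The Python below is a simplified model under that assumption only. The names `Replica`, `master_is_down` and `choose_new_master` are invented here, and Redis Sentinel itself applies additional rules when it picks a replacement.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    reachable: bool
    replication_offset: int  # how much of the master's data it has received


def master_is_down(votes, quorum):
    """True once at least `quorum` monitors report the master unreachable."""
    return sum(1 for down in votes if down) >= quorum


def choose_new_master(replicas):
    """Pick the reachable replica that is most up to date, or None."""
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.replication_offset)


# Three hypothetical monitors; two of them cannot reach the master.
votes = [True, True, False]
replicas = [
    Replica("redis-b", reachable=True, replication_offset=1200),
    Replica("redis-c", reachable=True, replication_offset=1500),
    Replica("redis-d", reachable=False, replication_offset=1600),
]

promoted = None
if master_is_down(votes, quorum=2):
    promoted = choose_new_master(replicas)
print(promoted.name if promoted else "no failover")  # redis-c
```

Requiring agreement from several monitors avoids promoting a slave just because a single monitor lost its own network connection.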
Database (MariaDB) cluster
The database cluster consists of multiple nodes and exposes itself as a transparent database service to the application servers.
This allows the cluster to tolerate the failure of multiple nodes and ensures "availability" and "no data loss" when a node fails. It does, however, increase the complexity of your setup.
Setting up a cluster with automatic failover requires complex monitoring and cluster management software (such as MariaDB Galera Cluster or Percona XtraDB Cluster). These typically run in a multi-master setup with a few read slaves. We do not have experience in running this stack in production.
Also, having automatic failover configured is risky. If something goes wrong while your sysadmin is away, data inconsistency, data loss or availability issues might occur. This happens to the best in the industry too; see https://github.com/blog/1261-github-availability-this-week.
What we do
What we offer is the (simple) setup below. Although a single-server setup is a single point of failure, there is less that can go wrong during failover.
Crash plan
- Back up and rsync from the failed master (if possible)
- Change the slave to master
- Start services (redis, supervisor, nginx) on the slave
- Switch DNS
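The crash plan above can be written down as a small dry-run checklist so that nobody has to remember the order under pressure. The Python sketch below only prints the steps and executes nothing. The host name, the sites path, the service unit names and the exact commands are all assumptions made for illustration; promoting a MariaDB slave in particular may need more than the two statements shown, depending on how replication was configured.

```python
import shlex


def build_crash_plan(primary_host, sites_path):
    """Return the crash-plan steps, in order, as (description, command) pairs.

    A command of None marks a step that has to be done by hand.
    """
    return [
        (
            "Pull the latest backups and files from the old master, if reachable",
            ["rsync", "-az", f"{primary_host}:{sites_path}/", f"{sites_path}/"],
        ),
        (
            "Promote the slave database to master",
            ["mysql", "-e", "STOP SLAVE; RESET SLAVE ALL;"],
        ),
        (
            # Unit names differ between distributions; these are assumed.
            "Start services on the slave",
            ["systemctl", "start", "redis-server", "supervisor", "nginx"],
        ),
        ("Point the DNS record at the slave", None),
    ]


def render(plan):
    """Format the plan as numbered lines without running anything."""
    lines = []
    for number, (description, command) in enumerate(plan, start=1):
        shown = shlex.join(command) if command else "(manual step)"
        lines.append(f"{number}. {description}: {shown}")
    return lines


# Hypothetical host and path, to be replaced with your own values.
for line in render(build_crash_plan("old-master.example.com", "/path/to/sites")):
    print(line)
```

Keeping the plan as data rather than as a script that runs immediately leaves the decision to execute each step with the sysadmin, in line with the preference for manual failover described above.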