Scaling ERPNext with MySQL
Scaling is what I call a "good" problem to have. As we keep adding customers and trial accounts, our
Scaling is what I call a "good" problem to have. As we keep adding customers and trial accounts, our servers have started showing signs of overloading. To give you an idea, we have close to 100 live ERPs and another 100 trials running on our server. Each account is a separate database instance with over 200 tables. Here is our configuration:
- Intel i7 8-core processor, 1.6GHz
- 24GB RAM
- Seagate 1500GB hard disk.
As you know our database is InnoDB (MySQL) and the current size of our InnoDB file is 16GB.
Though we are reasonably passable when it comes to writing software, none of us is an expert in infrastructure issues and system administration, though Anand comes closest to being somewhat qualified. So here is what we did.
To monitor the system performance we use:
- MySQL Slow Query Log: This records queries that take long to execute.
- MySQL processlist: we keep refreshing this to see how many pending queries are there at one time.
- HTOP: The nice graphical utility that shows how the CPU and RAM are performing.
The usual suspects for any performance issues are RAM and CPU utilization. So, initially when we received complaints of slow responses, we checked all the above and we did not see anything unusual, the CPU and RAM were well below usage. It was only during upgrades that we saw all the 8 CPUs being fully used.
The slow query log showed something unsual though. Apart from the occasional complex join query, most of it was filled with "COMMIT" statements. If you know a bit of database, this basically means the point where all the updates are becoming "permanent" i.e. being written to the disk.
My first reaction was that - now this isn't helpful. I mean its not telling me if the write is slow for an Invoice entry or a Stock entry or something like that, so I went back to the processlist to see what tables are getting locked, which is not too helpful either. Its like trying to read the destination written on the sides a running train, which I occasionally try to do when I am waiting on the railway platform.
Sometimes when you start a line of thinking, you miss the obvious. I assumed that we can improve our performance by identifing which part of the code can be optimized, because I believed that we had enough hardward to get this running. I ignored the writing on the wall.
Finally, it struck me, in what we call in Mumbai, a classical "tubelight".
The hard-disk was the bottleneck!
So I started to investigate a bit and reading a few blogs . The first thing you get is into RAID systems and how you can optimize performance on them. Now since I am a compelete newbie, I had heard the word RAID before, but had no idea, also it is something that we needed to decide at the time of setting up the servers. Since we were well past that, I looked into ways on how to veryify if my hypothesis is correct.
In Linux you can find almost anything you are looking for and I was looking for a utility called iostat. This tells me how to monitor performance of the disks. After learning a bit on how to read iostat, I realized that the key figure I need to look at to test my hypothesis was the amount of write instructions on the disk. The next thing I wanted are benchmarks, so that I know what is a resonable average queue size.
Unfortunately there was no benchmark available, but a lot of people had posted what their iostat outputs. When I saw this, it immediately became clear that the amount of stuff we were writing was very high:
So I think we can conclude that along with the above table and the "COMMIT"s in the slow query log, the hypothesis is correct, the bottleneck is the hard disk. You can process as many parallel requests as you can but all need to queue up to write on the disk. Apart from that, we also have a web server running, and god knows how many logs waiting to write.
This means that while the status of usual suspects, RAM and CPU was this:
The status of the hard-disk was this:
Now that our stupidity is obvious, we need to quickly decide what to do. I read up a bit on RAID, but its seems an added level of complexity. What I am thinking is adding more servers with fewer accounts on each server, with not as much RAM and CPU horsepower. This will give us capacity to add low cost machines and balance the traffic between memory and disk.
If you are an expert, would love to hear from you.
Post Script #1
Thanks for the awesome response. We are running a master-slave setup (for redundancy). We need binlogs not only for this but also for debugging.
Here is some more info:
This is basically the standard config that gets shipped with Percona release with some increased caches:
Output from mysqltuner.pl:
-------- General Statistics --------------------------------------------------[--] Skipped version check for MySQLTuner script[OK] Currently running supported MySQL version 5.0.86-ius+percona-highperf-log[OK] Operating on 64-bit architecture-------- Storage Engine Statistics -------------------------------------------[--] Status: +Archive +BDB +Federated +InnoDB -ISAM -NDBCluster[--] Data in MyISAM tables: 322M (Tables: 931)[--] Data in InnoDB tables: 6G (Tables: 70633)[!!] BDB is enabled but isn't being used[!!] Total fragmented tables: 91-------- Performance Metrics -------------------------------------------------[--] Up for: 43d 22h 25m 9s (347M q [91.574 qps], 11M conn, TX: 541B, RX: 30B)[--] Reads / Writes: 73% / 27%[--] Total buffers: 10.2G global + 11.2M per thread (500 max threads)[OK] Maximum possible memory usage: 15.7G (66% of installed RAM)[OK] Slow queries: 0% (5K/347M)[OK] Highest usage of available connections: 16% (83/500)[OK] Key buffer size / total MyISAM indexes: 64.0M/2.7M[OK] Key buffer hit rate: 99.8% (610M cached / 1M reads)[OK] Query cache efficiency: 56.4% (104M cached / 185M selects)[!!] Query cache prunes per day: 454539[OK] Sorts requiring temporary tables: 1% (48K temp sorts / 4M sorts)[!!] Temporary tables created on disk: 40% (12M on disk / 31M total)[OK] Thread cache hit rate: 99% (9K created / 11M connections)[!!] Table cache hit rate: 0% (2K open / 14M opened)[OK] Open file limit used: 0% (121/65K)[OK] Table locks acquired immediately: 99% (111M immediate / 111M locks)[OK] InnoDB data size / buffer pool: 6.5G/10.0G-------- Recommendations -----------------------------------------------------General recommendations:Add skip-bdb to MySQL configuration to disable BDBRun OPTIMIZE TABLE to defragment tables for better performanceWhen making adjustments, make tmp_table_size/max_heap_table_size equalReduce your SELECT DISTINCT queries without LIMIT clausesIncrease table_cache gradually to avoid file descriptor limitsVariables to adjust:query_cache_size (> 32M)tmp_table_size (> 64M)max_heap_table_size (> 64M)table_cache (> 2048)
Post Script #2
I had mistakenly idenified packet size in iostat as the queue size and had posted a screenshot. I have removed it so as not to mislead. The original hypothesis remains correct. By reducing the number of writes, we were able to speed up the processing.
But there is no alternative to comprehensive montoring.
 Blog to read on analyze disk bottlneck issues:
 Picture Courtesy: Michael Loke, Creative Commons via Filckr
 Picture Courtesy: Frederick Bisson, Creative Commons via Flickr
Rushabh is a software developer and founder of ERPNext. He usually writes about the startup experience, open source and the technologies he is working on.
Thanks for the execellent tips.
We do you Percona. We did a lot more optimization in the a
This is 1+ year old but thought I would give you my todo list for scaling the database
What do you mean by a "metadata driven engine"?
You might want to check if your updates are creating temporary tables. That could be an issue too
@Ravi - I think he's saying that his database size is 16GB which is orders of magnitude smaller t
@Chris, mostly inserts and updates. We have removed most of the redundancies, but the General Led
I think you have answered your question with that POST vs GET thing. You need to remove empty com
Probably nor elated, but disable BDB. You don't need it.
That's a lot of badly fragmented
@Rushabh So as it seems, your DB is fragmented, this causes write to be broken. Fix that.
Thanks for the awesome comments and feedback.
@Scalability Genius, will have to ask our p
Me again. Darned iPad. Change "snorts" to "inserts" and for "won't RAID" I of course meant "want
Post your my.cnf
Looks like you have a lot of writes going on there. Look at the wkB/sec! Massive. Almost no reads
@Scalability Genius: Yea; it will work; But imagine the cost of buying a 1.5TB SSD disk. !!
The first thing you should do is to switch to an SSD.