MySQL Challenge: 100k Connections


Vadim Tkachenko2019-02-25T10:00:28+00:00

In this post, I want to explore a way to establish 100,000 connections to MySQL. Not just idle connections, but executing queries.

100,000 connections. Is that really needed for MySQL, you may ask? Although it may seem excessive, I have seen a lot of different setups in customer deployments. Some deploy an application connection pool, with 100 application servers and 1,000 connections in each pool. Some applications use a “re-connect and repeat if the query is too slow” technique, which is a terrible practice. It can lead to a snowball effect, and could establish thousands of connections to MySQL in a matter of seconds.

So now I want to set an overachieving goal and see if we can achieve it.

Setup

For this I will use the following hardware:

Bare metal server provided by packet.net, instance size: c2.medium.x86 Physical Cores @ 2.2 GHz (1 X AMD EPYC 7401P) Memory: 64 GB of ECC RAM

Storage : INTEL® SSD DC S4500, 480GB

This is a server grade SATA SSD.

I will use five of these boxes, for the reason explained below. One box for the MySQL server and four boxes for client connections.

For the server I will use Percona  Server for MySQL 8.0.13-4 with the thread pool plugin. The plugin will be required to support the thousands of connections.

Initial server setup

Network settings (Ansible format):

These are the typical settings recommended for 10Gb networks and high concurrent workloads.

Limits settings for systemd:

And the relevant setting for MySQL in my.cnf:

For the client I will use sysbench version 0.5 and not 1.0.x, for the reasons explained below.

The workload is sysbench --test=sysbench/tests/db/select.lua --mysql-host=139.178.82.47 --mysql-user=sbtest --mysql-password=sbtest --oltp-tables-count=10 --report-interval=1 --num-threads=10000 --max-time=300 --max-requests=0 --oltp-table-size=10000000 --rand-type=uniform --rand-init=on run

Step 1. 10,000 connections

This one is very easy, as there is not much to do to achieve this. We can do this with only one client. But you may face the following error on the client side:

FATAL: error 2004: Can't create TCP/IP socket (24)

This is caused by the open file limit, which is also a limit of TCP/IP sockets. This can be fixed by setting   ulimit -n 100000  on the client.

The performance we observe:

Step 2. 25,000 connections

With 25,000 connections, we hit an error on MySQL side:

Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug

If you try to lookup information on this error you might find the following article:  https://www.percona.com/blog/2013/02/04/cant_create_thread_errno_11/

But it does not help in our case, as we have all limits set high enough:

This is where we start using the thread pool feature:  https://www.percona.com/doc/percona-server/8.0/performance/threadpool.html

Add:

thread_handling=pool-of-threads

to the my.cnf and restart Percona Server

The results:

We have the same throughput, but actually the 95% response time has improved (thanks to the thread pool) from 3690 ms to 979 ms.

Step 3. 50,000 connections

This is where we encountered the biggest challenge. At first, trying to get 50,000 connections in sysbench we hit the following error:

FATAL: error 2003: Can't connect to MySQL server on '139.178.82.47' (99)

Error (99) is cryptic and it means: Cannot assign requested address.

It comes from the limit of ports an application can open. By default on my system it is

cat /proc/sys/net/ipv4/ip_local_port_range : 32768   60999

This says there are only 28,231 available ports — 60999 minus 32768 — or the limit of TCP connections you can establish from or to the given IP address.

You can extend this using a wider range, on both the client and the server:

echo 4000 65000 > /proc/sys/net/ipv4/ip_local_port_range

This will give us 61,000 connections, but this is very close to the limit for one IP address (maximal port is 65535). The key takeaway from here is that if we want more connections we need to allocate more IP addresses for MySQL server. In order to achieve 100,000 connections, I will use two IP addresses on the server running MySQL.

After sorting out the port ranges, we hit the following problem with sysbench:

In this case, it’s a problem with sysbench memory allocation (namely lua subsystem). Sysbench can allocate memory for only 32,351 connections. This is a problem which is even more severe in sysbench 1.0.x.

Sysbench 1.0.x limitation

Sysbench 1.0.x uses a different Lua JIT, which hits memory problems even with 4000 connections, so it is impossible to go over 4000 connection in sysbench 1.0.x

So it seems we hit a limit with sysbench sooner than with Percona Server. In order to use more connections, we need to use multiple sysbench clients, and if 32,351 connections is the limit for sysbench, we have to use at least four sysbench clients to get up to 100,000 connections.

For 50,000 connections I will use 2 servers (each running separate sysbench), each running 25,000 threads from sysbench.

The results for each sysbench looks like:

So we have about the same throughput (16794*2 = 33588 tps in total), however the 95% response time doubled. This is to be expected as we are using twice as many connections compared to the 25,000 connections benchmark.

Step 3. 75,000 connections

To achieve 75,000 connections we will use three servers with sysbench, each running 25,000 threads.

The results for each sysbench:

Step 4. 100,000 connections

There is nothing eventful to achieve75k and 100k connections. We just spin up an additional server and start sysbench. For 100,000 connections we need four servers for sysbench, each shows:

So we have the same throughput (8065*4=32260 tps in total) with 3405ms 95% response time.

A very important takeaway from this: with 100k connections and using a thread pool, the 95% response time is even better than for 10k connections without a thread pool. The thread pool allows Percona Server to manage resources more efficiently and provides better response times.

Conclusions

100k connections is quite achievable for MySQL, and I am sure we could go even further. There are three components to achieve this:

  • Thread pool in Percona Server
  • Proper tuning of network limits
  • Using multiple IP addresses on the server box (one IP address per approximately 60k connections)

Appendix: full my.cnf