I spent the weekend testing load balancing with various configurations - Docker, nginx, and several proof-of-concept REST API servers - all to help decide what to use for the future of a project.
Long story short, I have a REST application which handles 134k requests/s (a test case) when run natively. It's a highly optimized framework (Giraffe/Kestrel). This performance is expected on the hardware I have.
As soon as I put it behind nginx (same box) with upstream load balancing, the requests fall to 24k/s.
I've tried various configurations - running four instances of the application (Docker) or one instance - and it makes no difference.
I am using autocannon to measure rps, running it from a Mac mini, with the application and nginx running on an i5-6600K dev server.
My question is… am I expecting too much? Is nginx just inherently slow? Is it normal to lose this much performance with a load balancer? I don't have deep experience with horizontal scaling, but I was under the assumption that at most there would be a latency hit, not a massive throughput hit like this.
Worker Processes
Sets the number of worker processes.
The optimal value depends on many factors including (but not limited to) the number of CPU cores, the number of hard disk drives that store data, and the load pattern. When in doubt, setting it to the number of available CPU cores is a good start (the value "auto" will try to autodetect it).
Worker Connections
Sets the maximum number of simultaneous connections that can be opened by a worker process.
It should be kept in mind that this number includes all connections (e.g. connections with proxied servers, among others), not only connections with clients. Another consideration is that the actual number of simultaneous connections cannot exceed the current limit on the maximum number of open files, which can be changed by worker_rlimit_nofile.
This bit right here:
The number of connections is limited by the maximum number of open files (RLIMIT_NOFILE) on your system
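In practice that means raising the worker file-descriptor limit alongside `worker_connections`. A minimal sketch of the relevant main-context directives (the values here are illustrative, not tuned for any particular box):

```nginx
# main context (outside http{} / events{})
worker_processes auto;          # one worker per CPU core
worker_rlimit_nofile 65535;     # raise RLIMIT_NOFILE for worker processes

events {
    # must not exceed worker_rlimit_nofile
    worker_connections 65535;
}
```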
I did make some notes on this in my own configuration when I was talking about hardening. There are a few things you can do… Let me edit in what I have.
Here's my events block:
## Events Block
```nginx
events {
    # High Throughput Settings
    worker_connections 65535;
    multi_accept on;
    use epoll;
}
```
There are some things you can do to optimize the http{} block too, which I assume is where you are load balancing… I'd have to pick mine apart a bit, so I'm gonna go see what I have in there on mine. Of course YMMV.
I have the following additional parameters (mostly caching and buffer tweaks, as well as TCP tweaks and timeouts):
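A representative sketch of the kind of http{} tweaks meant here - these are illustrative examples with made-up values, not the poster's actual config:

```nginx
http {
    # TCP tweaks
    sendfile on;
    tcp_nopush on;      # send headers and start of response in one packet
    tcp_nodelay on;     # don't buffer small writes on keepalive connections

    # timeouts
    keepalive_timeout 30s;
    send_timeout 10s;

    # buffer tweaks
    client_body_buffer_size 16k;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 8k;
}
```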
You could also try to move to elliptic curve encryption as much as possible if you find it's the handshake that is slowing things down. That's in the hardening thing I wrote.
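If the TLS handshake does turn out to be the cost, preferring ECDHE key exchange (plus session resumption) might look something like this - the cipher list is illustrative, so check current best practice before copying it:

```nginx
server {
    listen 443 ssl;
    ssl_protocols TLSv1.2 TLSv1.3;
    # prefer elliptic-curve key exchange
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers on;
    # resume sessions instead of doing full handshakes every time
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1h;
}
```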
That's most of my work, and it's solid now.
I hated figuring this out the first time lol
Be aware that
client_max_body_size 30G;
is a setting to match my Nextcloud through the reverse proxy. You don't necessarily need to set this.
Dynamic just informed me in casual chit-chat that you can indeed set this to unlimited (a value of 0 disables the body-size check) if you don't want to tune it.
Also, if the servers aren't equal in grunt… list them in the balancer from most powerful to least. NGINX reads in order and balances in order. Not sure if it round robins, but this is what I've found.
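For the record, nginx's default upstream algorithm is weighted round-robin, so unequal servers are usually handled with explicit weights rather than list order - a sketch with made-up addresses:

```nginx
upstream backend {
    # default balancing is round-robin; weight skews it toward stronger boxes
    server 10.0.0.1:8080 weight=4;   # most powerful
    server 10.0.0.2:8080 weight=1;   # least powerful
}
```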
Gentlemen, thank you for your suggestions. I ended up digging further into nginx than I ever had in the last 10+ years of setting it up as a reverse proxy in various environments. It's pretty excellent software.
This is a REST-only proxy, so no files are ever read. Tried everything - ignoring the body, turning compression off/on, forcing HTTP/1.1… but it just would not budge past 25k rps.
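One thing worth ruling out at these request rates: by default nginx opens a fresh connection to the upstream for every proxied request, and enabling upstream keepalive can change throughput dramatically. A sketch of what that looks like (the upstream name and port are made up):

```nginx
upstream api_backend {
    server 127.0.0.1:5000;
    keepalive 64;                       # pool of idle upstream connections to reuse
}

server {
    listen 80;
    location / {
        proxy_pass http://api_backend;
        proxy_http_version 1.1;         # upstream keepalive needs HTTP/1.1
        proxy_set_header Connection ""; # clear the default "Connection: close"
    }
}
```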
I know there are more sophisticated debug tools that can show what's really slowing it down, but this is essentially a pass-through URL that returns some JSON - a step above "hello world".
I was going to reinstall the whole system, since this dev system is Ubuntu Desktop 20.04 that has been distro-upgraded several times - my only guess was that the networking was somehow misconfigured in a way that's out of my depth.
So I've tried HAProxy, just to confirm it's not isolated to nginx - and it handled 114k rps. Running on the same host, that's reasonable. I've ended up going with it for now (it's also as simple to set up as nginx, unlike Traefik). There are drawbacks, but I will revisit if serving static content becomes a requirement.
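For comparison, a minimal HAProxy config for this job is indeed short - the addresses and timeouts below are illustrative only, not the setup described above:

```
# haproxy.cfg sketch
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend fe_api
    bind *:80
    default_backend be_api

backend be_api
    balance roundrobin
    server app1 127.0.0.1:5000 check
    server app2 127.0.0.1:5001 check
```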
If it's a dev system and not prod… I hate to suggest it, but why not Fedora or Arch? If you're hell-bent on reinstalling, it might be prudent to install something more "agile", to say the least. Then again, I don't know any details of your setup, and if this conflicts with that, don't follow that advice…
Sounds good dude. That's territory I'm not familiar with. NGINX is my Swiss Army knife, but if HAProxy works better, use it.
I would love to - Arch sounds really cool. I am keeping it Ubuntu because most NVIDIA + TensorFlow packages are configured for Ubuntu/Debian. Path of least resistance. In production it's almost always a container image.