Background
I went to Intel Innovation 2023 to try to get a better understanding of Intel’s overall strategy for server processors going forward – at least when it comes to “Uncore” stuff like accelerators.
I’ve already done a lot of work with Glenn Berry on Microsoft SQL Server and QAT. Intel’s QAT enables a unique and game-changing bit of functionality for backup acceleration and compression: on a busy SQL Server, hardware-accelerated backup compression takes very little CPU away from the actual SQL Server work when your server processor has QAT. (A software fallback can still be used in lieu of the hardware, but that eats into the available CPU headroom.)
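For reference, the SQL Server side looks roughly like this – a hedged sketch only, assuming SQL Server 2022’s QAT backup compression support; the database name and backup path are placeholders:

# One-time setup: enable hardware offload, point it at QAT, then restart SQL Server
sqlcmd -Q "EXEC sp_configure 'hardware offload enabled', 1; RECONFIGURE;"
sqlcmd -Q "ALTER SERVER CONFIGURATION SET HARDWARE_OFFLOAD = ON (ACCELERATOR = QAT);"
# After the restart, take a QAT-compressed backup (placeholder database and path)
sqlcmd -Q "BACKUP DATABASE MyDb TO DISK = 'D:\backups\MyDb.bak' WITH COMPRESSION (ALGORITHM = QAT_DEFLATE);"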
Compression/decompression isn’t the only useful thing QAT can do: it can also offload TLS encryption, the kind used all over the web, for a web server. This can dramatically lower the CPU overhead of setting up and tearing down TLS connections.
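A quick way to see that effect in isolation, assuming QAT_Engine is installed and registered with OpenSSL under its default engine id qatengine, is to compare handshake-style crypto with and without the engine:

# RSA sign/verify throughput in pure software
openssl speed rsa2048
# The same test offloaded through QAT; -async_jobs keeps the hardware fed with in-flight requests
openssl speed -engine qatengine -elapsed -async_jobs 72 rsa2048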
I set out to investigate.
Intel Docker/Kubernetes resources
This is the main resource:
This is a great source of knowledge and ready-to-roll demos, not just for QAT but also for FPGA, DSA, and even plain GPU use cases.
It can be a little discouraging that one of the first search hits for Intel QAT Docker demos is this image, last updated 3 years ago:
https://hub.docker.com/r/intel/crypto-reference-stack
… but it is possible to find much more recent demonstrations of TLS acceleration with nginx, as used by other commercial projects on GitHub:
Getting Started
When running Docker, or anything else, it is important that the host has the QAT driver installed and working.
This guide is the most reasonable start on that, if a little obtuse.
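Once the driver is in place, a couple of quick sanity checks on the host are worth running (hedged: the lspci description varies by QAT generation, and adf_ctl only ships with Intel’s out-of-tree driver package):

# Is a QuickAssist endpoint visible on the PCIe bus?
lspci | grep -i quickassist
# With the out-of-tree driver, list the QAT devices and confirm each reports state: up
adf_ctl status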
Before proceeding further, use the qzip utility to test that QAT can be used for data compression and confirm it is working properly.
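Something along these lines works as a smoke test (a minimal sketch; qzip largely mirrors gzip’s command-line behaviour, and the test file is just a placeholder):

# Generate some highly compressible test data
dd if=/dev/zero of=/tmp/qat-test.bin bs=1M count=256
# Compress it through QAT; -k keeps the original around for comparison
qzip -k /tmp/qat-test.bin
# QATzip emits standard gzip output, so verify the round trip with stock tools
zcat /tmp/qat-test.bin.gz | cmp - /tmp/qat-test.bin && echo "QAT compression OK"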
While the docker images contain QATzip and QATengine, you must configure QATzip and QATengine on each host that the containers run on. The QATzip configuration files are located at QATzip/config_file and the QATengine configuration files are located at QAT_Engine/qat_hw_config.
There are multiple versions of the configuration files optimized for different adapters and usage scenarios. Select the ones that meet your adapter and usage pattern. Copy them to the /etc directory. Note that QATzip looks for NumberDcInstances and QATengine looks for NumberCyInstances. Thus you will need to merge the QATzip and QATengine configuration files together, as you need both in NGINX.
For example, /etc/c6xx_dev0.conf might look similar to the following:
^ This is from the readme on OpenVisualCloud’s QAT nginx container for Docker.
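The readme’s own example isn’t reproduced here, but a merged file ends up looking roughly like the sketch below. This is illustrative only, not the readme’s verbatim config; the section and key names follow the QATzip/QAT_Engine sample configs, and the values need tuning for your adapter and process count:

# Write a minimal merged config with one crypto and one compression instance, then bounce the driver
cat > /etc/c6xx_dev0.conf <<'EOF'
[GENERAL]
ServicesEnabled = cy;dc
ConfigVersion = 2

[KERNEL]
NumberCyInstances = 0
NumberDcInstances = 0

[SHIM]
NumberCyInstances = 1
NumberDcInstances = 1
NumProcesses = 8
LimitDevAccess = 0
# Crypto instance consumed by QATengine
Cy0Name = "UserCY0"
Cy0IsPolled = 1
Cy0CoreAffinity = 0
# Compression instance consumed by QATzip
Dc0Name = "UserDC0"
Dc0IsPolled = 1
Dc0CoreAffinity = 1
EOF
# Repeat for each endpoint (c6xx_dev1.conf, c6xx_dev2.conf, ...) and restart the driver
adf_ctl restart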
If you’re using Kubernetes, of course, you can just follow Intel’s device plugins for Kubernetes.
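In that setup the host still needs a working driver, but containers ask for QAT endpoints as extended resources. Roughly like this – hedged, since the deployment path and resource name depend on the plugin version and its configuration (qat.intel.com/generic is the traditional generic resource), and the pod image is just a placeholder:

# Deploy Intel's QAT device plugin from the intel-device-plugins-for-kubernetes repo
kubectl apply -k 'https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/qat_plugin?ref=main'
# A pod then requests a QAT VF through an extended resource limit
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: qat-demo
spec:
  containers:
  - name: app
    image: intel/crypto-reference-stack   # placeholder image
    resources:
      limits:
        qat.intel.com/generic: 1
EOF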
I Want To Use This
The Dockerfile has lots of cool hard-learned lessons in it:
… one can take a lot away from the OpenSSL setup they’ve automated in the container here if you are contemplating adopting this for production.
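If you do go down that road, the first thing worth checking in the resulting container is that OpenSSL actually sees the engine (assuming QAT_Engine registers under its default engine id, qatengine):

# Inside the built container, with the QAT devices passed through:
# -t tests that the engine is actually usable, -c lists the algorithms it offloads
openssl engine -t -c qatengine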
Testing
Writeup TODO: paste YouTube video here.
Final Notes
So the one fly in the ointment here is that in 99% of use cases your web server is NOT continuously tearing down and setting up encrypted sessions. The bottleneck won’t be how many TLS connections you can open and then close. In the real world, the web server will let clients keep a connection open for a fixed number of seconds, a fixed number of requests, or some combination of parameters like that, and that connection re-use is what really speeds things along. QAT is still quite useful for shifting load off the CPU, though, both for opening new connections and for maintaining existing ones.
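For reference, that re-use policy is just the web server’s keep-alive configuration. In nginx terms it boils down to something like this (illustrative values, dropped into a file the stock nginx.conf includes in its http block):

# Let clients re-use a connection for up to 65 seconds or 1000 requests
# before it is closed and a new TLS handshake is required
cat > /etc/nginx/conf.d/keepalive.conf <<'EOF'
keepalive_timeout  65;
keepalive_requests 1000;
EOF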
Real world? It’s going to depend on your workload. If you’re running a microservice, you might actually see 90% of the benefit I’ve shown off here. Running a bog-standard web app? You could still easily squeeze in 20% more connections over what you would manage without QAT.
In my mind the real problem with QAT adoption is that it needs to be ubiquitous, and it needs to be on more folks’ radar.