Tuning Linux memory vm options to reduce disk IO in cloud

I have a Java application running at clients that has a huge dataflow (many GB of source data). A particularly expensive step is collating data - combining data from multiple sources to get a complete, unified view. It's basically a huge join operation, but not using a standard DB - we rolled our own storage layer for good reasons irrelevant to this discussion. Anyway, this creates large “bucket” datafiles; the maximum size of a bucket file is ~2GB. These buckets are constantly mutated as source data is ingested, collated, and written to the appropriate bucket.

We’ve found that when running on fast SSDs on our dev machines, we’re generally processing-bound. However, as our clients shift their workloads into the cloud, we want to optimize the solution for that environment.

When the application runs in cloud environments, it burns through its provisioned IO credits very quickly (the disks are obviously much faster than the ones in our machines), and then the process can hang because the disk seems unavailable. I was surprised at how hard the IOPS cutoff is, in AWS at least: it doesn’t trickle when you run out - you get IO Disk Timeout exceptions because the disk just stops responding, sometimes for minutes until AWS refreshes your credits. We can increase the provisioned IOPS, but memory utilization on the instance is fairly low, so I’d like to leverage that before throwing more money at the problem.

Can I use vm.dirty_ratio or similar options so that memory absorbs the heavy IO load? My reasoning is that if the files are synced to disk less often, we won’t run out of those credits as easily. I can’t quite go full tmpfs ramdisk because I cannot predict how large the files will grow.
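
To check whether that’s actually happening, my plan is to dump the current vm.dirty_* knobs and then watch the Dirty and Writeback counters in /proc/meminfo while a collation run is in progress. Something like this rough Java sketch is what I have in mind (the procfs paths are standard, but the knobs I’ve picked and the 5-second polling interval are just guesses at what’s useful):

```java
// Rough monitoring sketch: print the current vm.dirty_* sysctls, then poll
// the Dirty/Writeback lines from /proc/meminfo to see how much dirty data
// the page cache is holding back during a collation run.
import java.nio.file.*;
import java.util.List;

public class DirtyWatch {
    static String read(String path) throws Exception {
        return Files.readString(Path.of(path)).trim();
    }

    public static void main(String[] args) throws Exception {
        // Current writeback tuning (percent of memory / centiseconds).
        for (String knob : List.of("dirty_ratio", "dirty_background_ratio",
                                   "dirty_expire_centisecs", "dirty_writeback_centisecs")) {
            System.out.println("vm." + knob + " = " + read("/proc/sys/vm/" + knob));
        }
        // Sample how much dirty data is pending and how much is being written back.
        while (true) {
            for (String line : Files.readAllLines(Path.of("/proc/meminfo"))) {
                if (line.startsWith("Dirty:") || line.startsWith("Writeback:")) {
                    System.out.println(line);
                }
            }
            Thread.sleep(5_000); // arbitrary interval
        }
    }
}
```

If Dirty grows large while Writeback stays small until the ratio is hit, that would at least confirm the page cache is absorbing the writes the way I’m hoping.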

I’ll need to run tests to see the effects of these options myself, but any advice would be appreciated.


Hot smoke, this seems like the perfect thread to shill for Linode ( https://linode.com/level1techs ). Get $100 credit using this referral link when creating a new account and test your application. Since you are in a business environment, you should probably use the IT department’s group mail, as opposed to your (personal or work) email.

I know. I hate AWS. Well, technically, they have a good short-term business model. It encourages customers to pay for higher tiers if they want consistent performance.

Anyway, I was about to suggest adding RAM and putting your datafiles on a tmpfs. I thought each file stayed under 2GB and then got flushed to disk, but I guess you have multiple buckets, which is why you can’t predict their total size.

I’ve used AWS in the past, but I am not that experienced. Maybe try to buy a plan with a guaranteed minimum IOPS, as opposed to a plan with lots of IOPS? Sure, your application might be a little slower, but it will be predictable, as opposed to fast for a few days and slow for the rest of the month (or similar).

Edit: someone on the forum is building a data center, so you could strike a deal with him? (@judahnator)

I really like Linode for personal stuff, but the company I work for and many of our clients are very focused on AWS. I don’t mind it - they are spending their money and AWS has great capabilities - but I can’t push another provider on them. I do want to know if anyone has experience tuning cloud environments for large DBs or other IO-heavy batch operations. I’ll look into minimum IOPS as opposed to maximum burst IOPS - that sounds like more of what I need.

Am I barking up the wrong tree with the vm.dirty_* options? If anyone has a way to measure how often a process actually writes to disk (rather than to the memory cache), I can use it in my experiments and post the results here.
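
For my own testing I was thinking of sampling /proc/<pid>/io, where (as I understand it) write_bytes counts bytes that actually reached the block layer, wchar counts every write() call including ones absorbed by the page cache, and cancelled_write_bytes is dirty data that never had to be flushed. A rough, untested sketch:

```java
// Rough sketch: sample /proc/<pid>/io twice and print the deltas, to compare
// bytes actually pushed to the block layer (write_bytes) against bytes written
// into the page cache (wchar). Pass a PID as the first argument, or omit it to
// inspect this JVM ("self"). Reading another process's io file generally needs
// the same user or root. The 10-second sample window is arbitrary.
import java.nio.file.*;
import java.util.*;

public class ProcIoDelta {
    static Map<String, Long> sample(String pid) throws Exception {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : Files.readAllLines(Path.of("/proc/" + pid + "/io"))) {
            String[] kv = line.split(":\\s*"); // e.g. "write_bytes: 1048576"
            counters.put(kv[0], Long.parseLong(kv[1]));
        }
        return counters;
    }

    public static void main(String[] args) throws Exception {
        String pid = args.length > 0 ? args[0] : "self";
        Map<String, Long> before = sample(pid);
        Thread.sleep(10_000);
        Map<String, Long> after = sample(pid);
        for (String key : before.keySet()) {
            System.out.printf("%-22s +%d%n", key, after.get(key) - before.get(key));
        }
    }
}
```

Comparing the write_bytes delta against the volume’s provisioned throughput should show how quickly we’re eating into the credits.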