Well, I hope it does not use sync for everything.
With NFS, the whole story changes again: there, sync writes are the default.
Anyway, you could bypass this problem by setting the borg share’s dataset to sync=disabled on TrueNAS.
That way nothing can be written in sync. The only downside is that if your system goes down right after a backup finished, the last transaction group (roughly the final 5 seconds of writes) will be lost.
For a backup tool that does not do incremental backups or any integrity checking, this should be a non-issue. Not sure about borg.
In that scenario you don’t need a SLOG, since nothing will ever touch it.
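If you want to try it, a minimal sketch from the TrueNAS shell, assuming the borg dataset is called tank/borg (adjust to your own pool/dataset names):

```
# Disable sync writes on the borg dataset (hypothetical name tank/borg)
zfs set sync=disabled tank/borg

# Verify the setting
zfs get sync tank/borg

# Revert to the default behaviour later if needed
zfs set sync=standard tank/borg
```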
Please report back what borg tells you.
Advice:
I would not deploy CORE in 2025. As per above, the future is SCALE, and whether or not somebody forks it, I’d stick with what iXsystems is pushing so that you can continue to get paid support if ever required.
Also, I would not run anything else on this box. Use it as a dedicated storage machine; using it for other things like running VMs is just going to make it more fragile, and for a backup location that’s probably not what you want. You’re also inevitably going to end up with something requiring I/O throughput that will interfere with your backups. Based on work experience: let backup be backup and keep it as simple as possible.
Also - as above, more RAM = more cache. Cache can only give you so much, however, and eventually the writes and reads need to hit the disk. Cache is not some infinite-speed magic, and being mostly write-heavy (you back up more often than you restore), ZFS will be flushing to disk every 5 seconds or so (from memory).
Being backup, it isn’t so performance critical anyway, and with only dual 10 or 25 gig NICs, plus the fact that ZFS turns most writes into largely sequential ones, you’re probably going to be network bound with any serious disk layout (such as dual RAIDZ2 or dual RAIDZ3).
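If you want to check that flush interval on your own box, it’s the ZFS txg timeout (default 5 seconds); rough sketch:

```
# TrueNAS SCALE (Linux): current txg timeout in seconds
cat /sys/module/zfs/parameters/zfs_txg_timeout

# TrueNAS CORE (FreeBSD): the same tunable as a sysctl
sysctl vfs.zfs.txg.timeout
```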
I don’t think I can get paid support in the first place since I’m building my own machine. They offered to make me one, but the price would have come out way higher than building my own.
I’m just going to configure distcc or icecream to use it if the machine is idle. Probably going to take a decent amount of time on my part to figure out how to spread the hardware across both CPUs for that. I might just use the second CPU for that if the first is more than adequate.
I just care about restoring fast. My backups currently take about 6 minutes each; restoring files is way slower.
The basic idea is to max out RAM to get as much ARC as possible. For L2ARC I’d want NVMe drives, ideally Optane? There’s a tool for figuring that out, right?
For storing the ZIL on a SLOG, would a SATA SSD do, or should I stick with Optane for that? I’m probably going to add a SATA SSD for swap space, but I expect to almost never hit it.
Other than that, I’m eventually picking up a UPS and some solar batteries so that if power goes down for multiple days I still have cellular internet.
I’m sure as I add more machines to my network I’ll repurpose hardware and make each machine more specialized for specific tasks.
A point of note here. ARC only caches reads and is what people are referring to regarding more RAM. The ZIL isn’t really cache. It’s basically a journal to recover from in case of power loss, etc.
EDIT: this is why I’m always leery of people talking about cache drives and ZFS. There’s such a big difference between how ZFS works and what people assume. They’re generally picturing/wanting something akin to Unraid, etc.
iX offers paid consulting services which can both help you design your system as well as support it. Not sure if it’s a plan or pay per incident/time.
As noted, the more you have a machine do, the more complicated things get. But I’m curious. What are you doing that you need so much compilation?
This is why I said previously that you need to figure out your existing setup and its bottlenecks. What exactly is the restore process and why is it slow? Is it actually a storage or network problem, or something completely unrelated? If the backup server is hitting a limit, changing out the storage or network would give you no benefit.
The fact that you’re asking those questions shows how much you don’t understand how ZFS works. I would highly recommend the iX consulting services I mentioned above as they can take a look at your actual setup and help you design a customized solution that will do what you need.
The goal of L2ARC is to provide additional space for the ARC to spill into. As things get evicted from ARC they will land in L2ARC. This means that you have to spend RAM to keep track of them, which reduces what’s available to ARC. Whether this helps or hurts depends on your workload. L2ARC should be faster than your pool but will be far slower than your RAM.
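Before spending money on L2ARC devices, it’s worth looking at the ARC hit rate on your existing system first; the stock OpenZFS tools will tell you (rough sketch):

```
# Summary of ARC size, hit ratio and (if present) L2ARC stats
arc_summary

# Live view of ARC hits/misses, one line per second
arcstat 1
```

If the hit ratio is already high, an L2ARC won’t buy you much.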
ZIL is always written. It’s there to provide recovery from power loss, etc. SLOG just moves it to a device that’s separate from your pool. It’s never read unless there’s a problem. Attempting to use a SATA SSD for a SLOG will bottleneck your writes and be worse than not having one. Optane is what you want, but again, depending on your setup, may or may not provide any benefit.
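If testing does show a sync-heavy workload, attaching a SLOG is a one-liner; a sketch only, with a hypothetical pool name (tank) and device (/dev/nvme0n1):

```
# Attach a separate log (SLOG) device to the pool
zpool add tank log /dev/nvme0n1

# Confirm the log vdev shows up
zpool status tank
```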
Why are you concerned about swap? Are you planning on making changes under the hood? AFAIK, TN does not provide any options regarding swap location.
Have you measured your existing power usage? Providing several days of power for this much hardware is likely going to require a lot bigger solar and battery setup than you realize. Is your power that unreliable? Do you need to have everything up and running the whole time?
As I said before, I’m really curious what you’re working on that has led you down this path.
I don’t really see the point of paying for that when there are a bunch of benchmarking tools out there and internet forum posts to dig through. I’m more likely to pay an iXsystems competitor that forks TrueNAS. If it’s Deciso, I already get very good support from them for free.
I might pay one of those two in the future if I need to get a system back up in hours versus a couple of days. But, that type of support is extremely expensive and not needed at this point.
I write C++, and I do not want compilation taking more than a second with my workflow. Ideally, 0.1 seconds. I change a line of code, I compile. Change another line, I compile. I want rapid feedback on whether a code change is correct or not. Sometimes I have to wait 7 seconds, and that is way too long. I’m benchmarking with Chromium, or another large project known for long compile times. Those projects take 30 seconds to 3 minutes to compile while editing code, which is forever when you want rapid feedback on whether a change compiles.
My existing setup is an external HDD connected through a USB port. The length of time to restore is most likely down to the HDD. Getting reads from an NVMe drive would be ideal, while storing the bulk of the data on the HDDs.
Well, yeah, it’s like the L2 cache you find on a CPU.
I thought the idea behind a SLOG was dumping the data to a faster medium to optimize write speeds to the slower HDDs. Not sure if I need that, which is why I’ll be running the various benchmarking tools to figure out if I get a benefit. From the sound of the documentation, a SLOG on a SATA SSD should speed up writes by avoiding double-writing to the HDDs. But of course Optane and NVMe would be better for this.
I usually add swap to systems to avoid OOM edge conditions. I usually spec it out to be twice my RAM size. It’s more of a habit.
Yeah, I need a 24/7 internet connection. I lose money if certain devices go down. This is way out in the future and beyond this machine.
So, it would still be helpful for my media server? From reading through forum posts it sounds like it can help with backup writes and reads, depending on workload and hand tuning. That’s why I’ll be running synthetic workloads to figure out if I need it. I do need to figure out how borg compares to rsync for writing data. If it’s anything like rsync, then the L2ARC would help, as that helps rsync according to this.
If it can be tuned to cache all of the data on my HDDs, and RAM won’t already be caching it then I could see a use case for sure.
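To compare borg and rsync I’ll probably just time the same data through both, since each can report its own throughput; something like this, with placeholder paths (and an already-initialized borg repo):

```
# borg: --stats prints original/compressed/deduplicated sizes and duration
borg create --stats /mnt/tank/borgrepo::test-{now} /home/me/data

# rsync: --stats prints bytes transferred and speedup
rsync -a --stats /home/me/data/ /mnt/tank/rsync-test/
```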
By the way, TrueNAS CORE is receiving support for another seven years because of corporate contracts iXsystems has. That’s why you’re not seeing any forks popping up quickly.
Don’t take this as an insult, but it helps most if you are as clueless as you are. It will even be cheaper in the long run.
TrueNAS SCALE is just Debian with ZFS and a nice NAS GUI on top.
Deciso is a firewall company whose product is based on FreeBSD.
If you compare these two, you don’t understand anything at all.
The idea of a SLOG is to move the ZIL from the pool to a separate device.
That device should be able to handle sync writes fast.
The only devices that can handle sync writes fast are PLP SSDs and Optane.
ZIL is only used for sync writes, never for async!
So sync writes, even with an SSD of unlimited performance, will only ever be able to achieve parity with async write performance!
If your workload is async, the ZIL or SLOG will never get used!
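An easy way to check on a live system: watch the pool’s per-vdev I/O while the backup runs; if there is a log vdev, it only sees traffic when the workload actually issues sync writes. Rough sketch, assuming a pool named tank (the dataset name is a placeholder too):

```
# Per-vdev I/O statistics, refreshed every second
zpool iostat -v tank 1

# And check what the dataset is set to (standard / always / disabled)
zfs get sync tank/backups
```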
Well, it depends. Rsync probably lists metadata. If that metadata is in ARC or L2ARC, yeah, it gets faster. The advantage of a special vdev is that not only metadata reads but also writes are faster, and the data does not need to become hot first. The downside is that a special vdev, unlike L2ARC, is critical for your pool.
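For completeness, adding a special vdev looks like this; a sketch with hypothetical device and dataset names, and note it has to be redundant (mirrored) precisely because losing it loses the pool:

```
# Add a mirrored special vdev for metadata
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1

# Optionally send small records (e.g. <= 64K) to the special vdev as well
zfs set special_small_blocks=64K tank/backups
```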
I’m a FreeBSD nerd from way back (like… 3.0)… your assessment of this is based on what exactly?
Gut feeling, actual CVEs that have been posted or…?
Also… compiling stuff, etc… we building a backup box here or a compute box?
I agree with posters above, figure out what you want and get advice on that specific workload - sounds like you are just throwing a bunch of resources at what you think might be a problem based on forum posts when in actuality a lot of this hardware you’re planning to use is irrelevant for your use case.
Again, rather than throw a heap of resources at this box so it can be a jack of all trades, I’d split the non-backup workload to a separate box.
You want to back up 100 PCs - how much data in total, how often, differential or full copy every time?
Most of this hardware I have to get anyways. I’m just trying to plan out the build a bit before planning for the additional drives on top of the HDDs I’m running.
Deciso and their community members are the ones forking TrueNAS Core. They’re already maintaining FreeBSD; it’s just a different GUI on top of it.
I probably don’t need a SLOG. I still think a SATA SSD would speed up writes, though not as much as Optane or another NVMe drive. When I first started looking into FreeNAS years ago, SATA SSDs were used for the layer above the HDDs. The only way I’d get any benefit from a SLOG is if just having one leads to my workload creating more contiguous writes on the HDDs.
I’ll probably end up needing a special vdev for metadata / small files. Maybe an L2ARC might help if sized to keep a copy of all of my data in it. Other than that, some mirrored NVMe vdevs might help with performance. But all of this I’ll figure out after getting my HDDs up and running and testing with various workloads.
Combination of gut feeling and the BSDs staying around in networking equipment. They’ll probably always have a niche use case. There’s also the fact that Debian, and Linux in general, is starting to have problems with communism / socialism, while the BSDs stay out of politics. I think everyone getting banned from Linux communities for saying slightly offensive things might migrate to the BSDs.
It’s mostly a backup / storage box. It’s probably going to take me five to ten years to build out to 100 computers. That’s the most Sensei supports for the home license. I’ll be building more specialized boxes during that time when it makes sense. I might turn this into an IPFS box by that point.
I’m most likely going to have extra cycles left over in the beginning, and I’ll use the computing resources. It’s not that hard to load balance a home setting versus a corporate one. Currently, I only have three computers that are going to be consistently writing to this. Each of their initial backups shouldn’t take more than half an hour, and more likely closer to six minutes. So there will be times when the CPU is not getting used.
I’ll have to look into Icecream, but I believe it’s fairly easy to balance loads with, and it won’t even touch this machine if a bunch of heavy writes or reads are going on. If I were in a corporate setting, administering this would be hell. But for a home or small office setting it shouldn’t be any worse than ricing a system.
The only problem with that is it would probably cost me more in the short term. Ideally, I’d have something like five different servers for the different workloads, some without access to the internet. Instead I’m opting for this box being a repo cache, media storage, backup target, distcc box, and whatever else. I’m mostly focusing on performance for the backups, since that is most of the workload; every other purpose is secondary.
I’m trying to keep my max budget at $30k while shooting for $20k. Most of that budget is for adding higher-level storage if it proves beneficial for the workloads used. If I started speccing all of this out as separate machines, I’d easily be spending $50k or more.
If this does become an IPFS box within a couple of years, being a jack of all trades would help with unknown data access patterns. If that works out, I can eventually run specialized nodes and benchmark which brings in more per dollar spent.
What kind of migrations are you doing? In place upgrades, new machines, reinstalls? I still have a Core machine because it has a bunch of jails I need to deal with.
I have noticed that a lot of people have a similar attitude to the OP. This must be what having teenagers is like.
It’s kind of mind boggling: their resistance to dealing with iX, yet their insistence on TN. ZFS is the source of the majority of the complexity, and there are several places that specialize in ZFS consulting.
I don’t know if you’re going to get through to them. They’re apparently self-hosting money-making things requiring five nines, and willing to drop $20-30k on equipment, yet not willing to invest anything in doing any of their goals correctly.
Nooooooo… I mean yeah, a pool consisting of SSDs speeds up writes.
You know what, forget it. Just read this flow chart:
What layer?
Wrong again. SLOG speeds up SYNC writes and sync writes only.
You don’t need both L2ARC and a special vdev.
If you can keep all your data in L2ARC, you’d be better off building a pool on that drive directly and not wasting any RAM on L2ARC.
This is probably the most stupid thing I have read in a long time.
Ahh you are one of these fantasy land builders.
Thank god, for a moment I was worried that there is a company with 100 employees and such an overwhelmed IT admin.
If you plan on using distcc or similar, you’re likely to be bottlenecked by the network. Your money may be better spent on buying a faster workstation. I’d look at the 32 core 7970X Threadripper with its 5.3 GHz boost clock for the linking stage.
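If you do go the distcc route anyway, the client-side setup is small; a sketch with a hypothetical helper address and job counts:

```
# Tell distcc where the helpers are: local cores plus the remote box
export DISTCC_HOSTS="localhost/16 192.168.1.50/32"

# Build with enough parallel jobs to keep every host busy
make -j48 CC="distcc gcc" CXX="distcc g++"
```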
I have one of those 64 core machines. It’s fast, but can still be faster. It takes like 15 minutes to compile a full Gentoo install. It takes more time figuring out how to get custom packages compiling properly. I’ll probably upgrade my network infrastructure to try to get that down to a minute.
I’ve seen some empirical tests showing a SLOG on a SATA SSD does offer performance improvements. But anyway, I’m just getting my 11-HDD RAIDZ3 together first. This post has some people experiencing improvements in their workflow by playing around with SSDs. It basically sounds like it’s complicated, and if you have the resources, benchmark different setups. But Optane NVMe drives are almost always preferred. When I was looking into this initially ten years ago, Optane did not exist.
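When I benchmark I’ll probably use fio to generate the sync-write pattern a SLOG is actually for, once with sync=standard and once with sync=disabled (or with/without the log device), and compare. Rough sketch; paths and sizes are just placeholders:

```
# Small random writes with an fsync after every write
fio --name=slogtest --directory=/mnt/tank/fiotest --rw=randwrite \
    --bs=16k --size=4G --numjobs=1 --ioengine=psync --fsync=1 \
    --runtime=60 --time_based
```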
I mean the special vdevs, L2ARC, etc. How I understand ZFS is that ideally your HDDs get hit last, with the faster drives sitting above the slow ones and data propagating from RAM down to them first. I need to build the initial system first and play around with it. I’ll then be adding the other drives focused on improving read speeds if needed. But I’ll probably need to upgrade my network before that matters.
From Google’s AI:
In the context of ZFS (a file system), a SLOG (synchronous log) is a dedicated area on disk used to accelerate synchronous writes. While it does improve write performance, it’s not limited to only speeding up synchronous writes. By offloading synchronous writes to a faster SLOG device, it can free up the main storage device for other operations, including asynchronous writes, ultimately improving overall system performance.
It might improve some part of my workflow, but it’s at the bottom of my list of things to get. It might be worthwhile for an IPFS system, as you don’t know whether sync or async writes are coming in.
I have a large RAIDZ3 of 11 HDDs, with 3 spare HDDs coming in, optimizing for space and reliability. Are you saying I might get better performance by getting more HDDs and mirroring them? Wouldn’t that impact reliability?
I’m just saying some of the data stored on there I want to keep higher up on fast storage mediums for accessing and writing it if it improves performance in benchmarks. I’ll be benchmarking before I add any of that. I can always blow away the data I’m storing and restore from my old back up solution, until I fine tune it.
Well, there is an Arch Linux plot to murder Bryan Lunduke. Long time influential devs have been getting banned in major projects over the past five to ten years. I think it’s about 25% of open source devs. So, they just move to FreeBSD.
No, even when my homelab has 100 computers in it I will not have any employees. I make a living by trading stocks, and toss the profits into building out my home computing infrastructure. At the most, I might have 19 other partners building out infrastructure for trading. But, I need a windfall profit coming in before that happens.