Storage thoughts

Have you tried asking ChatGPT for what you want?

You seem to want a complex system that you don't want to work for; maybe ask an AI to design and build it for you?

If everything else is too much work or means buying new stuff, keeping things as they are is fine too.

1 Like

Sorry, but this is not at all how this reads. You ARE talking to a community of well-informed people, many of whom are professionals in these fields and have gone through the EXACT same dilemma you are going through now.

You state you're lazy and that it dictates a lot of your decisions on how you want to handle things. Yet you have a complex setup with a mishmash of drives, which makes it a hard thing to solve.

We also look at it from a data security point of view: having all these drives on which you can't do any sensible form of data duplication/RAID is cringeworthy. Spinning rust is unreliable, especially consumer-grade SATA drives.

Moving the disks to an energy-efficient NAS on a separate system makes a lot of sense from this point of view. It also gives you room to expand if you want to throw more drives in later without having to mess around with your desktop.

If you’re concerned about eWaste, then save some equipment destined to be eWaste and build a NAS. It’s very easy to obtain outdated servers for very cheap/free and configure them for this purpose.

If you're concerned with network throughput because the data isn't local: 1) You're using spinning rust, so throughput already sucks. 2) Your system is now having to manage the IO of multiple drives, and thus many IO queues. 3) It's not very expensive to set up a 10GbE link between two PCs over Cat5e (yes, you can even use Cat5 for 10GbE provided the run is short; I do this).

Edit: Sorry, misread your post; you're using mostly SSDs.

Having a dedicated NAS server gives you the option (especially if it's recycled eWaste) of using SAS drives, which can be obtained used very cheaply.

At the end of all this, if you just don't care and want the lazy option, just set up LVM and lump all your disks into one huge volume, but expect a drive failure to cause massive data loss.
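If you do go that route, a minimal sketch looks something like this (device names are hypothetical, and the filesystem on top is just an example; adjust to your drives):

```
# Pool every disk into one volume group, then one big logical volume.
# WARNING: no redundancy - losing any single disk takes the whole volume with it.
pvcreate /dev/sda /dev/sdb /dev/sdc
vgcreate bigpool /dev/sda /dev/sdb /dev/sdc
lvcreate -l 100%FREE -n data bigpool
mkfs.ext4 /dev/bigpool/data
mount /dev/bigpool/data /mnt/data
```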

Settle down. You asked the public for advice and you're irritated it wasn't exactly what you wanted. These people gave you some of their time to try to help you, at no cost and with no expectation of anything in return.

Take a step back and reassess your attitude here, please; it's not like you paid for a consultant.

4 Likes

I'm irritated because they ignored everything that I said, they ignored my use case, they ignored every follow-up reply that I made, and their every response was tangential and unrelated to the question I asked. As far as I'm concerned they derailed this thread and filled it with spam. You yourself are repeating the idea that I should spend hundreds of pounds on a NAS, plus the associated electricity costs, when I have explained three times now that that would actively hinder my workflow.

I am going to write this in bold now because I am actually sick of writing it: I know what a NAS is. I have used them in the past when they were applicable to what I needed. They are not applicable to my needs, my work, or my living situation now. I do not need one. I do not want one. Please stop derailing this thread about bcache and BTRFS volumes to obsess over your storage server fetish.

I am not the one who came here with an attitude, but my god have you guys been insufferably arrogant and unhelpful.

That’s not relevant for a consumer system that has no mission critical data and everything is regularly backed up. Oh noes I might have to re-download a game from Steam at some nondescript time in the future, whatever am I going to do… Once again you are completely ignoring the usecase of the person you’re responding to and condescendingly making up your own scenario.

My personal philosophy when it comes to data integrity in the consumer space is that as long as I back up everything I don’t want to lose – ideally in multiple places – then I really don’t care about compounding risk through things like RAID0 because that risk is mitigated. Far worse, imo, is people relying on RAID1, and then finding that their corrupted data has also been mirrored across both drives, or relying on BTRFS snapshots and rolling back, only to find that their drive fails and they can’t access their snapshot. There is absolutely nothing wrong with RAID0 if you’re managing data loss risk: which you very well should be doing anyway.

That's what I currently do (as explained in my OP, it's using a BTRFS volume, not LVM). This thread was asking for thoughts on two specific alternatives that would strike a better balance between performance and convenience, neither of which anyone even acknowledged. All of this is in my OP. Do you seriously not understand the irritation of being talked at repeatedly by people who are utterly ignoring the question at hand, answering things that I never asked for a reason?

You guys have made up some imaginary person asking about storage and second guessed their usecase and completely ignored the real person asking a real question in front of you.

You are now not only spamming yourself but telling me to reassess my behaviour in the process. That's hilarious. You don't get to be this condescending, utterly ignore every single aspect of what someone is telling you, and then get shirty with them when they inevitably get a bit irritated. I would have preferred these people not derail my thread with their irrelevant crap rather than waste their time and mine preaching about NAS. Just because they're doing it for free doesn't make it a valuable use of anyone's time. fwiw it's not a kindness on their part; they're doing nothing but attempting to stroke their own egos and feel knowledgeable. I'd say it's sad that they feel they have this much to prove, but given their failure to actually mention anything relevant to the topic at hand, it's understandable.

Evidence is to the contrary: there’s been one person in this thread who has even acknowledged BTRFS. In the meantime I found some useful information on Reddit from people who were actually pretty knowledgeable about Linux filesystems and weren’t just trying to stroke their egos with very basic concepts like a NAS.

fwiw I did ask you to lock the thread instead of joining in the spam yourself.

I don’t want a complex system at all. I’m either going to make a large SSD bcache, or I’m going to just sort data manually into two or three BTRFS pools. It’s really not hard. I’m just asking what thoughts anyone else had on either of these two options because I haven’t made my mind up on which to go with, but no one seems to have bothered to read what the thread is even about. Like all the people going on about “solutions”. I don’t want a “solution”. I haven’t presented a “problem”. I have said “I am going to either do X or Y, do you have thoughts?” but no they just see “storage” and start bleating “NAS NAS NAS NAS” without even bothering to read further, and then wonder why I start to get irritated.

Frankly given the amount of reading I’ve done on filesystems, on ZFS, BTRFS, bcachefs, on tiered and cached storage solutions and various other things in that vein, it’s a little insulting to accuse me of not wanting to work for something just because I’ve spent a few days telling people to stop going on about a NAS and to consider answering the question I asked.

If you just have a specific set of rigid choices, you should probably have opted for a poll in your original post, or, in retrospect, posted it in the poll thread instead.

And while I do get your frustration, I seem to be getting a hostile vibe from your choice of words. The community does want to help. I do see your effort in refining your choice of words and I appreciate your patience with the community.

My partner did say I used too many words lol. All the same, I wanted something more nuanced: "I used bcache in the past, and I experienced xyz", that kind of thing. Maybe a qualitative statement about cache hits and misses from experience: how many times you have to play a given game, or whatever, for it to reliably result in a cache hit. I have a theoretical understanding of that from research, but experienced anecdotes are more what I was looking for here.

Any hostile vibe you are getting is purely because there are only so many times I can read, respond and dismiss the same thing over and over and over again before I get annoyed by it, when really everything is right there in the OP.

It’s honestly reminding me of around 2013/2014 browsing LTT forums and seeing posts like “I’m a noob I have $1000 to spend, do you have any advice” and seeing reply after reply after reply of $1500 systems, $2000 systems… Like yeah these people did that for free but… why? Who are they helping?

1 Like

This is a recurring question (albeit a low volume one) on the forum, the last iteration of which happened here:

where the best solution we could find to provide a somewhat automatic tiered caching system over an existing filesystem was this GitHub project.

Maybe have a look and see whether it fits your use case?

1 Like

Btw, the project is from the Storinator guys, so it's already worth checking and testing in my opinion … not some lab project for a thesis course that I would not trust my data with (the only copy before backup, in your case) …

But your original question was:

You asked a VERY broad question asking for people's thoughts on how to better manage your storage. This was not a question of which option you would select out of these; this was a "what are your thoughts". Most people that responded clearly think that NEITHER option is optimal.

We did not try to explain to you what a NAS is or tell you that you should use one; we offered it AS AN OPTION because it gives you many of the features you are looking for. You can set up a system from recycled eWaste and not only take advantage of the flexibility of a separate system, but also use a filesystem such as ZFS, which can do EXACTLY what you are asking for. It's just best run on a separate system with ECC RAM and enough CPU power to do what you want.

Stop thinking that people reaching out to help you is “stroking our ego”. How the heck you even came to this idiotic conclusion is beyond me.

You did not state this in your OP, don’t expect us to guess that you do have a backup mechanism in place already.

Which is why I was talking about real redundancy in my suggestion.

Which is EXACTLY what you got, but because you didn't like the answer, and because you did not seem to understand the benefits of what people were suggesting, you got hostile.

I got shirty at your moderation report where you insulted this entire community for being “incapable of being helpful or considering anyone’s actual needs before spouting off at them.”

So just because I might have some input on your original question when I see this thread means I can't respond, simply because you made a report? What makes you think I do not have an interest in helping you? What makes you think I don't have many years of experience solving these issues as a professional in this field?

No, it's not, and you asked a very broad question to a community of people from all walks of life.

Many in this community are server administrators managing equipment in data centres, we have access to equipment and technology that grants us knowledge you may not be aware of.

When reading your question (even to me), it made me think of a solution to your problem that would be acceptable to most, including the consideration of data redundancy AND performance.

Because this is NOT the question you asked.

You rejected a NAS on the basis of equipment and electricity costs, so I attempted to provide you with options such as recycling old server gear and building an energy-efficient system. If you don't find this acceptable, fine, idea rejected; let's move on to finding some other solution.

We are NOT trying to convince you to do anything! Only to provide you options you may not be aware of or have not thought of.

3 Likes

Thank you for your response. I remember looking at FUSE a while ago, I wonder why I stopped looking at it. It is definitely the ideal for my hardware, if not the easiest to implement.

That is not broad. I gave two options and asked for thoughts on them.

That's kind of a misinterpretation tbh. ZFS has certain error correction functions that are essentially useless if you don't use ECC RAM, because if the RAM corrupts the data there's nothing ZFS can do to fix it. Running ZFS without ECC essentially loses those features, but so does every other filesystem, so from that viewpoint it's no less stable than using a different filesystem. In any case, using ECC RAM to watch films over the network just seems silly.

I’ve already mentioned this, but if I were going to have a storage server this machine would be it. Depending on what I am sharing I’d either use Samba or SSH to access data, I wouldn’t get another machine. I’m sure someone can count how many times I’ve said a NAS would not be applicable to my situation.

My record for a 4K 60fps video file before transcoding it down is 2TB. This wouldn't fit on any one of my SSDs, so without JBOD/RAID0 I would have to use my massive spinning rust drive (and, in fact, did at the time). Doing this over a 1Gbps network would be painful: 2TB at roughly 110MB/s of real-world gigabit throughput is around five hours of copying. Mechanical drives are never ideal for this, but sometimes the local overflow is necessary. The 10Gbps option isn't really on the cards because I'd need a new router/switch, and because of the chungus graphics card I have, plus the capture card, I don't have the PCIe space for a network card, so I'd also be looking at a new high-end motherboard.

So why did you second-guess? I included everything that was relevant. Your wild filling-in of the blanks and deciding my use case for me is the issue, not that I didn't say I back up regularly.

No it isn’t. If you want to see what a helpful, on topic response looks like, read what MadMatt said above you. I didn’t like the “answer” because it had absolutely nothing to do with my question. Why are you still going on about this?

It was the truth. Now you're arguing with me about what my needs are, as if the point even needed to be laboured.

lmfao. I shouldn't laugh, but I asked a question about caching and filesystems and I got three days of "buy a NAS". Should I be offended that you think I "may not be aware of" this? In any case, "this is a good solution in the datacentre, which is clearly your use case; how dare you suggest that I am not taking into account your use case" is a response that requires some introspection, I think. A big problem I have had in this thread is people, including yourself, repeatedly ignoring my responses about what I actually need my system to do.

I don’t have a problem though. Where are you reading that I have a problem? I have said many times what my use case is, I am thinking of two possible ways of managing it, and if anyone has any considerations about these. All I’ve had is “buy a NAS,” “throw all of your stuff away and re-buy modern equivalents” and (most helpfully) “buy a time machine, go back to 2012 and buy some 2TB SSDs”.

It literally is.

Why did it take three days of me rejecting it to finally get you to relent on this derailment? Why was the first “no” not enough?

Are you dense? This literally states you are not sure and you don't know. This leaves your question WIDE OPEN to suggestions. You have absolutely zero right to be upset when you ask a question like this and people throw suggestions at you.

No NAS, sure, OK, but you gave reasons that are potentially solvable, so people WILL try to solve them.

Did you even read my post? I stated ECC in a NAS :man_facepalming:

You do know you could even run ZFS locally on your PC, right? This is also an option if you are willing and able to run ECC, and willing to use enough RAM to accelerate things. Again, this is up to you; I'm only offering IDEAS, not telling you what to do or how you should do things.
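For illustration only, a minimal local pool might look like this (pool name and device names are hypothetical; a mirror wants two drives of similar size):

```
# Mirrored two-drive pool with compression; scrub periodically to catch bit rot.
zpool create -o ashift=12 tank mirror /dev/sdb /dev/sdc
zfs set compression=lz4 tank
zpool scrub tank
```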

Fair enough, but just for the record, you do know you can just do a point to point link without a switch right?

I did not second-guess; it's not even a second thought to me that such a solution would require data redundancy. Running without it is a dingus move in my field of work. I am so sorry that I was concerned for YOUR data and was not willing to offer, as a first option, something that would risk it.

What I recommend/suggest reflects on me and my reputation as a server admin. What do you think anyone that trusts my opinion would think of me if I started suggesting as a first option things that were dangerous and could lead to massive data loss without first explaining the dangers?

Personally I run RAID0 in my VM for the same reason as you, games can be re-downloaded. Don’t think I am insistent on such technologies when there is no need for them.

You directed a response directly at me, I have every right and reason to respond.

Then go find another forum to be a member of if this entire community is incapable of helping you.

Why would you be? I don't know what your level of experience is; this was not an insult, but an explanation as to why I offered what I did. Stop being so sensitive.

omg, you are laughable. I never once suggested this is the case. If you want to read into things in such a way then go right ahead and keep living in fairy tale land.

A big problem is YOU THINK you were ignored. Providing alternative solutions that you may have overlooked is NOT ignoring your requirements. If you reject what is offered, fine, no harm done, and no need to get snotty about it.

We do not know you, we cannot read your mind, and we cannot know what your level of experience is, nor what technology you are aware of or have been exposed to. Prompting you to consider alternatives may yield a better solution for you than you had initially thought of. This is how progress is made.

It's a way of asking for someone's opinion about something. My opinion is that both of your options are suboptimal and that you should approach the issue from an entirely different angle if at all possible.

You do know that I only posted ONCE, last night, and after seeing you do not find my additional suggestions helpful I accepted this and moved on.


Edit:

Note also my suggestions:

All are "If you" … there are NO assumptions about your needs or requirements here; they simply addressed your prior rationale for rejecting a viable solution most would be happy with. I was NOT telling you to get a NAS; I was providing potential solutions that might make a NAS viable for you.

3 Likes

I always find the best results are obtained when asking for people's free time and then arguing with their advice…

Different size disks = just JBOD it.

You'll have no resiliency, but if you're not interested in right-sizing the hardware and repurposing the stuff that doesn't fit an array, that's about all you're going to get.

Pretty sure even Windows can do JBOD by just adding disks to a dynamic disk.
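On Linux the equivalent is a multi-device BTRFS volume with the single data profile; a rough sketch (device names are hypothetical):

```
# One filesystem spanning mixed-size drives: no striping, no data redundancy.
mkfs.btrfs -d single -m raid1 /dev/sda /dev/sdb /dev/sdc
mount /dev/sda /mnt/storage
# Drives can be added later without reformatting:
btrfs device add /dev/sdd /mnt/storage
btrfs balance start -dusage=50 /mnt/storage
```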

5 Likes

Actually went back and re-read the OP again; I had seen it earlier and anticipated the dumpster fire because of the unrealistic and conflicting goals.

I’ll add this

You say you’re lazy. Well here’s a tip

Tiering is bullshit and more trouble than it is worth. Even enterprise vendors have mostly abandoned it because it’s shit and doesn’t work.

You’ll either spend time micromanaging your content manually shifting it around, time trying to tune auto tiering and/or wasting bandwidth migrating content etc.

Just don’t. Keep it simple.

JBOD, or get some sort of sane set of drives (appropriate for your desired level of speed/redundancy) and donate the ones that aren't suitable to other people or systems.

Otherwise you’re just trying to force square pegs in round holes.

As @gnif said a bunch of people here including myself are paid to do this stuff as a day job and have been for years/decades.

We’ve seen, made or repaired mistakes, been burned by failure and will point out bad ideas as bad ideas.

You’re free to take that for what it’s worth or not - but arguing with people because the demonstrated reality or informed opinion doesn’t match your expectations, or blaming others for making incorrect assumptions about your shifting goalposts isn’t going to help.

4 Likes

Is it that bad? I am very similar in mindset to OP and have collected a similar setup over the years.

I have never experienced any mental load or consternation over where I put my data.

@cakeisamadeupdrug
I understand you. I had a similar experience in another thread. The people here that are professionals are entrenched in their own little enterprise world and it’s difficult for them to see how the solutions they’re used to do not apply to normal consumer use cases.

But my first question is, where the heck are you even putting all these drives? Aren’t you running out of ports on your MOBO? :sweat_smile: (That’s a situation where you’d need to consider something like a NAS out of sheer inability to plug more drives in your current machine.)

I am new to Linux myself and just learned about bcache from this thread. It looks cool.
Based on your drives I’d probably just consolidate each technology on its own and call it a day.

If you were playing games or doing work that did not involve editing massive files I’d maybe consider putting the small 256G drive as a bcache for the HDDs, but from what you’ve said that seems to not apply to you.

So I'd basically have an NVMe volume, an SSD volume and an HDD volume and be done. But then again, I have no trouble organizing my data. If the extent of your laziness, as you put it, reaches beyond a short consideration of "do I want NVMe, SSD or HDD speeds for this piece of data", then the best advice I think I can offer is "just JBOD everything", and go out and have a coffee or beer with some friends using the saved time and effort.
P.S. I think you said you already have a satisfactory backup solution, but if you don’t, with so many drives I’d consider dedicating some space to serve that function.

1 Like

That’s a pretty big (and idiotic) assumption to make. For home use I am all for making best use of what drives I have available based on my personal budget and energy costs. Just because I work professionally with enterprise gear doesn’t mean I snob off the cheap options for non-mission critical scenarios.

My home office is a hodgepodge of recycled/used/repaired gear. I have a plethora of HDDs of varied ages, sizes, types, brands, even going as far back as IDE and a VERY old MFM drive. Most are still in use.

Get a used SAS controller for $30 and use a breakout cable to give you a ton of SATA-compatible ports. A NAS is just a computer configured to share its storage out. The advantage of a NAS is in offloading the management of that data to a system dedicated to it.

Based on my 30+ years of experience, from home PC usage and support through to what I do today, working with every storage solution you have ever known plus some you likely have never heard of: what the OP is asking for here is just asking for problems, as my personal AND professional experience has taught me, the hard way, more than once.

3 Likes

Enterprise versus home has nothing to do with it.

Cheap (either $ or time or both)
Fast
Reliable

Pick two
You can’t have it all.

If you actually care about keeping your stuff, choose something that works, not some janky half-assed setup that will just add complexity and not actually be reliable.

If you don’t and you’re lazy JBOD it.

3 Likes

All I can do is sigh and wish it were an assumption and that I hadn't been repeatedly told "spend more money so you can deploy the enterprise solution".
(By other people. I have nothing against you personally; this is the first time I've even interacted with you.)

And not having a setup is probably the least janky setup, as it were. I think that’s what OP is going for. :slight_smile:

But yes. Pick 2. Except sometimes you don’t get to pick. It is picked for you.

2 Likes

This IS an assumption, do you think that just because we use this gear in an enterprise environment where our bosses pay for this stuff means we can afford it at home too?

We are all in the same boat, my friend; the difference is that the exposure we have had to these systems has informed us better than general consumers such as yourself.

Do I wish I had the $$$ for a big-budget, multi-petabyte ZFS system at home? Hell yes!

Do I have one at home? Yes, but not big budget, certainly not > 10TiB, and certainly not high performance. Only after many years of janky setups like the OP's here, and after gathering enough cash to build one out of old/used parts I sourced from eBay/AliExpress, etc. It's still not optimal, but it's better than what I had.

Why was I motivated to set one up? Because I lost data without even knowing it, due to bit rot. It was not because I thought I should have one, but because I lost data that was important to me and my backup solution could not and did not detect the bit rot (note: most can't).

I have read your other threads on here, I can see that you are just starting out with your first NAS type setup and I can see that you are also doing it on a budget with old recycled equipment (like most of us).

You, like the OP here, asked for advice on how to best manage the storage you have available, and when given answers you don't understand or agree with, you can't see how they come from years of experience of making the same mistakes you are about to make here.

Note, I personally do not like ZFS due to the lack of user-friendly recovery tools; you need to know its design back to front to recover data from a fully degraded pool. However, the only time I had a degraded pool was when running ZFS without ECC because I couldn't afford it, on old SATA HDDs because I couldn't afford better, and without a UPS because I couldn't afford one.

So stop assuming that you’re being told what to do. Nobody here told the OP what to do, but rather suggested a better solution to his problem and instead of just simply saying “no, I do not want this”, he gave reasons for not wanting it that in the eyes of others here were solvable problems and worth mentioning as such.

These days there is NO excuse not to use enterprise-grade equipment for a personal/home NAS if you're investing in hardware to build one, as it can be had used and cheap at flea markets, since providers do not like to reuse hardware for mission-critical applications that may cost them millions in liability if things go wrong. Mine literally cost me < $200 for a 1RU Dell chassis with a 12-core Xeon and 32GB of ECC RAM on eBay. As a bonus it also became a VM server and performs other tasks too now.

Several years later, instead of needing to rebuild this entire system, I upgraded by throwing in a 10GbE NIC (also used) and a SAS disk shelf I got at a govt. auction for ~$100. I can now scale this system up to 16 disks, using cheap used cold-spare (essentially new old stock) SAS drives you can find on eBay for $40-$120.

Last edit sorry:

There are good reasons why I keep saying SAS and not SATA too.

  1. SAS drives have a FAR longer MTBF (mean time between failures), so used drives are perfectly fine for non-mission-critical applications and can be had very cheap, because no professional in their right mind would deploy a used drive for mission-critical use.
  2. Consumer SATA drives are only rated to one unrecoverable read error per 10^14 bits read, i.e. roughly one per 10TiB of data transferred. Enterprise SATA is 10x better at roughly one per 100TiB, which is still not great. SAS drives are rated an order of magnitude better again, so you can expect to read over 1.1PiB before a single read error.
  3. Most people can't use a SAS drive, which keeps the price of used SAS drives low. In fact, when I have bought them I have often received brand-new old stock that was just sitting on a shelf as spares in some DC for servers that have since been decommissioned.

A good read: SAS vs SATA | What Is the Difference & Which Is Better? | ESF

3 Likes

Ditto.

It's a case of working out what I ACTUALLY need to keep and making that reliable.

Have a couple of hundred terabytes at work, but home is 4 TB configured to not die.

Oh I still have a bunch more SATAs to use, and I have a PCIe m.2 card too that I’m not using from back when I was using an m.2 drive on a board that predated the standard.

lmao Clearly it's too much to ask to make a thread with a post outlining what it's about and expect people not to derail it with things that are completely unrelated. It's a thread about bcache, dear. Just because you've shown up at the wrong stadium doesn't mean that I have moved the goalposts.

Anyway, as an update for anyone who actually is interested in the topic at hand, I went with option 1. I made an LVM volume that encompassed my SSDs, a second LVM volume that encompassed my HDDs, set up bcache in writeback mode with the SSD volume as the cache and the HDD volume as the backing device, and built my BTRFS filesystem on top of that. So far it's working very well, giving me the convenience of my previous JBOD array without the speed penalty of striping across a mechanical drive. Cache hits, i.e. the main concern I had going in, seem to be a non-issue. That's the benefit of having such a big cache, but for now the speedup has been dramatic. I'm intrigued to see how and if this changes as the capacity fills up, and how intelligent the caching is. For now at least, it seems more intelligent than my experience of Intel Optane's Windows driver.* The only downside I have encountered is that, being on a rolling release, I sometimes have to wait a couple of days when upgrading to a new kernel in order for it to boot. As an Nvidia user, this is something I was used to anyway.
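For anyone wanting to replicate it, a rough sketch of that layering (volume group and device names are hypothetical; make-bcache comes from bcache-tools):

```
# 1. One logical volume spanning the SSDs, one spanning the HDDs.
vgcreate vg_ssd /dev/nvme0n1 /dev/sda
lvcreate -l 100%FREE -n cache vg_ssd
vgcreate vg_hdd /dev/sdb /dev/sdc
lvcreate -l 100%FREE -n backing vg_hdd

# 2. Bind them together with bcache and switch to writeback caching.
make-bcache -C /dev/vg_ssd/cache -B /dev/vg_hdd/backing
echo writeback > /sys/block/bcache0/bcache/cache_mode

# 3. BTRFS goes on top of the resulting bcache device.
mkfs.btrfs /dev/bcache0
mount /dev/bcache0 /mnt/pool
```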

fwiw I cba to scroll up to see who it was telling me to use JBOD, but if you'd read the OP you'd see that that was what I was already doing, and the speed was fucking painful for daily use, video editing and gaming. The bcache is much better.

Why are you now insulting me because my topic isn’t about what you wanted it to be about?

If you really want to talk about a NAS so badly have you considered making your own topic on the subject? It’ll be a lot easier for me to ignore than derailing my thread about caching.

I have on-site and off-site backups for anything important that I care about keeping. I don't need to be losing my hair panicking over the data integrity of my Steam game folder. If a drive ate itself it'd be an inconvenience at worst.

  • semi related, but I did end up taking the 2TB SSD out of my hand-me-down laptop and putting the original 1TB SSHD back in it. It has a 32GB Intel Optane drive and I’ve set that up with bcache too and even with a much smaller cache it’s working out pretty impressively imo.

bcachefs seems like a promising solution to put many of these questions to rest. Bit rot protection, encryption, compression, and (read and/or write) caching all using a bunch of heterogeneous drives—LVM not required.

The wait for it to be included in the Linux kernel might be a while, unfortunately.
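For the curious, a hedged sketch of what a mixed SSD/HDD bcachefs setup might look like once it's usable on your distro (device names and labels are hypothetical, and the options have shifted between releases, so check `bcachefs format --help`):

```
# Foreground writes land on the SSD, data migrates to the HDDs in the
# background, and hot data is promoted back to the SSD on read.
bcachefs format \
    --label=ssd.ssd1 /dev/nvme0n1 \
    --label=hdd.hdd1 /dev/sdb \
    --label=hdd.hdd2 /dev/sdc \
    --foreground_target=ssd \
    --background_target=hdd \
    --promote_target=ssd
mount -t bcachefs /dev/nvme0n1:/dev/sdb:/dev/sdc /mnt/pool
```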