Storage thoughts

Two facts about me

  1. I don’t throw things away unless they are actually broken
  2. I am lazy and dislike micromanaging

This has led to me having the following storage devices in my main PC

  • 2TB WD Black HDD from 2012
  • 256GB M.2 NVMe SSD
  • 2TB M.2 NVMe SSD
  • 512GB SATA SSD
  • 1TB SATA SSD
  • 12TB HDD
  • Probably going to add another 2TB SATA SSD because it’s wasted in the laptop it’s currently in

Currently I have all of this JBODed together in BTRFS (nb point 2 about me). This appeals to my laziness, but it’s not really worth the speed penalty of spreading data across two spinning rust drives, especially when gaming is involved.
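For reference, that arrangement is basically BTRFS’s “single” data profile across all the devices; a rough sketch, with hypothetical device names standing in for my actual drives:

    # Sketch only: device names are placeholders.
    # "-d single" spreads data chunks across devices JBOD-style;
    # "-m raid1" mirrors metadata (the multi-device default).
    mkfs.btrfs -L pool -d single -m raid1 /dev/sda /dev/sdb /dev/nvme0n1
    mount /dev/sda /mnt/pool
    btrfs filesystem usage /mnt/pool   # shows how data is spread per device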

Ideally what I’d really like is a three tiered storage system where data is migrated from HDD to SATA SSD to m.2 as it’s used, and demoted when unused. This doesn’t seem easy to achieve, however.

I’m kind of meandering between two thoughts of what to do with these drives, therefore. I could create a HDD volume and set up all of the SSDs as a massive bcache in front of it. This would more or less achieve the same ends as tiering, but because caching duplicates rather than migrates, it would cut my total amount of storage down by about 5TB, which is unpalatable. It also wouldn’t really leverage the speed of the m.2 drives without additional micromanagement.
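Roughly, that bcache option would look like this (device names are placeholders, and the cache set’s UUID from bcache-super-show stands in for <cset-uuid>):

    # Sketch only: devices are hypothetical.
    make-bcache -C /dev/nvme0n1p1        # SSD becomes the cache set
    make-bcache -B /dev/sda /dev/sdb     # HDDs become backing devices
    bcache-super-show /dev/nvme0n1p1 | grep cset.uuid
    echo <cset-uuid> > /sys/block/bcache0/bcache/attach   # repeat for bcache1
    mkfs.btrfs /dev/bcache0 /dev/bcache1 # filesystem sits on the bcache devices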

The other thing I could do is create SSD and HDD volumes and just manually put things like games on the SSDs and spreadsheets and music on the HDD, but again see point 2 about me. At least this would stop things that are frequently accessed but don’t benefit from bandwidth (such as music) being moved to the SSD, if that matters…
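Concretely, the manual option is just two volumes mounted side by side, something like this in fstab (UUIDs and mount points are placeholders):

    # /etc/fstab sketch; UUIDs are hypothetical
    UUID=<ssd-pool-uuid>  /mnt/fast  btrfs  defaults,noatime  0 0
    UUID=<hdd-pool-uuid>  /mnt/bulk  btrfs  defaults,noatime  0 0
    # games library on /mnt/fast; music, documents etc. on /mnt/bulk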

I don’t know, what are your thoughts on this?

1 Like

Uninstall the games you don’t play and ditch the HDD. That’s my suggestion. How many games do you have installed that you haven’t touched in a year? (Probably a lot.)

256GB NVMe = Install OS
512GB SATA = Apps / doc storage
2TB NVMe = Steam/Origin/GOG etc. + main game library
1TB SATA = Overflow games
2TB SATA = Overflow games 2 (if you move it)

My desktop is:

512GB NVMe: OS + non-gaming apps
2TB NVMe: games

I have a NAS for movie/doc storage.

1 Like

But… why? Why should I spend a load of money on an entire second system, as well as extortionate electricity costs, just to move my 3TB film collection three feet to the left of where it currently is?

None.

I mean, there are many reasons. I didn’t suggest you get a NAS, I said I have one; well, it’s more than a NAS, but yeah. You also don’t have to waste money on an entire second system, just wait for an upgrade, and you don’t have to run it all the time. I also don’t know how many users you have at your house / room / apartment that would use the media, etc.

I can see where you thought I was saying you should have a NAS; I was referring to myself, as an explanation of why I have such small desktop storage.

Cool, pretty impressive to be honest.

Ah apologies. I interpreted what you said as “ditch the HDD and put the films, docs and music on it on your network instead”.

I’ve done that before at my parents’ house because that media needed to be accessed from lots of different PCs. I don’t see the point in doing that if you just have the one PC though, and I do need lots of slow storage for HD and 4K video content. Honestly if I found myself in that position again I’d probably either set up a Samba share to the content on this drive, or just use SSH and this would become the NAS.
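The sharing side is trivial either way; a minimal smb.conf share stanza would do it, something like the following (path and user are placeholders):

    [media]
        path = /mnt/bulk/media
        read only = yes
        guest ok = no
        valid users = <myuser>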

It’s the three levels of overflow that I don’t like, tbh. I’ve done that before. Originally the big M.2 was for video editing, because I was getting a storage bottleneck when it was on a HDD. Spanning a volume across all of the SSDs would utilise all of my fast storage and not require me to manually put things in certain places, especially when it gets a little more complicated with various launchers running in Wine.

1 Like

We are talking about storage on a single desktop PC here?

I agree with the sentiment of putting the big storage on a dedicated network device. Put all the mass storage there. If you need the files, copy them back to your local system over SMB or ssh, as described.
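For example, pulling a file back over SSH with rsync (host and paths are hypothetical):

    # copy a file from a remote machine called "nas" to a local scratch dir
    rsync -avh --progress nas:/tank/media/film.mkv ~/scratch/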

I generally maintain my PCs with only a boot drive and a local storage drive: something like a 500GB-1TB boot SSD for the OS, programs and home dir, and a 2-4TB storage SSD for things like game installs. Sometimes, if the boot drive is big enough, I don’t even bother with a secondary drive and leave it all on one.

I think the “multi level storage” situation described here is completely pointless. Get rid of all the old drives, do a secure wipe then recycle them or eBay them. Consolidate the mass storage on the network and only keep enough storage locally on your PC to hold the data you are actively working with.

Having a network storage is useful even if you have “only 1 PC”, because it lets you separate out the functions of “store lots of data” from “daily driver PC”. You can do whatever the heck you want with your PC and not have to worry about how your mass storage is being maintained.

1 Like

Why though? Why do I need a separate PC just for holding films that I am going to watch from the PC they’re currently in? People (not necessarily on these forums) repeatedly just tell me to do this but have not once actually justified the cost or effort. It also means that whenever I do need the space for video editing overflow I have a massive increase in latency and a huge bandwidth bottleneck in my rendering pipeline for no discernible reason other than to clutter up my hallway.

They do only hold data that I am actively working with, and you’re just telling me to create ewaste for the sake of it.

ngl if I’m asking about tiered storage and caching, telling me to throw out an important part of my workflow just so I can spend a lot of money I don’t have on a new system build and electricity costs in the year of our lord 2023 is really unhelpful. Not to be too insulting, but this is a very Linus Tech Tips reply to a Level1Techs Linux filesystem question.

There is no solution. Your drives aren’t equal in size, they all have varying performance characteristics, you don’t want to waste space, and you’re lazy and don’t want to do anything. These things are mutually exclusive.

Striping on a BTRFS volume is probably the best bang you can get, just because BTRFS doesn’t care about different sizes. It leaves a lot on the table, though, and I wouldn’t bother with it personally.

If you buy storage that doesn’t work well together, don’t expect magic software to fix it without drawbacks.

You made the wrong decisions when buying stuff in the past. If that 256GB drive were a 512GB drive, you could md-raid it and do some nesting shenanigans with the 2x1TB drives you didn’t buy (in favor of 1x2TB) to get a really fast 3TB device.

But if the main driving factor when buying stuff is to have a single drive of every possible size, repeated for each performance bracket, you maybe win on a storage-bingo card, but it isn’t useful by any other metric, including reduction of e-waste.

Buy equally sized drives and plan ahead for future consolidation. This avoids e-waste the most and still retains benefits later on.

3 Likes

They don’t have to be equal-sized drives. I already mentioned that I’m using BTRFS, and you already said that it doesn’t require equal-sized drives. The storage works perfectly well together.

No I didn’t. I bought what fitted my usecase, and it continues to fit my usecase.

No, I’m literally just asking for opinions on bcache vs manually sorting volumes. Getting snotty with me because “throw away all of your drives and buy a NAS” doesn’t help, also doesn’t help.

No, that doesn’t fit my workflow or usecase at all. Throwing away perfectly functioning hardware to reduce e-waste doesn’t even make sense. I’m going to use these drives until they fail.

“Solution”. It’s not a problem, I asked for thoughts on two different approaches from people I’d assumed might know more than me. They are both adequate “solutions”.

I must say that your thoughts are even less helpful since they not only tell me to buy unnecessary hardware, but they also require a time machine to do so up to a decade ago.

I think I’ll play around with bcache. If I end up not liking it for whatever reason, I’ll just do the other thing. I don’t like caching per se; I’d rather migrate the data than duplicate it, but it’s the way all of these solutions lean, so eh.
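If I do go down the bcache road, the cache mode seems to be the main knob to play with; a quick sketch, assuming the default bcache0 device:

    # writethrough is the default; writeback also caches writes, at the
    # cost of dirty data living only on the SSD until it is flushed
    echo writeback > /sys/block/bcache0/bcache/cache_mode
    cat /sys/block/bcache0/bcache/cache_mode   # current mode shown in [brackets]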

I’m kind of disappointed: I thought this community would be more knowledgeable about BTRFS, bcache, lvmcache, bcachefs, tiered storage, ZFS and Linux/BSD storage solutions more broadly tbh.

The magic solution you are looking for does not exist. If it did then you likely would have found it long before you came to the point of making this thread.

Trying to shoehorn your random mishmash of parts into a heterogeneous storage arrangement is a lousy idea.

Consolidating down to fewer, larger, focused storage configurations that are potentially separated into independent systems is advantageous. Throw out, recycle, or eBay all your disks smaller than 2TB. Put one or two “large” 2TB or 4TB M.2 NVMe SSDs internally in your PC, and put a larger mass storage volume of 12TB+++ on your network, potentially in a RAID1 configuration. Futzing with tiered storage, btrfs, etc., is a waste of time for you here. Keep it simple.

2 Likes

Once again, I’m not looking for a “magic solution”. I am considering two things, and I asked for any thoughts on either of them. So far all I’ve had is people like you obnoxiously ignoring my use case and repeatedly telling me to spend hundreds of pounds on a network solution I don’t want, don’t need, and have repeatedly said would not be applicable to my workflow. Jesus fucking christ, your solution is to throw out my boot drive and game drive? That’s just stupid.

I am actually quite irritated by this now. Post something on topic, or don’t post anything at all. The only “waste of time” was assuming the people of this forum were helpful and knowledgeable; clearly you’re neither. Like, you’re telling me to RAID1 a drive that is predominantly for movie and music storage, and sometimes overflow for raw video files while they are being rendered. Why? Why does this require redundancy? What is the point? I back up everything I want to keep offline; that’s enough. I don’t want this on the network, I don’t need this on the network, I don’t physically have the space in my flat for a second storage server, it would be detrimental to my workflow, and I have no reason to do it. So stop telling me to get a NAS. jfc

Have you tried asking chatgpt what you want?

You seem to want a complex system that you don’t want to work for; maybe ask an AI to design and build it for you?

If everything else is too much work, or requires buying new stuff, keeping the status quo is just as fine.

1 Like

Sorry, but this is not at all how this reads. You ARE talking to a community of well-informed people, many of whom are professionals in these fields and have gone through the EXACT same dilemma you are going through now.

You state you’re lazy and that it dictates a lot of your decisions on how you want to handle things, yet you have a complex setup with a mishmash of drives, which makes this a hard thing to solve.

We also look at it from a data security point of view: having all these drives on which you can’t do any sensible form of data duplication/RAID is cringeworthy. Spinning rust is unreliable, especially consumer-grade SATA drives.

Moving the disks to an energy-efficient NAS setup on a separate system makes a lot of sense from this point of view. It also gives you room to expand if you want to throw more drives in later, without having to mess around with your desktop.

If you’re concerned about eWaste, then save some equipment destined to be eWaste and build a NAS. It’s very easy to obtain outdated servers for very cheap/free and configure them for this purpose.

If you’re concerned about network throughput because the data isn’t local: 1) you’re using spinning rust, so throughput already sucks; 2) your system currently has to manage the IO of multiple drives, and thus many IO queues; 3) it’s not very expensive to set up a 10GbE link between two PCs over Cat5e (yes, you can use Cat5 for 10GbE provided the run is short; I do this).

Edit: Sorry, misread your post; you’re using mostly SSDs.

Having a dedicated NAS server gives you the option (especially if it’s recycled eWaste) of using SAS drives, which can be bought used very cheaply.

At the end of all this, if you just don’t care and want the lazy option, just set up LVM and lump all your disks into one huge volume, but expect a drive failure to cause massive data loss.
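Roughly, with placeholder device names, that lazy option is:

    # Sketch only: one linear LV spanning every disk; no redundancy at all.
    pvcreate /dev/sda /dev/sdb /dev/nvme0n1
    vgcreate bigvg /dev/sda /dev/sdb /dev/nvme0n1
    lvcreate -l 100%FREE -n bigvol bigvg
    mkfs.ext4 /dev/bigvg/bigvol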

Settle down, you asked the public for advice and you’re irritated it wasn’t exactly what you wanted. These people gave you some of their time to try to help you for no cost or expectation from you.

Take a step back and reassess your attitude here, please; it’s not like you paid for a consultant.

4 Likes

I’m irritated because they ignored everything that I said, they ignored my usecase, they ignored every follow-up reply that I made, and their every response was tangential and unrelated to the question I asked. As far as I’m concerned they derailed this thread and filled it with spam. You yourself are repeating the idea that I should spend hundreds of pounds on a NAS, plus the associated electricity costs, when I have explained three times now that that would actively hinder my workflow.

I am going to write this in bold now because I am actually sick of writing it: I know what a NAS is. I have used them in the past when they were applicable to what I needed. They are not applicable to my needs, my work, or my living situation now. I do not need one. I do not want one. Please stop derailing this thread about bcache and BTRFS volumes to obsess over your storage server fetish

I am not the one who came here with an attitude, but my god have you guys been insufferably arrogant and unhelpful.

That’s not relevant for a consumer system that has no mission-critical data and where everything is regularly backed up. Oh noes, I might have to re-download a game from Steam at some nondescript time in the future, whatever am I going to do… Once again you are completely ignoring the usecase of the person you’re responding to and condescendingly making up your own scenario.

My personal philosophy when it comes to data integrity in the consumer space is that as long as I back up everything I don’t want to lose – ideally in multiple places – then I really don’t care about compounding risk through things like RAID0, because that risk is mitigated. Far worse, imo, is people relying on RAID1 and then finding that their corrupted data has also been mirrored across both drives, or relying on BTRFS snapshots and rolling back, only to find that their drive fails and they can’t access their snapshots. There is absolutely nothing wrong with RAID0 if you’re managing data-loss risk, which you very well should be doing anyway.

That’s what I currently do (as explained in my OP, it’s using a BTRFS volume, not LVM). This thread was asking for thoughts on two specific alternatives that will strike a better balance between performance and convenience. Neither of which anyone even acknowledged. All of this is in my OP. Do you seriously not understand the irritation of being talked at repeatedly by people who are utterly ignoring the question at hand, answering things that I never asked for a reason?

You guys have made up some imaginary person asking about storage and second guessed their usecase and completely ignored the real person asking a real question in front of you.

You are now not only spamming yourself, but telling me to reassess my behaviour in the process. That’s hilarious. You don’t get to be this condescending and utterly ignore every single aspect of what someone is telling you, and then get shirty with them when they inevitably get a bit irritated. I would have preferred these people not derail my thread with their irrelevant crap instead of wasting their time and mine preaching about NAS. Just because they’re doing it for free doesn’t make it a valuable use of anyone’s time. fwiw it’s not a kindness on their part; they’re doing nothing but attempting to stroke their own egos and feel knowledgeable. I’d say it’s sad that they feel they have this much to prove, but given their failure to mention anything relevant to the topic at hand, it’s understandable.

Evidence is to the contrary: there’s been one person in this thread who has even acknowledged BTRFS. In the meantime I found some useful information on Reddit from people who were actually pretty knowledgeable about Linux filesystems and weren’t just trying to stroke their egos with very basic concepts like a NAS.

fwiw I did ask you to lock the thread instead of joining in the spam yourself.

I don’t want a complex system at all. I’m either going to make a large SSD bcache, or I’m going to just sort data manually into two or three BTRFS pools. It’s really not hard. I’m just asking what thoughts anyone else had on either of these two options because I haven’t made my mind up on which to go with, but no one seems to have bothered to read what the thread is even about. Like all the people going on about “solutions”. I don’t want a “solution”. I haven’t presented a “problem”. I have said “I am going to either do X or Y, do you have thoughts?” but no they just see “storage” and start bleating “NAS NAS NAS NAS” without even bothering to read further, and then wonder why I start to get irritated.

Frankly given the amount of reading I’ve done on filesystems, on ZFS, BTRFS, bcachefs, on tiered and cached storage solutions and various other things in that vein, it’s a little insulting to accuse me of not wanting to work for something just because I’ve spent a few days telling people to stop going on about a NAS and to consider answering the question I asked.

If you just have a specific set of rigid choices, you should probably have opted for a poll in your original post. In retrospect, you should probably have posted it in the poll thread instead.

And while I do get your frustration, I seem to be getting a hostile vibe from your choice of words. The community does want to help. I do see your effort in refining your choice of words, and I appreciate your patience with the community.

My partner did say I used too many words, lol. All the same, I wanted something more nuanced: “I used bcache in the past, and I experienced xyz”, that kind of thing. Maybe a qualitative statement about cache hits and misses from experience: say, how many times you have to play a given game for it to reliably result in a cache hit. I have a theoretical understanding of that from research, but experienced anecdotes are more what I was looking for here.
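For what it’s worth, bcache does expose its own hit/miss counters in sysfs, so I can at least measure that myself once it’s running (paths assume the default bcache0 device):

    # rolling windows also exist: stats_hour, stats_five_minute
    cat /sys/block/bcache0/bcache/stats_day/cache_hit_ratio
    cat /sys/block/bcache0/bcache/stats_total/cache_hits
    cat /sys/block/bcache0/bcache/stats_total/cache_misses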

Any hostile vibe you are getting is purely because there are only so many times I can read, respond and dismiss the same thing over and over and over again before I get annoyed by it, when really everything is right there in the OP.

It’s honestly reminding me of around 2013/2014 browsing LTT forums and seeing posts like “I’m a noob I have $1000 to spend, do you have any advice” and seeing reply after reply after reply of $1500 systems, $2000 systems… Like yeah these people did that for free but… why? Who are they helping?

1 Like

This is a recurring question (albeit a low volume one) on the forum, the last iteration of which happened here:

where the best solution we could find for a somewhat automatic tiered caching system over an existing filesystem was this GitHub project

Maybe have a look and see whether it fits your use case?

1 Like

Btw, the project is from the Storinator guys, so it’s already worth checking out and testing, in my opinion … not some lab project for a thesis course that I wouldn’t trust my data (the only copy before backup, in your case) with …

But your original question was:

You asked a VERY broad question asking for people’s thoughts on how to better manage your storage. This was not a question of “which of these options would you select”; it was a “what are your thoughts”. Most people who responded clearly think that NEITHER option is optimal.

We did not try to explain to you what a NAS is or insist that you should use one; we provided it AS AN OPTION, as it gives you many of the features you are looking for. You can set up a system from recycled eWaste and not only take advantage of the flexibility of the system, but also use a filesystem such as ZFS, which can do EXACTLY what you are asking for, BUT it’s best run on a separate system with ECC RAM and enough CPU power to do what you want.
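To make that concrete: the nearest ZFS equivalent to what you described is an SSD read cache (L2ARC) in front of the pool. A sketch, assuming a pool hypothetically named tank, and noting this caches rather than tiers:

    zpool add tank cache /dev/nvme0n1   # SSD becomes L2ARC for the pool
    zpool status tank                   # cache device listed under "cache"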

Stop thinking that people reaching out to help you is “stroking our ego”. How the heck you even came to this idiotic conclusion is beyond me.

You did not state this in your OP; don’t expect us to guess that you already have a backup mechanism in place.

Which is why I was talking about real redundancy in my suggestion.

Which is EXACTLY what you got, but because you didn’t like the answer and did not seem to understand the benefits of what people were suggesting, you got hostile.

I got shirty at your moderation report where you insulted this entire community for being “incapable of being helpful or considering anyone’s actual needs before spouting off at them.”

So just because you made a report, I can’t respond when I see this thread and might have some input on your original question? What makes you think I do not have an interest in helping you? What makes you think I don’t have many years of experience solving these issues as a professional in this field?

No, it’s not, and you asked a very broad question to a community of people from all walks of life.

Many in this community are server administrators managing equipment in data centres, we have access to equipment and technology that grants us knowledge you may not be aware of.

Reading your question made even me think of a solution to your problem that would be acceptable to most, taking into account both data redundancy AND performance.

Because this is NOT the question you asked.

You rejected a NAS on the basis of equipment and electricity costs, so I attempted to provide you with options, such as recycling old server gear and building an energy-efficient system. If you don’t find this acceptable, fine, idea rejected; let’s move on to finding some other solution.

We are NOT trying to convince you to do anything! Only to provide you options you may not be aware of or have not thought of.

3 Likes

Thank you for your response. I remember looking at FUSE a while ago; I wonder why I stopped looking at it. It is definitely ideal for my hardware, if not the easiest to implement.

That is not broad. I gave two options and asked for thoughts on them.

That’s kind of a misinterpretation, tbh. ZFS has certain error-correction functions that are essentially useless if you don’t use ECC RAM, because if the RAM corrupts the data there’s nothing ZFS can do to fix it. ZFS without ECC essentially lacks these features, but so does every other filesystem, so from that viewpoint it’s no less stable than using a different filesystem. In any case, using ECC RAM to watch films over the network just seems silly.

I’ve already mentioned this, but if I were going to have a storage server this machine would be it. Depending on what I am sharing I’d either use Samba or SSH to access data, I wouldn’t get another machine. I’m sure someone can count how many times I’ve said a NAS would not be applicable to my situation.

My record for a 4K 60fps video file before transcoding it down is 2TB. That wouldn’t fit on any one of my SSDs, so without JBOD/RAID0 I would have to use my massive spinning rust drive (and, in fact, did at the time). Doing this over a 1Gbps network would be painful. Mechanical drives are never ideal for this, but sometimes the local overflow is necessary. The 10Gbps option isn’t really on the cards because I’d need a new router/switch, and because of the chungus graphics card I have, in addition to the capture card, I don’t have the PCIe space for a network card, so I’d be looking at a new high-end motherboard.

So why did you second-guess? I included everything that was relevant. It was your wild filling in of the blanks and deciding my usecase for me that’s the issue, not that I didn’t say that I back up regularly.

No it isn’t. If you want to see what a helpful, on-topic response looks like, read what MadMatt said above you. I didn’t like the “answer” because it had absolutely nothing to do with my question. Why are you still going on about this?

It was the truth. Now you’re arguing with me about what my needs are, as if the point even needed to be laboured.

lmfao. I shouldn’t laugh, but I asked a question about caching and filesystems and I got three days of “buy a NAS”. Should I be offended that you think I “may not be aware of” this? In any case, “this is a good solution in the datacentre, which is clearly your use case. How dare you suggest that I am not taking into account your usecase” is a response that requires some introspection, I think. A big problem I have had in this thread is people, including yourself, repeatedly ignoring my responses about what I actually need my system to do.

I don’t have a problem though. Where are you reading that I have a problem? I have said many times what my use case is, I am thinking of two possible ways of managing it, and if anyone has any considerations about these. All I’ve had is “buy a NAS,” “throw all of your stuff away and re-buy modern equivalents” and (most helpfully) “buy a time machine, go back to 2012 and buy some 2TB SSDs”.

It literally is.

Why did it take three days of me rejecting it to finally get you to relent on this derailment? Why was the first “no” not enough?