[Devember 2021] Distributed WAN filesystem for media (actually working!)

Tahoe-LAFS is a distributed storage system that works over the internet. We intend to use it with a group of people to combine spare storage space on our servers into one large storage pool. In this storage pool we can store our neatly organised “Linux ISOs” for everyone to stream.

Unfortunately, Tahoe-LAFS does not have a FUSE mount client. This is my project (the blue part in the bottom image).

Really no FUSE client?

Technically, there already is a FUSE client but it’s unmaintained, and requires Python 2.

The official recommendation is to use the Tahoe-LAFS SFTP frontend with SSHFS, but this is hilariously bad. Reading a single byte from a file requires downloading the entire file first.
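For comparison, the Tahoe-LAFS web API accepts HTTP `Range` headers on file downloads, so a client can ask for only the bytes it needs. Below is a minimal sketch of such a ranged read. It assumes a local node with the web API on `http://127.0.0.1:3456`; the function names are made up for illustration and are not taken from the mount client.

```python
import urllib.request
from urllib.parse import quote

# Assumption: a local Tahoe-LAFS node exposing its web API on this address.
NODE_URL = "http://127.0.0.1:3456"


def build_range_request(node_url, file_cap, offset, length):
    """Build a GET request for `length` bytes starting at byte `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # The capability contains ':' characters, so percent-encode it.
    url = f"{node_url}/uri/{quote(file_cap, safe='')}"
    last_byte = offset + length - 1  # HTTP byte ranges are inclusive
    return urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{last_byte}"}
    )


def read_range(node_url, file_cap, offset, length):
    """Fetch part of a file from the node (needs a running node)."""
    request = build_range_request(node_url, file_cap, offset, length)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A FUSE `read` handler can map its offset and size arguments onto a request like this instead of fetching the whole file.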

The FUSE filesystem is read-only. For uploading files, I developed an rsync-like upload program.

Why read only?

Tahoe has immutable and mutable files. To implement uploading files in our filesystem we’d need to use mutable files, because a filesystem doesn’t receive the entire file in one go but in small chunks. Unfortunately, the Tahoe API for writing to parts of mutable files is completely broken and the Tahoe-LAFS developers aren’t interested in fixing it at this time. My solution was to not implement writing files and instead upload files using a separate upload script. A read-only filesystem is nice anyway because it protects against accidental deletions. In contrast to a FUSE filesystem, the upload script knows the file size beforehand and can upload the file in one go, so it can use immutable files.
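To illustrate the "one go" upload: the Tahoe-LAFS web API accepts a `PUT` to `/uri/$DIRCAP/FILENAME`, and without any mutable or format arguments the node stores the body as an immutable file. The sketch below builds such a request. It is not the actual upload script; the function names and the node address are assumptions.

```python
import os
import urllib.request
from urllib.parse import quote


def build_upload_request(node_url, dir_cap, filename, body, size):
    """Build a PUT request that stores `body` as a new file in a directory.

    `body` is an open binary file (or bytes) and `size` its length in bytes.
    """
    url = f"{node_url}/uri/{quote(dir_cap, safe='')}/{quote(filename, safe='')}"
    request = urllib.request.Request(url, data=body, method="PUT")
    # The size is known up front, so the whole file goes out in one request.
    request.add_header("Content-Length", str(size))
    return request


def upload_file(node_url, dir_cap, local_path):
    """Upload one local file (needs a running node and a writable dircap)."""
    size = os.path.getsize(local_path)
    with open(local_path, "rb") as body:
        request = build_upload_request(
            node_url, dir_cap, os.path.basename(local_path), body, size
        )
        with urllib.request.urlopen(request) as response:
            # The node replies with the capability of the new file.
            return response.read().decode()
```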

I’m looking forward to the progress of this project, so please keep us updated

Although a “writable” FS would be great, separating it off with an upload script allows many security (and access contention) issues to be avoided. It also means that anything someone uploads to their own drive allocation automatically becomes available over the WAN FS.

Thank you! Here’s an update for today:

The mount client now supports options in the standard Linux mount command format, which means it can be used in /etc/fstab. Documentation and example: tahoe-pyfuse3/README.md at 68f8ca5e52e549d82bca3a2e2e590df547b654e2 · Derkades/tahoe-pyfuse3 · GitHub
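For readers unfamiliar with the format: mount helpers receive their options as a single comma-separated string (the `-o` argument, which is also the fourth field of an /etc/fstab line), where each item is either a bare flag or a `key=value` pair. Here is a minimal sketch of parsing such a string. The option names `node_url` and `root_cap` are hypothetical placeholders; the real names are in the README linked above.

```python
def parse_mount_options(option_string):
    """Parse a mount-style option string into a dict.

    Bare flags such as "ro" map to True; "key=value" items map to strings.
    """
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, separator, value = item.partition("=")
        options[key] = value if separator else True
    return options


# Hypothetical option names, for illustration only.
example = parse_mount_options(
    "ro,node_url=http://127.0.0.1:3456,root_cap=URI:DIR2-RO:aaaa:bbbb"
)
```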

In addition, Debian packages are now available.

This week, I fixed some minor bugs and redid the Debian packages “properly” using debhelper. This is the first time I’m making Debian packages, so it was a good learning experience.

We will start using the filesystem and upload program in “production” soon!

We have started using the filesystem and upload program in production. It looks like it should be mostly bug-free at this point.

The repository has been split up into two:

This way they can have independent release schedules.

Wait wut… This project is somewhere between insane and genius! :rofl: I can’t wait for project updates :wink:

nice to see you got this fully working @Derkades , great work.

for anyone else interested in distributed filesystem, the following might be interesting:

Not much has changed (I’ve been busy playing Minecraft, oops) but I’ll post an update anyway. In fact, there isn’t much left to add or fix anymore regardless. We’ve been using it in production for a few weeks without issues. Performance is only bottlenecked by Tahoe-LAFS at this point, not my mount client.

The mount client now supports logging to syslog. I’ve also made some performance and code readability improvements. tahoe-upload is now much faster when the directory hasn’t changed at all: it needs only one request per directory, instead of one request per child in the directory as before.
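The one-request-per-directory idea can be sketched as follows. The Tahoe-LAFS web API returns a whole directory listing as JSON for `GET /uri/$DIRCAP?t=json`, including each child's type and size, so a single response is enough to decide which local files still need uploading. This is a sketch, not the tahoe-upload code: the function name is made up, and comparing by file size alone is an assumption about how changes are detected.

```python
import json


def files_needing_upload(listing_json, local_sizes):
    """Return local file names that are missing or differ in size remotely.

    `listing_json` is the body of GET /uri/$DIRCAP?t=json.
    `local_sizes` maps local file names to their sizes in bytes.
    """
    node_type, node = json.loads(listing_json)
    if node_type != "dirnode":
        raise ValueError("capability does not point to a directory")
    children = node.get("children", {})
    pending = []
    for name, size in sorted(local_sizes.items()):
        child = children.get(name)
        # Each child is a [type, info] pair, e.g. ["filenode", {"size": 123}].
        if child is None or child[0] != "filenode" or child[1].get("size") != size:
            pending.append(name)
    return pending
```

If the returned list is empty, the directory is unchanged and no further requests are needed for it.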

Tahoe-LAFS performs best with many nodes and it’s hard for people to build a large network on their own. There was a public grid in the past but unfortunately it was shut down because of abuse.

I was thinking, the average Level1Techs forum user is probably trustworthy enough, so what if we built a community storage network? Share some non-redundant storage, get back geo-redundant storage.

Would anyone be interested in this concept?

Like StorJ / Tardigrade among level1techs ?..

… or actual Tahoe-LAFS?

It would be Tahoe-LAFS, but yes, Tahoe-LAFS is sort of like Storj. Both use erasure coding. Both store data on a decentralized network of storage nodes. Storj depends on a satellite, Tahoe-LAFS depends on an introducer for discovery (however, unlike Storj, Tahoe-LAFS doesn’t rely on a satellite for storing file metadata!). Tahoe-LAFS doesn’t have the payment aspect. Tahoe-LAFS is a lot slower (because it is fully decentralized).

You can consider the Tahoe-LAFS network as running your own “private Storj network” without payments.

Sure, count me in ^^

It’s read-only, so that covers half the security, the other half being trust that what a user makes available is not malicious.

As long as you can register/connect to a known point, I don’t see a problem with a distributed LevelOneTechs Community Filesystem. I would imagine it would be quite useful for a lot of solutions and problem setups.

Well, Tahoe-LAFS itself is not read-only and has support for immutable and mutable files and directories. However, mutable files are still experimental and had some issues, so I chose not to support them in my filesystem.

Any file uploaded to or directory created on the Tahoe-LAFS network is assigned a unique URI (‘capability’) that also contains the encryption key. A user can share this URI with someone else to grant access to a file or directory (either a read-only URI or a URI with write access).
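The kind of access a capability grants is visible in the string itself: it starts with `URI:` followed by a type tag such as `CHK` (immutable file), `DIR2` (writable directory) or `DIR2-RO` (read-only directory). A minimal sketch that classifies a capability by its tag is shown below. The function name is made up, the tag list is deliberately not exhaustive, and the capability strings in the example are placeholders rather than real caps.

```python
# Type tags that grant write access (not an exhaustive list).
WRITABLE_KINDS = {"SSK", "MDMF", "DIR2"}
# Type tags that only grant read access; immutable caps can never be written.
READ_ONLY_KINDS = {"CHK", "LIT", "SSK-RO", "MDMF-RO", "DIR2-RO", "DIR2-CHK"}


def describe_cap(cap):
    """Classify a Tahoe-LAFS capability string by its type tag."""
    parts = cap.split(":")
    if len(parts) < 3 or parts[0] != "URI":
        raise ValueError("not a capability string")
    kind = parts[1]
    if kind not in WRITABLE_KINDS and kind not in READ_ONLY_KINDS:
        raise ValueError(f"unsupported capability type: {kind}")
    return {
        "kind": kind,
        "directory": kind.startswith("DIR2"),
        "writable": kind in WRITABLE_KINDS,
    }


# Placeholder strings, not real capabilities.
shared = describe_cap("URI:DIR2-RO:aaaa:bbbb")
owner = describe_cap("URI:DIR2:cccc:dddd")
```

Sharing the read-only form of a directory capability lets others browse and stream its contents without being able to change it.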

There is no security concern here. The worst that can happen is users using a lot of storage space without contributing anything back.