Writing a ZFS like file system from scratch

walterbleach · March 10, 2024, 6:30am

I’m working on a project with a Raspberry Pi Pico (just a Cortex-M0+ microcontroller, no Linux). I need to store a bunch of important information that would not exceed double-digit megabytes in size.
I decided to use about 4 microSD cards in RAID 1. Checksums are just SHA-256 hashes, and data is encrypted using AES.
Each piece of data gets its own checksum, is encrypted, and then gets another checksum. Several of these pieces make up a file; the resulting file gets a checksum, is encrypted again, and then gets another checksum.
Each file will be written twice to each of the 4 microSD cards.

On boot, all checksums will be verified and any corrupted copies repaired.
This would protect against single-bit failures and drive failures.
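
A minimal sketch of that boot-time check, assuming each stored copy is simply a SHA-256 digest followed by the payload. The names `seal`, `is_valid`, and `scrub` are hypothetical, the "cards" are plain Python byte strings rather than real block devices, and encryption is left out entirely:

```python
import hashlib

DIGEST_LEN = 32  # size of a SHA-256 digest in bytes


def seal(payload: bytes) -> bytes:
    """Return the record to store: SHA-256 digest followed by the payload."""
    return hashlib.sha256(payload).digest() + payload


def is_valid(record: bytes) -> bool:
    """True when the stored digest matches the stored payload."""
    if len(record) < DIGEST_LEN:
        return False
    digest, payload = record[:DIGEST_LEN], record[DIGEST_LEN:]
    return hashlib.sha256(payload).digest() == digest


def scrub(replicas: list) -> int:
    """Overwrite every invalid copy with the first valid one.

    Returns the number of copies repaired and raises ValueError
    when no copy passes its checksum.
    """
    good = next((r for r in replicas if is_valid(r)), None)
    if good is None:
        raise ValueError("all replicas are corrupt")
    repaired = 0
    for i, record in enumerate(replicas):
        if not is_valid(record):
            replicas[i] = good
            repaired += 1
    return repaired
```

This only repairs copies that fail their checksum. Two copies that are both valid but hold different versions (for example after an interrupted update) would need a version number or similar to tell which one is newer.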

The assumption is that the device would boot regularly.

I don’t know much about filesystems and this is just something I came up with. I want this file system to be as robust as possible.
Furthermore, I have not yet taken sudden power loss into consideration. That could really corrupt the encrypted data beyond repair. I might use something like copy-on-write for that.
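
One way to picture the copy-on-write idea is a pair of slots where a new version is always written to the slot that does not hold the latest good data, so a write cut short by power loss leaves the previous version intact. This is only a sketch under assumptions: the slot layout (a 4-byte sequence number plus a SHA-256 digest in front of the payload) and the names `pack_slot`, `unpack_slot`, `read_latest`, and `write_update` are made up for illustration, and sequence-number wraparound is ignored:

```python
import hashlib
import struct

# Hypothetical slot header: 4-byte sequence number, then SHA-256 of (seq || payload).
HEADER = struct.Struct("<I32s")


def pack_slot(seq: int, payload: bytes) -> bytes:
    digest = hashlib.sha256(struct.pack("<I", seq) + payload).digest()
    return HEADER.pack(seq, digest) + payload


def unpack_slot(raw: bytes):
    """Return (seq, payload), or None when the slot is empty, torn, or corrupt."""
    if len(raw) < HEADER.size:
        return None
    seq, digest = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    if hashlib.sha256(struct.pack("<I", seq) + payload).digest() != digest:
        return None
    return seq, payload


def read_latest(slots: list):
    """Return the valid (seq, payload) with the highest sequence number, or None."""
    valid = [s for s in map(unpack_slot, slots) if s is not None]
    return max(valid, key=lambda s: s[0]) if valid else None


def write_update(slots: list, payload: bytes) -> int:
    """Write a new version without touching the slot that holds the latest data.

    Invalid slots are reused first, then the slot with the oldest sequence number.
    Returns the index of the slot that was written.
    """
    decoded = [unpack_slot(s) for s in slots]
    latest = read_latest(slots)
    next_seq = 0 if latest is None else latest[0] + 1
    target = min(
        range(len(slots)),
        key=lambda i: -1 if decoded[i] is None else decoded[i][0],
    )
    slots[target] = pack_slot(next_seq, payload)
    return target
```

If power is lost halfway through writing the target slot, its digest will not match, `unpack_slot` rejects it, and `read_latest` falls back to the older version in the other slot.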

Dutch_Master · March 10, 2024, 7:15am

You’d probably need to reconsider the time you spent on building this vs the accessibility of the data vs how secure that data really needs to be.

IMO, you’re reinventing the wheel, but not quite a round one

cowphrase · March 10, 2024, 7:20am

I don’t know much about filesystems …

Then I’d use something that’s already been written; filesystems have way too many gotchas. For checksums it’s best if you do them in your program and store them in or next to your files. Putting them in the filesystem is needlessly complex.

I’ve never used it before, but I would be looking at something like this. GitHub - littlefs-project/littlefs: A little fail-safe filesystem designed for microcontrollers

IMO I’d reassess if you need encryption, and just focus on physical security. As a simple example where are you storing the encryption key? Are you using the same key in all products, or doing it per product? You can also keep minimal logs locally and upload them if you have an internet connection.

ack · March 10, 2024, 7:25am

Checksum → encrypt → checksum that’s then nested inside another checksum → encrypt → checksum is an interesting choice. I would go with an authenticated encryption scheme that encrypts the data only once myself.

Nothing wrong with reinventing wheels, have to learn things somehow

Good luck!

rand0musername · March 10, 2024, 9:28am

This sounds like a nutty idea, but in a good way! What kind of performance are you expecting from your micro checksum array?

walterbleach · March 10, 2024, 3:02pm

littlefs seems to be perfect. I couldn’t wrap my head around the RAM consumption with large files, but littlefs seems to take care of that. This sounds great. There is even a Python wrapper for it.

walterbleach · March 10, 2024, 3:07pm

I’m designing a password manager. The encryption key is stored in my brain; it’s a password that unlocks everything. However, the encryption method that I’m using seems to be flawed, hence I’m encrypting it a bunch of times to make it hard to decrypt.

And it’s just a password manager for myself that I’ll also open-source so others can use it as well. I just couldn’t trust a microSD card with my passwords. I needed some extra reliability features.

cowphrase · March 10, 2024, 4:10pm

That sounds just like 3DES - do the okay thing three times and it comes out good. However it speaks a little badly of your system if you need it - encryption is basically a solved problem these days.

For good encryption find an existing library that does AES-GCM or AES-CBC (or AES-CTR). If you’re putting in a human-memory password, use that to decrypt a larger secret on disk.

ibreakthings · March 10, 2024, 5:02pm

No offense intended, but you don’t seem to have a good grasp on crypto and shouldn’t be rolling your own. Bitwarden, KeePassXC, etc. already exist as open source and have been battle tested.

And use a good password key derivation function (Argon2d, etc.). Don’t just take a raw password and use it as a crypto key.
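
A rough illustration of the difference between using a raw password (or its plain hash) as a key and running it through a key derivation function. Argon2 is not in the Python standard library, so this sketch swaps in PBKDF2-HMAC-SHA256 via `hashlib.pbkdf2_hmac`; the iteration count, the salt length, and the helper names `new_salt` and `derive_key` are assumptions, not a recommendation for any particular device:

```python
import hashlib
import os

ITERATIONS = 600_000  # assumed work factor; tune for the target hardware
SALT_LEN = 16         # assumed salt length in bytes


def new_salt() -> bytes:
    """Random per-vault salt; stored in the clear next to the encrypted data."""
    return os.urandom(SALT_LEN)


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Stretch a human-memorable password into a 32-byte key.

    PBKDF2-HMAC-SHA256 stands in here for Argon2, which would need a
    third-party library.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
```

The salt means two vaults with the same password still get different keys, and the iteration count makes each password guess expensive for an attacker. A plain SHA-256 of the password offers neither.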

walterbleach · March 10, 2024, 6:01pm

I already use KeePass; however, I want something that’s completely offline.
There is also the option to buy a Raspberry Pi Zero and install KeePass on it, but I’m not sure that would be very cost-effective.

I guess I should write everything in C, with microcontrollers there is also the limitation of crypto instructions in the processor.

jode · March 10, 2024, 6:03pm

Bitwarden/Vaultwarden.

walterbleach · March 10, 2024, 6:05pm

I’m using AES-ECB, which is not very good.

I’m already doing that, the password is hashed and encrypted with its raw hash as the key and then stored on the disk.

ibreakthings · March 10, 2024, 6:44pm

What is your threat model? You say you want 100% offline but KeePassXC (which is 100% offline) is not good enough?

cowphrase · March 11, 2024, 1:27am

Not very good … it’s a piece of cryptographic Lego that by itself is practically useless. If you haven’t seen it before, GitHub - robertdavidgraham/ecb-penguin: Demonstrating the famous ECB penguin so that you can repeat the process yourself.
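
The effect behind the penguin can be shown without any image: in ECB, every block is transformed independently with the same key, so identical plaintext blocks produce identical ciphertext blocks. The sketch below does not use AES (it is not in the Python standard library). As a loudly labeled stand-in, `toy_block_fn` is a truncated HMAC-SHA256, which is deterministic and keyed like a block cipher but cannot be decrypted. All function names here are hypothetical:

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes, matching AES


def toy_block_fn(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher: keyed and deterministic, NOT AES, NOT invertible."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ecb_like(key: bytes, data: bytes) -> list:
    """Process each block independently with the same key, as ECB does."""
    if len(data) % BLOCK != 0:
        raise ValueError("data length must be a multiple of the block size")
    return [toy_block_fn(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]


def ctr_like(key: bytes, nonce: bytes, data: bytes) -> list:
    """Counter-mode style: XOR each block with a keystream derived from nonce + counter."""
    out = []
    for n, i in enumerate(range(0, len(data), BLOCK)):
        pad = toy_block_fn(key, nonce + n.to_bytes(8, "big"))
        out.append(bytes(a ^ b for a, b in zip(data[i:i + BLOCK], pad)))
    return out


key = b"k" * 32
plaintext = b"A" * BLOCK * 3  # three identical plaintext blocks

ecb_blocks = ecb_like(key, plaintext)
ctr_blocks = ctr_like(key, b"nonce-01", plaintext)

# ECB-style output repeats wherever the plaintext repeats.
print(len(set(ecb_blocks)), "distinct block(s) from the ECB-like mode")
print(len(set(ctr_blocks)), "distinct block(s) from the CTR-like mode")
```

With the ECB-like mode, an attacker who sees the stored data learns which blocks are equal even without the key. That is exactly the structure that keeps the penguin visible.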

If you want an offline password manager look at Bitwarden or Vaultwarden. Host it locally and sync over VPN.

walterbleach · March 11, 2024, 2:44am

If someone managed to gain remote access to my machine, I don’t want them to be able to access my passwords. If the file is on my computer and can be unlocked there, that’s not going to work: keystroke logging is possible even through microphones, so getting the password to that KeePass database is not going to be hard.
I could store the KeePass database file on a thumb drive, but that is not reliable enough for me; they fail.
A device that could handle both security and reliability is the way to go IMO, and it can be detached from my computer which is always online and only attached when I actually need it.
There is also nothing wrong with re-inventing the wheel, I find it fun!

walterbleach · March 11, 2024, 2:51am

I didn’t find any reliability features that are built into these software products.

gnif · March 11, 2024, 3:10am

Not sure if anyone else pointed this out specifically, but the idea of doing a checksum, then encrypting, is flawed. An attacker will know that there is a checksum there and be able to automatically detect successful decryption of data simply by checking the checksum.

For my most sensitive information I store it in a custom-developed solution that works similarly to LUKS.

Secrets are encrypted with an unknown, completely random key. The random key is encrypted with my account’s public key, and can only be decrypted by my account’s private key. This means I can set up a multi-user system to access the same secrets, each user with their own keys.

My private key is encrypted with a password.

So to unlock the secret you need my private key + a password, and my auth details. The system then on my behalf decrypts the root secret, then decrypts the data I want to retrieve and gives it back to me, never do I have access to the actual root secret.

Care is taken to ensure this root secret is protected when it’s decrypted and nulled out of memory at the end. The client can only access the system over a network using an RPC, direct access to the box that houses the data is not permitted. etc.

Finally, the system that houses the data is running on an encrypted filesystem (LUKS) that needs to be unlocked each boot.

walterbleach · March 11, 2024, 3:13am

And I don’t want to spend $100 on another computer just to store my passwords; I should be able to do it on a $2 microcontroller. It’s not a far-fetched idea.

gnif · March 11, 2024, 3:14am

Not at all, just giving you ideas as to a working solution. What you’re talking about is essentially an HSM (Hardware Security Module).

walterbleach · March 11, 2024, 10:06am

Woah Good point!

All good points!

I was just responding to earlier posts mentioning that they don’t understand my reasoning.