The Ultimate Home Server - Component: Knowledge Repository

I’ve tried evil mode in Emacs but found that while it covers the basic Vim workflow, it’s not great for more advanced things. I’ve even gone as far as using Spacemacs, but found I would fight the editor more than I’d like.

I am just starting to use an Intel NUC with VNC and getting Obsidian sorted on it. A security-focused friend likes it because of the Markdown support.

Feel like that’s the quickest and best way for me and my NUC has a job!

Well, the current state of my knowledge repository is distributed between OneNote and Markdown files written using Typora. I currently have most files stored in folders on an Unraid share. I definitely want to move everything to a unified interface.

My main issue is integrating physical media, basically typewritten and handwritten notes.

For a base file format, most of my stuff is stored in PDF, but I hate Adobe and working with PDFs. Has anyone used something based on XPS?

I use DokuWiki. No MySQL required; everything is just in plain and simple files I can back up quickly and easily. It might not really be a secure solution to put on the open internet, but for home, where I can access it locally or through a VPN, it is just fine.

For storing info about my desktop setup (what’s connected to what using what port), I use my own (open source) app: Connectatron: Open-Source node editor for planning networks of devices

It would work well for home server hardware documentation too! Much better than spreadsheets or pencil and paper.

I also use Obsidian. I’m using my own implementation of to organize my notes and link ideas together. I’m paying for sync until I get my self-hosted sync back up.

I also use MyMind to enter quick notes from the browser and to keep track of my bookmarks (I always forget to export them when changing distros lol) and articles I find interesting.

The interesting thing about their service is that they use AI to assist in searching through your bookmarks by adding ‘tags’ to your bookmark. So if I bookmark a spicy chicken recipe called ‘yummy yum recipe’, it should, in theory, be able to add the tag ‘chicken recipe’ to it. So when I’m hungry and searching for a chicken recipe, I will find that article when I search ‘chicken recipe’, even if I have forgotten the name of the article.

I have found it to be a double edged sword when you are looking for a specific article and the search terms bring up a billion results that do not get narrowed down. I may be missing a strict search toggle though… :man_shrugging:

Along with that I use Notion to create ‘manual pages’ for various projects and for Task and Project management using Thomas Frank’s Project Management Template.

For “Knowledge Repository” I still use old pen and ink and scan it using a Fujitsu ScanSnap S1300i. The ScanSnap Manager software by ABBYY creates searchable PDFs and is worth the price alone.
Caveats: Windows only; OCR is excellent on typewritten documents but mostly useless on handwritten items. I have not used the web services that come with it, so I can’t comment on them.

I use “Okular” for highlighting and adding pop-up notes to PDFs. It has a sidebar that shows you all highlights and notes (unfortunately organized only by the order they were added). You can also add bookmarks that show page numbers, and rename the bookmarks. It is the closest thing I have found to Adobe Acrobat Pro for markup purposes.

I… just realized it’s Vi. I litter text files in directories. Generally using SSH for access if I’m not physically around the set of machines I work and store data on. It does tend to keep my notes nestled right up next to the content they’re about.

I didn’t even realize I kept “inline” notes 'till I was watching the video.

I use Google Keep for ephemera. But that’s more along the lines of “buy the eggs” or “replace the serial USB adapter” where I might have a different device handy at the site.

So in school we used Zoho Notebook. It worked, but it wasn’t as good as dictating notes. It was AMAZING on iPads and Surfaces.

Honestly, I’ve been doing my best to find or augment an open source solution, but it’s just not holding up. I watched your video.

I do agree there is a need for this and the market is trash.

For books I wanted to run calibre BUT I ended up just sorting this out in my nextcloud. Now granted Wendell I am curating physical copies of my books.

Now granted I bought these books as well. I do wish to support the author.

All these are just tiny components, I think, of the vision you want to see, where you have a sort of grand concept of a cataloging server. It may have to work in a way where software integrates with an ingress server that can sort things: oh hey, this is a book, we should put this PDF on this server with this piece of software, etc.

There is this, but it’s literally in early beta.

Now yes, I realize the way it acquires that information is, well, very controversial, but it’s a decent-ish concept.

@wendell @SgtAwesomesauce I guess the first thing to do in this task is something that can feel overwhelming trying to pinpoint. If I were attacking this concept the first thing I would do is a very “engineer” thing to do.

Map out my knowns? What do I need? What do I want? What is the core functionality of such an integrated concept service that I need first. Then expand the diagram only after I finish the core. So to speak


This is something that I am still very much working on, and don’t have a good solution.

In terms of book archival, calibre is awesome. It’s not for your “active” books, like anything that you want to annotate, but rather for “clean” or “finished” copies of books or large documents. It has tons of plugins, and since it is written in Python and has the ability to add custom metadata columns, it is relatively easy to extend. For example, I have a script that lets me get the total word count of my library, or a subset of the library.
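A minimal sketch of what a library word count could look like. This is not the poster's script and does not use calibre's own API; it assumes the books have already been exported to plain-text files (for example with calibre's `ebook-convert` command-line tool) under one directory. The function names and the `*.txt` pattern are hypothetical choices for illustration.

```python
import re
from pathlib import Path

# Assumption: a "word" is any run of word characters. Crude, but predictable.
WORD_RE = re.compile(r"\w+")


def count_words(text: str) -> int:
    """Return the number of word-like tokens in a string."""
    return len(WORD_RE.findall(text))


def library_word_count(root: str, pattern: str = "*.txt") -> dict[str, int]:
    """Map every file under `root` matching `pattern` to its word count."""
    counts = {}
    for path in sorted(Path(root).rglob(pattern)):
        # Ignore undecodable bytes rather than failing on one odd file.
        text = path.read_text(encoding="utf-8", errors="ignore")
        counts[str(path)] = count_words(text)
    return counts
```

Calling `sum(library_word_count("/path/to/exported/books").values())` gives the total; passing a subdirectory as `root` would cover the "subset of the library" case.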

I’ve tried multiple things for web reading / reading on mobile, and have not found a great solution yet. Neither the built-in web GUI nor the third-party calibre-web is what I am looking for. The Android sync app was incredibly slow (and you have to “connect” to the server EVERY time), especially when trying to do searches. The OPDS feed from calibre-web is slightly better, but searching does not handle custom metadata, and OPDS does not sync reading progress afaik. Syncthing for the entire library did not work, as Syncthing + Android file APIs + an SD card is a recipe for never finishing scanning a large directory.

I still need something for book annotations (e.g. highlight, make notes, etc). But, really, I want onenote combined with like three or four other pieces of software.

I should be able to throw PDFs and other documents at it, and have it do OCR on them and autocategorize them. It should have organization and metadata (both tagging and standard fields for documents). It should have both search on titles/metadata, and full text search from the metadata. It should be self-hostable, with a web GUI and apps for both desktop and mobile platforms, with offline sync. For the mobile end (or touchscreen laptop end), having the capability to do drawing would be good, both for notes and diagrams. And perhaps it is a bit of a stretch, but handwriting recognition would be awesome (on the document ingestion end as well). But I also should be able to use it like a wiki, with text-based formatting (probably Markdown) files that can integrate pictures, diagrams, etc. And for editing, it needs to have a preview window, spellchecking (LanguageTool ideally), and a way to export to standard Markdown, so I can draft in the software and then put the final document into MkDocs.

The individual pieces all exist, and (mostly) exist as open source code (exceptions being good pdf editing and perhaps handwriting ocr?). But putting it all together is the trick. It’s basically the same thing @pfeiferj

For my task list, I am using Vikunja. Since it can act as a CalDAV server, other stuff should be able to integrate with it in the future. There are still some papercuts with it for me, but given the pace of development, that the author dogfoods it (e.g. using an instance as the roadmap), and that there is now a hosted version for financial support, I am hopeful that those will go away.

oooof. Do you have a good resource for getting a solid self host of this going?

I have this massive mess of it from my school and learning days. 465 books and counting

Is calibre well suited to this? How does it do at detecting stuff without metadata? Almost all of these are literally references I look back to and highlight (temporarily) and stuff, but I’ve never had a great time with this.

Note this is also an amalgamation from other friends going through school who needed somewhere to stash their shit too. So it needs work, but if I can move it out of Nextcloud to something purpose-built, that’s a better first step than where it is at.

See, I’ve always supported open source software for these tasks, but I’ve also said I would more than gladly pay for these things.

Wait, Readarr and calibre integrate? That’s actually nice.

I swear the components of what Wendell wants, and what we want, all exist. They just need repurposing and integration. The thing is, each component is kind of clunky and overweight for what it’s supposed to do, so to speak. Hmmmm.

In note taking, I prefer platforms with markdown compatibility.

  • Outline: For the team’s KB

  • Simplenote: For my personal note taking; Not satisfied with it yet, and I really haven’t found anything that meets what I’m looking for (full markdown support, OSS, version history, clients on all platforms), but it’s doing the job for now.

My “Knowledge Repository” is wiki.js, a wiki software where I write only Markdown pages, because the syntax is easy and, if ever necessary, I would be able to just move it over to another wiki that supports Markdown. It is synced to Gitea (a git server), so I can always just do a git clone before I do critical maintenance on the server that hosts wiki.js. On my Android phone I use GitJournal; it syncs with Gitea, automatically shows a Markdown preview, and stores the whole wiki on the phone so I can access it even when I have no internet. And the best thing is I can even do edits on the phone, and they sync back to Gitea and from there to wiki.js.
I have used it for less than 2 years, but a recent paper backup was over 130 pages long… I use it to document how my servers are set up, any software/Linux tricks and kung-fu I wouldn’t remember because I need them only occasionally, and life things like checklists for moving to another place or getting a new car. Whenever I search the internet for a solution to a problem and it takes longer than a few minutes to find, I add it to the wiki so the next time I need it I know exactly where to find it fast. For me this was such a life changer and time saver (in the long run), because before I had that info in various text files and couldn’t find it when I needed it.
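The "git clone before critical maintenance" step can be scripted. Below is a rough sketch, assuming `git` is installed and the wiki repository is reachable at some URL on the Gitea server. The function names, the `wiki-<stamp>.git` naming scheme, and the choice of `--mirror` (which copies all refs, where a plain `git clone` as described in the post would also do) are assumptions for illustration.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def backup_command(repo_url: str, backup_root: str, stamp: str) -> list[str]:
    """Build the git command for a timestamped mirror clone (hypothetical layout)."""
    dest = Path(backup_root) / f"wiki-{stamp}.git"
    return ["git", "clone", "--mirror", repo_url, str(dest)]


def run_backup(repo_url: str, backup_root: str) -> None:
    """Clone the wiki repository into a fresh timestamped directory."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(backup_command(repo_url, backup_root, stamp), check=True)
```

Keeping the command construction separate from the call makes it easy to print and inspect the command before running it against the real server.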

Then a password manager is also a must-have. I use KeePassXC, synced via Nextcloud to my phone and tablet, with a YubiKey for 2FA. I have way over 300 passwords in there. It makes it so easy to get a long and unique password for every site, which is so important now that so many user/password combinations are leaked from hacked sites, because I know a leaked password doesn’t give access to any other site. KeePassXC can also store notes (like how to log in or how long a premium account lasts) and files (e.g. a geli key for the old ZFS geli encryption), and it can even unlock SSH keys.

And third is a document management system. I chose paperless-ng because it has OCR, a good way to automatically tag documents, and good search filters to find documents easily. Before, I just placed my documents on my desk; then I had to clean up because someone was visiting, and I would just throw them in a big bag. When I needed a document I couldn’t find it, or searched for up to 1-2 hours. Now I get most documents in digital form (when they come via mail they get automatically imported), and the few I still get on paper are scanned periodically. So whenever I need a document I now find it in less than a minute.

And the last thing is an RSS reader. At least in tt-rss you can just star an article and it will never get automatically deleted (you could even set it up to never delete anything, but the database might get pretty big and slow). For RSS feeds that include the whole content of an article (some sadly only do headlines or a short introduction) this is a good way to save them practically forever.
And it’s a HUGE timesaver if you want to get the latest news from many sites.
For sites that don’t support RSS (sadly more and more, especially the social media ones) there is rss-bridge, which is really good at parsing a website and generating an RSS feed out of it, and it supports adding your own bridges for sites that aren’t supported natively.
Pro tip to track GitHub repos: append .atom to the URL of a project’s releases or commits page (for example, the releases page URL followed by .atom).
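The GitHub feed tip can be sketched as a small URL builder. This assumes the feeds follow the pattern of the releases, tags, or per-branch commits page URL with `.atom` appended; the function name and the default branch `main` are hypothetical and should be adjusted per project.

```python
def github_feed_urls(owner: str, repo: str, branch: str = "main") -> dict[str, str]:
    """Return Atom feed URLs for a GitHub repository (assumed URL pattern)."""
    base = f"https://github.com/{owner}/{repo}"
    return {
        "releases": f"{base}/releases.atom",
        "tags": f"{base}/tags.atom",
        # Commits feeds are per branch; older repos may use "master".
        "commits": f"{base}/commits/{branch}.atom",
    }
```

The resulting URLs can be pasted into tt-rss or any other feed reader like a normal RSS subscription.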

Some mentioned calibre here. As I left all my books at my parents’ house when I moved out, I have all books stored as ebook or PDF. I use calibre-web (instead of installing applications locally I try to run as many things as possible via Docker on my server and access them via browser, so I can also access them remotely), but I haven’t found a good way to connect my e-reader (Kobo Aura HD) to it via Wi-Fi. The third-party reader app KOReader can connect to calibre-web, but the search feature doesn’t work well and I think there’s no way to sync reading progress or notes. So I tend to use the web browser of my e-reader to access calibre-web and download a new ebook it hasn’t already stored locally.
But calibre-web just uses the default calibre database, so I could always just run calibre on my PC and sync it to the e-reader via USB, but I haven’t done that for years. And I also don’t read that many books, so I haven’t invested that much time into my calibre setup yet.


Ya, this thread is my jam. I have a reading and writing issue. I have notebooks (journals, brain dumps, meditations, task management) and 200+ books (only the keepers), all analog, that I would like to track better.

Currently I do it with a LibreOffice Calc sheet. But it would be nice to be able to pull down book stats, i.e. title, author, pages, genre, etc.

This person has some stats game, but too much data entry.

It would be nice to integrate with a wiki.

On the computer side of things, Gnote is an awesome place to brain dump. At this time it is mainly linux and computer based information.

Yeah, same here, vi is my principal input tool.

My biggest issue is not collecting stuff, it’s finding what I’ve written later. I collect/write code, scripts, txt, html, “office” docs (MS xls or Libre odt or whatever the platform supplies, WordPerfect probably somewhere), PostScript, PDFs, images, music/sound, videos, zips of above and on and on. What I’m really looking for is an indexer so that I can do a local search through all of that, plus my browser’s tabs and bookmarks, my shell history files and various system headers, man pages, etc.

Back in the early 2000s I set up Swish-E to crawl my machines, so I could search around, but it was a chore and I was never really satisfied with how it worked. Biggest problem was parsing and indexing non-text files, like pdfs and office-like proprietary formats.

Now that I’m looking, I’ve got about 2 million files in my personal archives, dating back about 40 years to when I started saving crap on disk (everything before that is on paper or tape or punch cards, gah!). I’m sure there is a lot of duplication in there, but where? Where’s that C or C++ code that computes training impulse that I wrote in 1992-5 or whenever it was?
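For the "lots of duplication in there, but where?" question, one common approach is to group files by size first and hash only the candidates that share a size. The sketch below assumes exact byte-for-byte duplicates are what matters (near-duplicates such as edited copies would not be caught); the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: str) -> list[list[Path]]:
    """Return groups of files under `root` that have identical contents."""
    # Pass 1: bucket by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            by_size[path.stat().st_size].append(path)

    # Pass 2: hash only files whose size collides with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

On an archive of around 2 million files the size pass keeps the expensive hashing limited to files that could plausibly match, though the first run would still take a while.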

I currently have ~2,500 items in my library, so yup, your amount would be fine.

By default it pulls in whatever is embedded inside the book. Sometimes that is just about everything, other times that can be nothing.

There are plugins to pull metadata from lots of different places (wikidata, barnes & noble, goodreads, etc).

Yeesh. I hear that. Usually if I’m searching for something I’ve got a shell with all the directories available abstracted to the system at hand. With computing power having gotten infinitely better over time it’s usually “grep -ir” or “find ./ | grep -i” recursively.
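For machines without a handy shell, the same recursive, case-insensitive search can be approximated in Python. This is a sketch in the spirit of `grep -ir`, not a replacement for it: the pattern is a Python regular expression, binary files are read as text with undecodable bytes ignored, and the name `grep_ir` is made up for this example.

```python
import re
from pathlib import Path
from typing import Iterator


def grep_ir(root: str, pattern: str) -> Iterator[tuple[str, int, str]]:
    """Yield (path, line number, line) for every case-insensitive match under root."""
    regex = re.compile(pattern, re.IGNORECASE)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            with path.open("r", encoding="utf-8", errors="ignore") as f:
                for lineno, line in enumerate(f, start=1):
                    if regex.search(line):
                        yield str(path), lineno, line.rstrip("\n")
        except OSError:
            # Unreadable file (permissions, vanished mid-scan): skip it.
            continue
```

Because it is a generator, results can be printed as they are found instead of waiting for the whole tree to be scanned.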

Young me would’ve shuddered at the prospect of the wait :).

Isn’t that just hoarding? How many of the 2500+ items have you touched or read?

This is like Smaug the dragon from The Hobbit.

First of all, this is my first post on the forum. Wendell’s video spurred me into making an account just so I could discuss this topic, so I’m sorry if my reply is somewhat inept.

I’m an academic (kinda, I’ve finished my Master’s, but I still live in an R&D world in my job), so taking notes and also annotating stuff is really important for me. Over the years I’ve been thinking about what I need for an organization system, and have come up with many of the same requirements as pfeiferj. I would say that I probably value PDF annotation above nearly everything else, since I need it basically all the time for annotating scientific articles. Also, since I’m mostly dealing with scientific articles, an included reference manager would be a must.

My dream solution would have as a starting point a system that allows for plain text annotations (I don’t trust the longevity of binary formats), with support for Markdown and at least basic LaTeX (numbered equations should really cover most of what I need). Ideally, this system could also be used as a layer on top of PDFs to allow for better annotation. Highlighting PDFs would also be nice to have. Finally, the ability to add and search tags is a must.

I recently started looking into possible solutions that satisfy all my needs, and took a basic look at the following software, categorized by adherence to the features I need:

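| Software | Open Source | Self-Hosted | Plain text Notes | PDF Annotation | Reference Manager | LaTeX Support | Markdown Support | Mobile | Pen Input | Writing OCR |
|---|---|---|---|---|---|---|---|---|---|---|
| Zettlr | y | y | y | n | y | y | y | n | n | n |
| Joplin | y | y | y | n | n | y | y | y | ? | ? |
| Xournal++ | y | y | y | y | n | y | n | y | y | ? |
| Polar | y | n | n | y | y | n | n | y | y | ? |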

You may notice that these options fall into different categories of software (text editors vs. note-taking apps). None of them really tick all of the boxes. This was just an initial search I performed, and I posted these here in hopes of receiving feedback from people who have used them, or at least starting a discussion on their suitability.

I’ve also recently looked at hypothesis, but haven’t had a chance to test it more thoroughly.

Finally, looking into the future, I would also like something that would allow me to include code into my notes. For what something like that might look like, take a look at Quarto.

Thank you, pfeiferj, for giving me a great starting point for this reply. Thank you, Wendell, for starting this whole discussion. I am so incredibly happy to find people with whom I can discuss these kinds of things, because they have been bugging me for years now.

P.S.: There are some more things I would need from an academic standpoint, but they are out of the scope of this thread. If anyone wants to discuss them, though, please do reach out to me.