I tried very hard to shorten that title. I wanted it accurate, descriptive, and free of buzzwords. What I’m building isn’t “AI for healthcare” or anything flashy — it’s a tool meant to help an average person survive the absolute nightmare that is working with insurance companies, armed only with their own medical records.
I’ve been unable to work since late 2023. My employer did more than they had to, genuinely, and long-term disability insurance was part of my benefits package for a reason. Working with the insurer was stressful but manageable… until January of 2024.
A new case manager was assigned mid-review. From the very first call, the message was clear — unstated, but unmistakable: my coverage would end, and they would find a “reasonable reason” to do it.
That was the start of this project.
The problem isn’t documents
I had documents. Lots of them.
Hundreds of files. Thousands of pages. And easily four times that number in accidental duplicates. “Save everything” is good advice, but reality is messy — providers resend the same thing in different formats, systems change, humans make mistakes.
I’ve got documents; as they stand, they are less useful, and much more uncomfortable, than toilet paper.
The real problem isn’t documentation.
It’s *context*, *narrative*, *focus*, and *engagement*.
Turning thousands of disconnected pages into something coherent, defensible, and human-readable is the hard part.
A concrete slice of the problem
So I started breaking the problem into smaller vertical slices, working when I could. Some days I can’t make progress at all. Other days I get a few hours.
Here’s one of the more concrete POCs to come out of that work.
I was given a single PDF, over 1,500 pages long. Multiple providers. Multiple templates. Multiple document styles. And I needed to answer one question. The answer was in there somewhere — maybe one sentence long.
This repo contains that proof of concept:
https://github.com/Yummyfudge/poc_embedding_query_engine
I embedded the document using BAAI/bge-large-en-v1.5, used a Qwen-based inference model, and wrote code that understood the shape of the question I was asking. It took me about two days to reliably get the answer.
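For flavor, here’s a minimal sketch of the retrieval half of that POC, assuming the PDF’s pages have already been extracted to plain text. The chunk sizes, query prefix, and function names here are illustrative, not lifted from the repo.

```python
# Minimal retrieval sketch, assuming pages are already plain text.
# Chunk size, overlap, and k are illustrative placeholders.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

def chunk_pages(pages, size=1000, overlap=200):
    """Split extracted page texts into overlapping character windows."""
    chunks = []
    for text in pages:
        for i in range(0, max(len(text), 1), size - overlap):
            piece = text[i:i + size]
            if piece.strip():
                chunks.append(piece)
    return chunks

def top_k(chunks, question, k=5):
    # bge-en-v1.5 models recommend this instruction prefix on queries.
    query = "Represent this sentence for searching relevant passages: " + question
    doc_vecs = model.encode(chunks, normalize_embeddings=True)
    q_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity: vectors are unit-normalized
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```

The top-ranked chunks then go to the Qwen-based model as evidence. The “code that understood the shape of the question” part is everything around this loop: date filters, provider filters, and deciding when the best score is too weak to trust.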
The harder part is turning that into something general.
This is not “just RAG”
What I’m building isn’t a chat system and it’s not “throw context at an LLM and hope.” It’s a query engine that deliberately separates responsibilities:
- LLMs for reasoning
- ML for representation
- Math for ranking
- Code for control
That separation is what might make it reliable. (In theory.)
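One concrete payoff of that split, sketched below with entirely hypothetical names: the control layer, plain code, gets veto power over the LLM. Ask the model to quote the sentence it relied on, then verify that the quote actually appears in the retrieved evidence before accepting the answer.

```python
# Hypothetical control-layer guardrail: code, not the LLM, decides whether
# the model's supporting quote actually exists in the retrieved chunks.
from difflib import SequenceMatcher

def is_grounded(quoted: str, evidence: list[str], threshold: float = 0.9) -> bool:
    """True if the model's quoted sentence (near-)appears in a retrieved chunk."""
    q = " ".join(quoted.split()).lower()
    if not q:
        return False
    for chunk in evidence:
        c = " ".join(chunk.split()).lower()
        if q in c:
            return True
        # Tolerate OCR noise: most of the quote must match one contiguous run.
        m = SequenceMatcher(None, q, c).find_longest_match(0, len(q), 0, len(c))
        if m.size / len(q) >= threshold:
            return True
    return False
```

A failed check means retry or answer “not found”, never a confident guess. I’m not claiming the repo does exactly this today; it’s one way the code-for-control layer can earn its keep.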
So far? I’d give the current state a C-. It answers some questions well and misses others embarrassingly. That’s honest.
Going upstream: document intake and OCR
Upstream from that work, I also built another POC focused on document intake and OCR:
https://github.com/Yummyfudge/poc-ocr
The goal there wasn’t just OCR, but normalization and comparison. Medical documents are messy; the same page rescanned or re-faxed comes out byte-different every time, so you can’t just hash the files. I ended up running documents through multiple OCR pipelines, assigning confidence scores, and using deterministic fingerprints on fuzzy-matched content.
The result was a system that could reliably accept or reject documents as duplicates, even when they weren’t byte-identical, across providers and vendors.
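In sketch form, with my own names and normalization rules rather than the repo’s: normalize hard enough that OCR noise collapses, fingerprint the normalized text deterministically, and keep a fuzzy fallback for near-misses.

```python
# Dedup sketch: normalization rules and thresholds are assumptions, not
# the POC's actual pipeline.
import hashlib
import re
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    """Collapse the variation OCR introduces: case, punctuation, whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())

def fingerprint(text: str) -> str:
    """Deterministic ID: byte-different scans of the same page converge
    on one digest once their text is normalized."""
    return hashlib.sha256(normalize(text).encode()).hexdigest()

def is_duplicate(a: str, b: str, threshold: float = 0.97) -> bool:
    """Exact fingerprint match first, fuzzy similarity as the fallback."""
    if fingerprint(a) == fingerprint(b):
        return True
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold
```

The pairwise fuzzy check here is the naive version; at any real scale you’d bucket by fingerprint first and only run the expensive comparison within suspect buckets.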
Why this matters (to me)
Insurance companies have teams of people whose job is to piece together context, narrative, and timing across thousands of pages.
Individuals don’t.
There is nothing available — at any price — that helps an individual take a mountain of paperwork and turn it into a coherent, evidence-backed appeal. Most existing tools focus on retrieval, summarization, or expensive enterprise analysis.
I want to build a tool that assists in crafting a strong narrative — pulling together multiple versions of the same puzzle piece, focusing the argument, and producing something human-readable and defensible.
Not to replace judgment.
Not to automate appeals.
But to give someone fighting for their benefits a fair chance.
Thanks for reading to the end, and for your consideration.