I use Python and Ollama: Python to load the files and Ollama as my model endpoint. If any guides reference OpenAI, you can still use this same structure.
Otherwise, the major libraries have integrations for Ollama:
https://docs.llamaindex.ai/en/stable/api_reference/llms/ollama/
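For example, pointing LlamaIndex at a local Ollama model is only a couple of lines. Rough sketch, assuming Ollama is running on the default port and you've already pulled the model (the model name and timeout are just placeholders):

```python
# pip install llama-index llama-index-llms-ollama
from llama_index.llms.ollama import Ollama

# Assumes `ollama serve` is running locally and llama3 has been pulled
llm = Ollama(model="llama3", request_timeout=120.0)
print(llm.complete("Summarize what RAG is in one sentence."))
```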
Now for the RAG component, you might want to look at these guides.
In terms of libraries, LangChain is great for experiments, but if you need it in production people tend to prefer LlamaIndex, which has an assortment of guides too:
https://docs.llamaindex.ai/en/stable/optimizing/production_rag/
https://docs.llamaindex.ai/en/stable/getting_started/concepts/
https://docs.llamaindex.ai/en/stable/use_cases/q_and_a/
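To make that concrete, a minimal LlamaIndex + Ollama RAG script looks roughly like this. It's a sketch, not production code; the directory path, model names, and question are placeholders, and it assumes you've pulled both models in Ollama:

```python
# pip install llama-index llama-index-llms-ollama llama-index-embeddings-ollama
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Point LlamaIndex at local Ollama for both generation and embeddings
Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Load files from a folder, build an in-memory vector index, then query over it
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=4)
response = query_engine.query("What do these documents say about pricing?")
print(response)
```

The production RAG guide linked above covers what you'd change from this baseline (persistent vector stores, chunking, evaluation, etc.).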
If you still want to go with LangChain, its documentation has some RAG tutorials as well.
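The LangChain version is similar in spirit. A rough sketch using the langchain-ollama integration; the file path, chunk sizes, model names, and question are placeholders, not anything from the official tutorials:

```python
# pip install langchain-community langchain-text-splitters langchain-ollama langchain-core
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

# Load a local file and split it into overlapping chunks
docs = TextLoader("notes.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed the chunks with Ollama and keep them in a simple in-memory store
store = InMemoryVectorStore.from_documents(chunks, OllamaEmbeddings(model="nomic-embed-text"))
retriever = store.as_retriever(search_kwargs={"k": 4})

# Retrieve the most relevant chunks and stuff them into the prompt
llm = ChatOllama(model="llama3")
question = "What do my notes say about the deadline?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```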
If you don't want to get too deep into coding, you can also use LM Studio and upload files into a chat, though if you want to use it for work you'll need to fill out a form.
Someone has posted their experiences with it.