Knowledge Base LLM
Use a vector database to retrieve relevant source passages from a Knowledge Base (Prolog) and let an LLM use those sources to generate answers.
Create Knowledge Base
First we want to create a Knowledge Base (Prolog) that the LLM can reference when answering questions.
Building it this way also helps to
- Improve embedding accuracy for better retrieval results.
- Reduce the amount of text sent to the LLM, improving efficiency.
- Provide users with precise information sources.
- Split long passages of text that would otherwise exceed the context window.
Step by Step
- Tokenize the text (e.g. with tiktoken).
- Use LangChain's RecursiveCharacterTextSplitter to generate chunks of text.
- Create embeddings for all chunks.
- Initialize a vector database and store the embedding vectors in it.
- Use the RetrievalQA chain with an LLM and the vector database to generate answers to questions.
- Optionally, use RetrievalQAWithSourcesChain to make the answers even more reliable by citing their sources.
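The chunking step can be sketched in plain Python. This is not LangChain's actual implementation — RecursiveCharacterTextSplitter additionally tries to split on paragraph and sentence boundaries — but the core idea is the same: fixed-size windows with some overlap so that no chunk cuts context off completely.

```python
# Illustrative sketch only: a simplified fixed-size splitter with overlap.
# LangChain's RecursiveCharacterTextSplitter is smarter (it prefers
# paragraph/sentence boundaries), but the windowing mechanics match.

def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    with `overlap` characters shared between consecutive chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping the overlap
    return chunks

doc = "x" * 250
chunks = split_text(doc, chunk_size=100, overlap=20)
print(len(chunks))  # 4 chunks: window starts at 0, 80, 160, 240
```

In LangChain you would pass a tiktoken-based length function to the splitter so that chunk sizes are measured in tokens rather than characters, which is what the LLM's context window actually counts.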
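The embed-and-store steps can also be sketched without any external services. A real setup would use an embedding model and a dedicated vector database; here the embedding is faked with bag-of-words counts and the "database" is an in-memory list, just to show the store/query mechanics (add vectors, rank by cosine similarity).

```python
import math
from collections import Counter

# Illustrative sketch only: a toy in-memory "vector database".
# Real pipelines use a proper embedding model and store; the fake
# bag-of-words embedding below only demonstrates the mechanics.

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.entries = []  # list of (vector, chunk) pairs

    def add(self, chunk: str) -> None:
        self.entries.append((embed(chunk), chunk))

    def query(self, question: str, k: int = 1) -> list[str]:
        qv = embed(question)
        ranked = sorted(self.entries,
                        key=lambda e: cosine(qv, e[0]),
                        reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = VectorStore()
store.add("Prolog is a logic programming language.")
store.add("The capital of France is Paris.")
print(store.query("What is the capital of France?")[0])
# -> "The capital of France is Paris."
```

The retrieved chunks — not the whole knowledge base — are what gets handed to the LLM, which is where the efficiency gain above comes from.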
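Finally, a sketch of what a RetrievalQA-style chain does under the hood: it stuffs the retrieved chunks into a prompt together with the question and asks the LLM to answer, citing the sources it used. The LLM call itself is left out here — a real chain sends this prompt to the model — and the prompt wording below is a hypothetical example, not LangChain's actual template.

```python
# Illustrative sketch only: building a sources-citing QA prompt.
# The exact template LangChain uses differs; this shows the idea behind
# RetrievalQAWithSourcesChain (answer grounded in labeled sources).

def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """chunks is a list of (source_id, text) pairs, e.g. the top-k
    results of a vector-store query."""
    context = "\n".join(f"[{src}] {text}" for src, text in chunks)
    return (
        "Answer the question using only the sources below and cite "
        "the source ids you used.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What is the capital of France?",
    [("doc1-chunk3", "The capital of France is Paris.")],
)
print(prompt)
```

Because each chunk carries a source id, the model's answer can point users back to the exact passage it came from, which is what makes the with-sources variant more reliable.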