One of the key challenges in working with Large Language Models (LLMs), open-source or otherwise, is their knowledge cutoff date. Their training data is static, meaning they are unaware of events, research, or custom data that emerged after their last training cycle.
How can we give an LLM access to up-to-date or proprietary information during a conversation?
One of the most effective solutions is Retrieval-Augmented Generation, or RAG.
The Challenge: An LLM’s Knowledge Gap
The Self-Alignment Framework (SAF) was completed and published on this website in January 2025. Because it is a new and specialized framework, there is no information about it on the public internet, and consequently it is not part of any existing LLM’s training data.
If you were to ask a standard AI chatbot, “What is the Self-Alignment Framework?”, you would likely receive a speculative or “hallucinated” answer. LLMs are designed to always produce a response, and without a definitive source to draw on, they often construct plausible-sounding but incorrect answers rather than stating they don’t know.
Our Solution: Implementing RAG for SAFi
This is precisely the problem that RAG is designed to solve. With a RAG architecture, you can establish your own knowledge base to serve as the single source of truth for the AI. For our SAFi chatbot, this ensures that the answers it generates are based exclusively on the official documentation of the framework.
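The pattern itself is simple: retrieve the most relevant passages from the knowledge base, then place them in the prompt with an instruction to answer only from that context. The sketch below illustrates the idea in minimal Python; the `retrieve` and `llm_complete` callables are hypothetical stand-ins for a retriever and a model API, not the actual SAFi code.

```python
# Minimal sketch of the core RAG pattern: retrieved passages are injected
# into the prompt so the model answers from them, not from training data.

def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that restricts the model to the supplied context."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str, retrieve, llm_complete) -> str:
    # retrieve() and llm_complete() are hypothetical stand-ins: any function
    # returning relevant passages, and any function calling a model API.
    passages = retrieve(question)
    prompt = build_grounded_prompt(question, passages)
    return llm_complete(prompt)
```

The key design point is the instruction to refuse when the context is silent: it is this constraint, not the retrieval alone, that suppresses hallucinated answers.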
Our initial RAG implementation is straightforward but effective:
Document Repository: We’ve created a knowledge base consisting of approximately 15 markdown files that detail the Self-Alignment Framework. These documents are indexed by their titles, allowing for efficient retrieval (a sketch of this indexing follows this list).
Custom Chat Interface: We developed a custom WordPress plugin to embed the chat interface you see on this website, providing a seamless user experience.
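To make the title-indexed repository concrete, here is a minimal sketch. It assumes the documents live in a local `docs/` directory and that each file’s first `# ` heading is its title; the directory name and the word-overlap scoring are illustrative assumptions, not the actual SAFi implementation.

```python
from pathlib import Path

def load_title_index(docs_dir: str = "docs") -> dict[str, str]:
    """Index each markdown file by its title (first '# ' heading, else filename)."""
    index = {}
    for path in Path(docs_dir).glob("*.md"):
        text = path.read_text(encoding="utf-8")
        title = next(
            (line.lstrip("# ").strip() for line in text.splitlines()
             if line.startswith("# ")),
            path.stem,  # fall back to the filename if no heading is found
        )
        index[title] = text
    return index

def retrieve_by_title(question: str, index: dict[str, str], k: int = 3) -> list[str]:
    """Rank documents by how many question words appear in their title."""
    words = set(question.lower().split())
    scored = sorted(
        index.items(),
        key=lambda item: len(words & set(item[0].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]
```

With roughly 15 documents, this kind of lightweight in-memory index is entirely sufficient; a vector database only becomes necessary as the corpus grows.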
This setup demonstrates the modularity of the SAFi engine, showcasing its ability to be expanded with powerful functions like RAG to serve various applications.
The Path Forward: Refining Our RAG System
The field of Retrieval-Augmented Generation is deep and multi-layered, and we are continuously learning and improving our implementation.
Chunking Strategy: One of the most critical aspects of RAG is “chunking”—the method of breaking down large documents into smaller, semantically relevant pieces for the LLM to process. We are actively refining our chunking strategy to improve the context provided to the model, which will directly enhance the quality and relevance of the chat’s responses.
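As one illustration of the trade-offs involved, the sketch below shows a common baseline strategy: fixed-size chunks with a character overlap, split on paragraph boundaries. The size and overlap values are placeholders for illustration, not the settings we use.

```python
def chunk_markdown(text: str, max_chars: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks, preferring paragraph boundaries.

    Sizes are illustrative; real systems tune them (often counting tokens
    rather than characters) against retrieval quality.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Flush the current chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # Seed the next chunk with the tail of the previous one so that
            # context spanning a chunk boundary is not lost.
            current = current[-overlap:]
        current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks  # note: a single oversized paragraph becomes its own chunk
```

The overlap is what preserves meaning across boundaries: a definition that straddles two paragraphs still appears intact in at least one chunk, which is exactly the context-quality problem we are tuning for.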
Open Source Project: We believe in transparency and collaboration. A Git repository for this specific RAG setup has been established and is available under the “saf” project on our official GitHub site.
As we continue to develop this system, the results you get from using the chat on this website will become increasingly accurate and helpful. We invite you to explore the SAFi chat and see our RAG implementation in action.