To align SAFi with the spirit of community and transparency, I have switched the entire system to run on open source Large Language Models (LLMs) instead of proprietary commercial ones. The move presented unique challenges, but the trade-offs have been more than worth it: the result is a faster, more adaptable, and truly independent ethical reasoning engine.
A Huge Increase in Speed
SAFi is heavy on API calls, making up to four sequential calls for a single prompt, so latency and cost are major concerns. With commercial models, SAFi's response time was often 10 to 20 seconds. After switching to open source models, that time plummeted to under two seconds.
This is a massive improvement in user experience, making the interaction feel fluid and conversational.
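To illustrate why those four sequential calls dominate response time, here is a minimal sketch of a dependent call chain like SAFi's. The `ask` helper, the prompt text, and the way each faculty's output feeds the next are illustrative assumptions rather than SAFi's actual code; the model names and Groq endpoint are the ones discussed later in this post.

```python
# Minimal sketch of a sequential, dependent call chain (illustrative only).
# Because each call waits on the previous one, total latency is the SUM of
# the four model round-trips, which is why fast hosting matters so much.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # placeholder
    base_url="https://api.groq.com/openai/v1",
)

def ask(model: str, prompt: str) -> str:
    """Run one chat completion; each call adds its full round-trip latency."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

user_prompt = "Is it ethical to publish this report early?"
draft   = ask("openai/gpt-oss-120b", user_prompt)  # Intellect: reason about the prompt
checked = ask("llama-3.3-70b-versatile", draft)    # Will: evaluate the draft
audited = ask("openai/gpt-oss-20b", checked)       # Conscience: audit the decision
summary = ask("llama-3.1-8b-instant", audited)     # Summarizer: condense the result
```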
The Trade-Offs: Quirks and Configuration
Of course, the transition wasn’t without its challenges. Open source models are often less fine-tuned “out-of-the-box” compared to their commercial counterparts. They can be quirky, and their raw state requires careful testing and prompt engineering to get the configuration just right for SAFi’s complex reasoning loop.
The Open Source Advantage
Ultimately, the benefits of making this switch are clear:
- Accessibility: The models are freely available to download, inspect, and run, fostering a more transparent and collaborative development ecosystem.
- Performance: The speed increase is dramatic, bringing SAFi’s responsiveness on par with, and in some cases faster than, major commercial chatbots.
- Control: Direct access to the models allows for deeper and more granular fine-tuning, which is essential for a nuanced system like SAFi.
The Models and Vision for Hosting
We are currently using a mix of models, each selected for the specific faculty it serves:
- Intellect: `openai/gpt-oss-120b`
- Will: `llama-3.3-70b-versatile`
- Conscience: `openai/gpt-oss-20b`
- Summarizer: `llama-3.1-8b-instant`
Right now, I’m using Groq to host these models, which certainly helps explain the incredible speed increase. My long-term goal is to self-host the models. This would reduce latency even more and, critically, achieve true privacy by preventing data from flowing outside the network—a necessary step for running AI in highly sensitive fields such as healthcare, finance, and governance.
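As a sketch of where that's headed, the same client wiring can target a self-hosted, OpenAI-compatible server such as vLLM; the host address, port, and key below are placeholders, not a real deployment.

```python
# Minimal sketch: pointing the OpenAI client at a self-hosted,
# OpenAI-compatible inference server (e.g., vLLM). Values are placeholders.
from openai import OpenAI

local_client = OpenAI(
    api_key="unused-for-local",          # many local servers ignore the key
    base_url="http://10.0.0.5:8000/v1",  # an inference box inside your network
)
# Prompts and completions now stay inside the network, which is the
# privacy requirement for healthcare, finance, and governance work.
```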
How to Switch Models in Your SAFi Instance
One of the best parts of this transition is how simple it is to configure SAFi to use different models. If you’re running your own instance, you can swap out the LLMs with just a few steps.
1. Install the Provider’s Python Package
First, install the necessary Python client for your model provider. For example, if you were using Groq, you would install their package:
```bash
pip install groq
```
2. Update the Configuration Files
Next, modify a few lines of code to point SAFi to the new models and client.
In `config.py`: Update the model names to the ones provided by your new host.
```python
# Example for config.py
INTELLECT_MODEL = "openai/gpt-oss-120b"
WILL_MODEL = "llama-3.3-70b-versatile"
CONSCIENCE_MODEL = "openai/gpt-oss-20b"
SUMMARIZER_MODEL = "llama-3.1-8b-instant"
```
In `orchestrator.py`: Change the `OpenAI` client initialization to point to your new provider's API base URL and use the appropriate API key.
```python
# Example for orchestrator.py inside the SAFi __init__ method
# (requires `from openai import OpenAI` at the top of the file)
# --- After (Example for Groq) ---
self.groq_client = OpenAI(
    api_key=getattr(config, "GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1",
)
```
In `faculties.py` and `orchestrator.py`: Ensure that every call to `client.chat.completions.create` uses the new client you initialized (e.g., `self.groq_client`), as in the sketch below.
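To make that concrete, here is a sketch of what a faculty call might look like after the switch. The method name `run_intellect` and its `prompt` argument are hypothetical, not SAFi's actual code; only the client attribute and model constant follow the steps above.

```python
# Sketch of step 3 (method name and signature are hypothetical):
# every completion call now goes through the client built in __init__.
def run_intellect(self, prompt: str) -> str:
    response = self.groq_client.chat.completions.create(
        model=config.INTELLECT_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```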
That’s it. By changing just a few lines, you can harness the power and speed of different open source models, making SAFi not only a powerful ethical reasoning tool but also a flexible and future-proof platform.