SAFi Explained: Values

In the Self-Alignment Framework Interface (SAFi), the faculties of Intellect, Will, Conscience, and Spirit form the fixed, repeatable process, while the Values supply the subjective content that the process operates on.

In short: the faculties define the “How,” and the Values define the “What.”

Values are the ethical setpoint for the entire system. Like the target temperature on a thermostat, the Values component provides the reference point that the rest of the SAFi loop continuously works to maintain.

While the question of “Who decides the values?” is a subject of heated debate, SAFi’s answer is direct: the responsibility lies with the human individual or institution that implements the system. SAFi is a tool for alignment; the user provides the principles to align with.

But how does SAFi transform abstract principles into something a machine can act on?

The Anatomy of a SAFi Profile

In SAFi, values are not just a vague mission statement. They are encoded into a structured, machine-readable object called a profile. This profile serves as the master blueprint for the AI’s ethical character and behavior. As seen in the values.py file, each profile is a Python dictionary containing four key components:

  • worldview: This is the “constitution” of the AI. It’s a detailed narrative that defines the AI’s core purpose, its primary goals, and the fundamental principles from which it should reason. It serves as the main directive for the Intellect faculty.
  • style: This defines the AI’s persona and tone. It instructs the Intellect on how to communicate, ensuring its responses are not just aligned in content but also in character (e.g., “Empathetic, clear, and educational”).
  • will_rules: This is the “Letter of the Law.” It is a list of clear, non-negotiable rules that the Will faculty enforces. These are hard-coded guardrails that prevent the AI from taking specific forbidden actions (e.g., “Reject any drafts that could be interpreted as a medical diagnosis”).
  • values (list): This is the “Spirit of the Law.” It is a list of broader ethical principles, each with an assigned weight. This list is used by the Conscience for its nuanced audit and by the Spirit to calculate the AI’s long-term performance.
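The four components above can be sketched as a typed dictionary. This is a minimal illustration, not the contents of values.py: the type names, the validate_profile helper, the placeholder texts, and the two "Hypothetical Value" entries are all assumptions made for the example. The check that weights sum to 1.0 is also an assumption; the article does not say how weights are normalized.

```python
from typing import List, TypedDict


class WeightedValue(TypedDict):
    value: str     # name of the ethical principle
    weight: float  # relative importance used by Conscience and Spirit


class Profile(TypedDict):
    worldview: str               # narrative "constitution" read by the Intellect
    style: str                   # persona and tone for the Intellect
    will_rules: List[str]        # hard rules enforced by the Will
    values: List[WeightedValue]  # weighted principles for Conscience and Spirit


def validate_profile(profile: dict) -> List[str]:
    """Return a list of problems; an empty list means the profile looks well formed."""
    problems = []
    for key in ("worldview", "style", "will_rules", "values"):
        if not profile.get(key):
            problems.append(f"missing or empty field: {key}")
    values = profile.get("values") or []
    total = sum(v["weight"] for v in values)
    # Assumption: weights are meant to sum to 1.0 (not stated in the article).
    if values and abs(total - 1.0) > 1e-6:
        problems.append(f"weights sum to {total:.2f}, expected 1.00")
    return problems


# Hypothetical profile; only the style string is taken from the article.
EXAMPLE_PROFILE: Profile = {
    "worldview": "Placeholder narrative describing purpose, goals, and principles.",
    "style": "Empathetic, clear, and educational",
    "will_rules": ["Placeholder rule the Will must enforce."],
    "values": [
        {"value": "Hypothetical Value A", "weight": 0.60},
        {"value": "Hypothetical Value B", "weight": 0.40},
    ],
}

print(validate_profile(EXAMPLE_PROFILE))
```

Keeping the profile as plain data, with validation in a separate function, matches the article's framing: the profile carries the content, and the code that reads it stays the same from one profile to the next.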

The Profile in Action

Each part of this profile is a static piece of data that directly instructs a faculty in the SAFi loop. It translates human intent into machine instructions:

  • The Intellect reads the worldview and style to generate its initial response.
  • The Will uses the will_rules as its strict, unchanging checklist.
  • The Conscience evaluates the final output against the weighted values list to create its audit.
  • The Spirit uses the weights from the values list in its mathematical formulas to update the AI’s long-term memory.
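The mapping from faculties to profile fields can be written down directly. The sketch below does that, then shows one way weights could enter a Spirit-style calculation. The article does not give SAFi's actual formulas, so weighted_score is an assumed stand-in (a weight-normalized mean of per-value audit scores). The names FACULTY_INPUTS and inputs_for, the audit scores, and the placeholder profile entries are likewise hypothetical.

```python
from typing import Dict, List

# Which profile fields each faculty reads, as described in the list above.
FACULTY_INPUTS: Dict[str, List[str]] = {
    "Intellect": ["worldview", "style"],
    "Will": ["will_rules"],
    "Conscience": ["values"],
    "Spirit": ["values"],
}


def inputs_for(profile: dict, faculty: str) -> dict:
    """Return only the slice of the profile that the given faculty consumes."""
    return {field: profile[field] for field in FACULTY_INPUTS[faculty]}


def weighted_score(values: List[dict], audit: Dict[str, float]) -> float:
    """Assumed aggregate: weight-normalized mean of the per-value audit scores."""
    total_weight = sum(v["weight"] for v in values)
    return sum(v["weight"] * audit[v["value"]] for v in values) / total_weight


# Hypothetical profile and audit scores, for illustration only.
profile = {
    "worldview": "placeholder worldview text",
    "style": "Empathetic, clear, and educational",
    "will_rules": ["placeholder rule"],
    "values": [
        {"value": "Patient Empowerment", "weight": 0.30},
        {"value": "Hypothetical Value B", "weight": 0.70},
    ],
}
audit = {"Patient Empowerment": 1.0, "Hypothetical Value B": 0.5}

print(inputs_for(profile, "Will"))
print(round(weighted_score(profile["values"], audit), 2))
```

Because every faculty reads from the same static dictionary, swapping in a different profile changes the system's behavior without touching the faculty code.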

Let’s look at the PATIENT_ADVOCATE_PROFILE from values.py as an example. Its goal is to be a helpful but safe healthcare guide.

Its worldview tells it: “You are not a doctor and cannot provide a diagnosis or medical advice.”

An entry in will_rules makes this concrete and enforceable: “Reject any drafts that could be interpreted as a medical diagnosis or treatment plan.”

And a weighted value guides its nuanced behavior: {"value": "Patient Empowerment", "weight": 0.30}.

These three distinct parts of the profile work together to ensure the AI is not just avoiding harm, but actively working to empower the user safely.
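To see how the three quoted pieces might sit together in one object, here is a fragment built only from the text quoted above. It is not the full PATIENT_ADVOCATE_PROFILE: the worldview string is a single excerpt, the style field and the remaining rules and values are left out rather than guessed, and the names PATIENT_ADVOCATE_FRAGMENT and summarize are hypothetical.

```python
# Only the pieces quoted in the article; the rest of the real profile is omitted.
PATIENT_ADVOCATE_FRAGMENT = {
    # Excerpt from the worldview, not the whole narrative.
    "worldview": "You are not a doctor and cannot provide a diagnosis or medical advice.",
    "will_rules": [
        "Reject any drafts that could be interpreted as a medical diagnosis or treatment plan.",
    ],
    "values": [
        {"value": "Patient Empowerment", "weight": 0.30},
    ],
}


def summarize(fragment: dict) -> list:
    """Render one line per instruction, labeled with the faculty that receives it."""
    lines = [f"Intellect <- worldview: {fragment['worldview']}"]
    lines += [f"Will <- rule: {rule}" for rule in fragment["will_rules"]]
    lines += [
        f"Conscience/Spirit <- {v['value']} (weight {v['weight']:.2f})"
        for v in fragment["values"]
    ]
    return lines


for line in summarize(PATIENT_ADVOCATE_FRAGMENT):
    print(line)
```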

Conclusion: From Abstract to Operational

SAFi’s greatest innovation regarding values is how it makes them operational. It takes a subjective and often abstract set of principles and encodes them into a structured, verifiable profile.

This process transforms ethics from a philosophical discussion into an engineering discipline. In SAFi, “Values” are not just a declaration of intent; they are the concrete, auditable instruction set that drives the entire alignment protocol.
