SAFi is built on the following mathematical objects:
Interaction Index: \(t\) represents the discrete interaction index (the turn number in a conversation)
Input Context: \(x_t\) captures the input context, including the prompt and associated metadata
Value Set: \(V = \lbrace(v_i, w_i)\rbrace\) represents our declared value set with corresponding weights, where \(\sum w_i = 1\)
Draft Response: \(a_t\) is the draft or answer generated by the Intellect
Will Decision: \(D_t \in \lbrace\text{approve}, \text{violation}\rbrace\) represents the Will’s decision
Reasoning: \(E_t\) contains the Will’s reason string explaining the decision
Conscience Ledger: \(L_t = \lbrace(v_i, s_{i,t}, c_{i,t})\rbrace\) is the conscience ledger, holding for each value:
– Score: \(s_{i,t} \in \lbrace-1, 0, +1\rbrace\) (or scaled values)
– Confidence: \(c_{i,t} \in [0,1]\)
Spirit Score: \(S_t \in [0,1]\) or \([1,10]\) measures spirit coherence for the current turn
Memory State: \(M_t\) stores memory of prior audits, profiles, and running aggregates
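As a rough illustration, the objects above could be carried around as small typed records. This is a minimal sketch, not SAFi's actual data model: the class names `Value` and `LedgerEntry`, the `Decision` alias, and the helper `validate_value_set` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical alias for the Will decision D_t.
Decision = Literal["approve", "violation"]


@dataclass(frozen=True)
class Value:
    name: str      # v_i
    weight: float  # w_i


@dataclass(frozen=True)
class LedgerEntry:
    value: str         # v_i
    score: int         # s_{i,t}, one of -1, 0, +1
    confidence: float  # c_{i,t}, in [0, 1]


def validate_value_set(values: list[Value], tol: float = 1e-9) -> None:
    """Check the constraint sum(w_i) = 1 on the declared value set."""
    total = sum(v.weight for v in values)
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights must sum to 1, got {total}")
```

Validating the weights once, when the value set is declared, keeps the later weighted sums interpretable as averages.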
Timing and Execution Model
– The Intellect and Will faculties run synchronously (the user waits for the response)
– The Conscience and Spirit faculties run asynchronously (background processing)
– Memory updates occur once background audits complete
Stage 1: The Intellect
The Intellect generates the initial response and reflection:
\(a_t, r_t = I(x_t, V, M_t)\)
where \(r_t\) is a short internal reflection.
Stage 2: The Will
The Will makes a binary decision, approve or violation:
\(D_t, E_t = W(a_t, x_t, V, r_t)\)
If \(D_t = \text{violation}\):
– Return a rejection message to the user
– Record minimal event: \(\lbrace t, x_t, a_t, D_t, E_t\rbrace\)
– Abort downstream stages for this turn
If \(D_t = \text{approve}\):
– Return \(a_t\) to the user immediately
– Enqueue background audit job: \(J_t = \lbrace t, x_t, a_t, V, M_t\rbrace\)
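The gating logic of Stages 1 and 2 can be sketched as a single function. This is a minimal sketch under assumptions: `handle_turn`, `REJECTION_MESSAGE`, and the dictionary keys are hypothetical, and the Intellect and Will are passed in as plain callables that stand in for the real faculties. The job is enqueued just before returning, which is non-blocking for an unbounded `queue.Queue`.

```python
import queue

# Hypothetical rejection text; the source does not specify the wording.
REJECTION_MESSAGE = "This response was blocked by the Will."


def handle_turn(t, x_t, values, memory, intellect, will, audit_queue):
    """Run the synchronous path for one turn and return (reply, event)."""
    a_t, r_t = intellect(x_t, values, memory)   # Stage 1: draft and reflection
    d_t, e_t = will(a_t, x_t, values, r_t)      # Stage 2: decision and reason

    if d_t == "violation":
        # Record the minimal event and skip all downstream stages.
        event = {"t": t, "x": x_t, "a": a_t, "D": d_t, "E": e_t}
        return REJECTION_MESSAGE, event

    # Approved: hand the audit job J_t to the background stages.
    audit_queue.put({"t": t, "x": x_t, "a": a_t, "V": values, "M": memory})
    return a_t, None
```

A separate worker would consume `audit_queue` to run the Conscience and Spirit stages, so the user never waits on them.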
Stage 3: The Conscience
For each value \(v_i\) in the value set \(V\), the Conscience evaluates:
\(s_{i,t},\ c_{i,t},\ q_{i,t} = G_i(a_t,\ x_t,\ v_i)\)
Here \(q_{i,t}\) is the rationale accompanying each score. The complete ledger is then composed as:
\(L_t = \lbrace(v_i,\ s_{i,t},\ c_{i,t},\ q_{i,t})\rbrace\)
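The per-value evaluation loop might look like the sketch below. It assumes that each \(G_i\) is an ordinary callable returning a `(score, confidence, rationale)` triple and that \(q_{i,t}\) is a plain string; `build_ledger`, `LedgerEntry`, and the `evaluators` mapping are hypothetical names for this example.

```python
from typing import Callable, NamedTuple


class LedgerEntry(NamedTuple):
    value: str         # v_i
    score: int         # s_{i,t}
    confidence: float  # c_{i,t}
    rationale: str     # q_{i,t} (assumed to be free text)


# Assumed shape of G_i: (a_t, x_t, v_i) -> (s, c, q)
Evaluator = Callable[[str, str, str], tuple[int, float, str]]


def build_ledger(a_t: str, x_t: str, values: list[str],
                 evaluators: dict[str, Evaluator]) -> list[LedgerEntry]:
    """Apply G_i to every declared value and collect the ledger L_t."""
    ledger = []
    for v in values:
        s, c, q = evaluators[v](a_t, x_t, v)
        if s not in (-1, 0, 1):
            raise ValueError(f"score for {v!r} must be -1, 0, or +1")
        if not 0.0 <= c <= 1.0:
            raise ValueError(f"confidence for {v!r} must lie in [0, 1]")
        ledger.append(LedgerEntry(v, s, c, q))
    return ledger
```

Rejecting out-of-range scores at this point keeps bad evaluator output from silently skewing the spirit score later.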
Stage 4: The Spirit
Spirit Score Computation
The spirit score aggregates weighted value assessments: \(S_t = \sigma\!\left(\sum_i w_i \cdot s_{i,t} \cdot \varphi(c_{i,t})\right)\)
Where:
– \(\sigma\) is a scaling function (identity or logistic)
– \(\varphi(c)\) downweights low-confidence rationales
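The aggregation can be sketched in a few lines. The defaults here are assumptions for illustration only: \(\sigma\) defaults to the identity and \(\varphi(c) = c\), one simple way to downweight low-confidence entries. The function names `spirit_score` and `logistic` are hypothetical.

```python
import math


def logistic(z: float) -> float:
    """One possible sigma: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def spirit_score(weights, scores, confidences,
                 sigma=lambda z: z, phi=lambda c: c) -> float:
    """Compute S_t = sigma(sum_i w_i * s_i * phi(c_i))."""
    z = sum(w * s * phi(c) for w, s, c in zip(weights, scores, confidences))
    return sigma(z)


# Example: three values, one affirmed, one neutral, one weakly violated.
raw = spirit_score([0.5, 0.3, 0.2], [1, 0, -1], [1.0, 0.5, 0.5])
squashed = spirit_score([0.5, 0.3, 0.2], [1, 0, -1], [1.0, 0.5, 0.5],
                        sigma=logistic)
```

Here `raw` evaluates to 0.4 up to floating-point rounding. With non-negative weights summing to 1, scores in \(\lbrace-1, 0, +1\rbrace\), and \(\varphi(c) \in [0,1]\), the raw sum lies in \([-1, 1]\), so reporting \(S_t\) on \([0,1]\) or \([1,10]\) needs a rescaling \(\sigma\) rather than the identity.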
Profile Vector and Moving Average
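The profile vector for turn \(t\): \(p_t = w \odot s_t\), where \(w\) and \(s_t\) are the vectors of weights \(w_i\) and scores \(s_{i,t}\), and \(\odot\) is the element-wise product.
The updated moving average: \(\mu_t = \beta \mu_{t-1} + (1-\beta)\, p_t\)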
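Both formulas are element-wise, so they can be sketched with plain lists. The function names `profile_vector` and `update_mean` are hypothetical, and the choice of \(\beta\) in the example is arbitrary.

```python
def profile_vector(weights, scores):
    """p_t = w ⊙ s_t: element-wise product of weights and scores."""
    return [w * s for w, s in zip(weights, scores)]


def update_mean(mu_prev, p_t, beta):
    """mu_t = beta * mu_{t-1} + (1 - beta) * p_t, applied per component."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * m + (1.0 - beta) * p for m, p in zip(mu_prev, p_t)]


p_t = profile_vector([0.5, 0.5], [1, -1])   # [0.5, -0.5]
mu_t = update_mean([0.0, 0.0], p_t, beta=0.9)
```

A \(\beta\) close to 1 makes the average move slowly, so a single unusual turn barely shifts the historical profile.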
Drift Calculation
Drift measures deviation from historical patterns: \(d_t = 1 - \cos_{\text{sim}}(p_t,\, \mu_{t-1})\)
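A direct sketch of the drift formula follows. The handling of zero vectors is an assumption of this example: cosine similarity is undefined when either vector has zero norm (for instance on the first turn, before any history exists), and the sketch returns 0 in that case. The source does not say how SAFi treats it.

```python
import math


def drift(p_t, mu_prev, eps=1e-12):
    """d_t = 1 - cos_sim(p_t, mu_{t-1})."""
    dot = sum(a * b for a, b in zip(p_t, mu_prev))
    norm_p = math.sqrt(sum(a * a for a in p_t))
    norm_mu = math.sqrt(sum(b * b for b in mu_prev))
    if norm_p < eps or norm_mu < eps:
        # Assumption: treat an undefined cosine as "no drift".
        return 0.0
    return 1.0 - dot / (norm_p * norm_mu)
```

Because cosine similarity ranges over \([-1, 1]\), \(d_t\) ranges over \([0, 2]\): 0 for a profile aligned with history, 1 for an orthogonal one, and 2 for a fully reversed one.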
Memory Update
The Spirit processes the audit results to update the system’s memory state: \(M_{t+1} = U(M_t, L_t, S_t, \mu_t, d_t)\)
Feedback to the Intellect
A simple, natural-language coaching note \(f_t\) is generated from the results of the update (specifically from \(S_t\) and \(d_t\)) to steer the Intellect in the next turn.
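One simple way to produce such a note is a pair of threshold checks. Everything specific here is hypothetical: the function name `coaching_note`, the thresholds `low_score` and `high_drift`, and the wording of the messages are illustrative choices, and \(S_t\) is assumed to be on the \([0,1]\) scale.

```python
def coaching_note(s_t: float, d_t: float,
                  low_score: float = 0.5, high_drift: float = 0.3) -> str:
    """Turn S_t and d_t into a short note f_t for the next turn."""
    notes = []
    if s_t < low_score:
        notes.append(f"Spirit score is low ({s_t:.2f}); "
                     "revisit the declared values before answering.")
    if d_t > high_drift:
        notes.append(f"Drift is high ({d_t:.2f}); "
                     "recent answers diverge from the historical profile.")
    if not notes:
        return "Recent responses are consistent with the declared values."
    return " ".join(notes)
```

The note would then be stored in memory and passed to the Intellect as part of \(M_{t+1}\).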
Type System and Function Signatures
The mathematical type discipline ensures system consistency:
– Intellect: \(I: (x_t, V, M_t) \rightarrow (a_t, r_t)\)
– Will: \(W: (a_t, x_t, V, r_t) \rightarrow (D_t, E_t)\), with \(D_t \in \lbrace\text{approve}, \text{violation}\rbrace\)
– Conscience: \(C: (a_t, x_t, V) \rightarrow L_t\)
– Spirit: \(S: (L_t, V, M_t) \rightarrow (S_t, d_t, \mu_t)\)
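These signatures translate naturally into type aliases that a static checker can enforce. The sketch below is one possible encoding, not SAFi's actual types: every alias name and the concrete representations (strings for contexts and drafts, a dictionary for memory) are assumptions. The Intellect and Will aliases follow the stage definitions, returning \((a_t, r_t)\) and \((D_t, E_t)\) respectively.

```python
from typing import Any, Callable, Literal

# Hypothetical concrete representations of the abstract objects.
Context = str                               # x_t
Draft = str                                 # a_t
Reflection = str                            # r_t
Values = list[tuple[str, float]]            # V as (v_i, w_i) pairs
Memory = dict[str, Any]                     # M_t
Ledger = list[tuple[str, int, float, str]]  # L_t as (v_i, s, c, q)
Decision = Literal["approve", "violation"]  # D_t

Intellect = Callable[[Context, Values, Memory], tuple[Draft, Reflection]]
Will = Callable[[Draft, Context, Values, Reflection], tuple[Decision, str]]
Conscience = Callable[[Draft, Context, Values], Ledger]
Spirit = Callable[[Ledger, Values, Memory], tuple[float, float, list[float]]]


def run_audit(conscience: Conscience, spirit: Spirit, a_t: Draft,
              x_t: Context, values: Values, memory: Memory):
    """Chain the background stages: the Conscience output feeds the Spirit."""
    ledger = conscience(a_t, x_t, values)
    return spirit(ledger, values, memory)   # (S_t, d_t, mu_t)
```

Writing the faculties against these aliases means a mismatch, such as a Will that forgets to return its reason string, can be caught by a type checker before runtime.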