
Model Governance Layer

Govern Untrusted AI.

The fundamental challenge of enterprise AI isn't capability — it's trust. Large language models are powerful but untrusted workloads. The Model Governance Layer (MGL) is the boundary that lets you deploy them safely inside trusted healthcare and enterprise environments: enforcing policy at the infrastructure layer, not the application layer, so your AI stays compliant even when it misbehaves.

The Problem: Untrusted Workloads in Trusted Settings

A hospital system doesn't hand a new contractor unrestricted access to patient records on their first day. Yet most AI deployments do exactly that — granting LLMs broad access to organizational data and tools with governance enforced only at the prompt level, if at all. Prompt-level governance is fragile. It can be bypassed, forgotten, or simply wrong. The MGL takes a different approach: governance at the infrastructure layer, independent of the model itself.

The Regulated Environment Problem

Healthcare, finance, and government operate under strict compliance regimes — HIPAA, SOC 2, FedRAMP, ISO 27001. A generalist LLM knows none of this. It doesn't know which users can see which patient records. It doesn't know that your sales team isn't allowed to quote prices without legal review. It doesn't know that your radiology department uses different terminology than the rest of the hospital.

Left unmediated, AI erodes these boundaries.

The MGL Solution

The MGL sits between your users and your LLMs as a transparent, governed proxy. It knows your organization — roles, policies, data access rules, and behavioral expectations. It intercepts every request, decorates it with the appropriate identity and policy context, and governs every response before it reaches the user.

The model never touches credentials it shouldn't. Tools are presented only to agents that are authorized to use them. Every interaction is logged.

How It Works: The cllama Sidecar

The MGL is implemented as a cllama sidecar — a lightweight governance proxy that runs alongside your LLM deployment and intercepts all inference traffic. This is the same architecture used in the open-source clawdapus agent containment framework, adapted for enterprise and healthcare deployments.

  1. Identify — Resolve user identity, role, and group memberships. Establish who is asking and what policy applies.
  2. Govern — Apply behavioral contracts and mediate tool access. Only permitted tools and data reach the model.
  3. Deliver — Validate the response against MUST-priority rules before it reaches the user. Log everything at the proxy boundary.

The request lifecycle:

  1. Identity Resolution — The user's identity, role, and group memberships are resolved (Active Directory, LDAP, or custom IAM). The proxy establishes who is asking.
  2. Behavioral Contract Enforcement — Per-role behavioral contracts define what the model must and must not do. These are bind-mounted read-only at the infrastructure layer — the model cannot override them.
  3. Governed Tool Presentation — Only the tools and data sources appropriate for this identity are surfaced to the model. A radiologist sees imaging tools; a billing clerk does not. Tool scope is declared at the infrastructure layer, not in the prompt.
  4. LLM Interaction — The decorated, governed request is forwarded to the model. Real API credentials are held by the proxy — the model runtime never has direct key access.
  5. Response Governance — The response is evaluated against MUST-priority compliance rules before delivery. Non-compliant responses are retried automatically with corrective context (up to five attempts) before escalation.
  6. Audit Trail — Every inference transaction is recorded as a structured, append-only log at the proxy boundary — independent of and inaccessible to the model itself.
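The six steps above can be sketched as a minimal proxy pipeline. Everything here — the function names, the stub directory, the `PHI` check — is an illustrative assumption for the sketch, not the actual cllama API:

```python
from dataclasses import dataclass

@dataclass
class Identity:
    user: str
    role: str
    groups: list

ROLE_TOOLS = {  # step 3: hypothetical per-role tool scoping
    "radiologist": ["imaging_query"],
    "billing_clerk": ["invoice_lookup"],
}

audit_log = []

def resolve_identity(user: str) -> Identity:
    # Step 1: a real deployment would query AD/LDAP; stubbed here.
    directory = {"alice": ("radiologist", ["clinical"])}
    role, groups = directory[user]
    return Identity(user, role, groups)

def govern_request(identity: Identity, prompt: str) -> dict:
    # Steps 2-3: attach the behavioral contract and the scoped tool manifest.
    return {
        "prompt": prompt,
        "contract": f"MUST: follow {identity.role} policy",
        "tools": ROLE_TOOLS.get(identity.role, []),
    }

def validate(response: str) -> bool:
    # Step 5: placeholder MUST-rule check; real rules are policy-driven.
    return "PHI" not in response

def handle(user: str, prompt: str, call_model) -> str:
    identity = resolve_identity(user)           # 1. Identity Resolution
    request = govern_request(identity, prompt)  # 2-3. Contract + tool presentation
    response = call_model(request)              # 4. LLM call (keys stay in the proxy)
    if not validate(response):                  # 5. Response Governance
        response = "[escalated for compliance review]"
    audit_log.append({"user": user, "role": identity.role})  # 6. Audit Trail
    return response
```

The point of the shape: the model only ever sees what `govern_request` hands it, and the audit record is written by the proxy regardless of what the model does.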

Key Capabilities

Behavioral Contracts

Define what your AI must, must not, and should do — encoded as organizational policy rather than prompt engineering. Contracts compose: organization-wide rules, department-level overlays, and role-specific refinements stack cleanly. They're read-only at runtime. The model inherits them. It cannot modify them.
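Contract composition can be pictured as layered rule maps, where each layer only ever adds rules under its priority. The layer names and rule strings below are hypothetical examples, not a real contract schema:

```python
# Three hypothetical contract layers, from broadest to narrowest scope.
ORG = {"MUST": ["log all data access"], "MUST_NOT": ["expose PHI"]}
RADIOLOGY = {"MUST": ["use DICOM terminology"]}
ATTENDING = {"SHOULD": ["cite imaging guidelines"]}

def compose(*layers: dict) -> dict:
    # Stack layers: rules accumulate per priority, nothing is overridden.
    effective = {}
    for layer in layers:
        for priority, rules in layer.items():
            effective.setdefault(priority, []).extend(rules)
    return effective

contract = compose(ORG, RADIOLOGY, ATTENDING)
# The effective MUST list now carries both the org-wide and department rules.
```

Because composition is additive, a department overlay can tighten behavior but never silently delete an organization-wide MUST.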

Identity Trust

Every inference carries a verified identity context. The MGL resolves who is asking, what they're authorized to access, and what behavioral constraints apply to their role. This identity envelope travels through the governance layer — it's not something the user or the model can forge.
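One way to make an identity envelope unforgeable is to have the proxy sign it with a key the model never sees. This is a sketch of that idea using an HMAC; the key, field names, and wire format are assumptions, not the cllama format:

```python
import hashlib
import hmac
import json

# Illustrative only: in production this key lives in the proxy process
# and is never exposed to the model runtime or the user.
PROXY_KEY = b"held-by-the-proxy-only"

def seal(identity: dict) -> dict:
    # The proxy signs the resolved identity at the governance boundary.
    payload = json.dumps(identity, sort_keys=True).encode()
    sig = hmac.new(PROXY_KEY, payload, hashlib.sha256).hexdigest()
    return {"identity": identity, "sig": sig}

def verify(envelope: dict) -> bool:
    # Any downstream component can check the envelope was not altered.
    payload = json.dumps(envelope["identity"], sort_keys=True).encode()
    expected = hmac.new(PROXY_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig"])

envelope = seal({"user": "alice", "role": "radiologist"})
# Tampering with the role or user invalidates the signature.
```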

Governed Tool Presentation

Tools aren't just enabled or disabled — they're scoped. A data query tool might be available to all users, but its permitted datasets differ by role. The MGL compiles per-identity tool manifests at request time. The model sees only what it's allowed to reach.
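Per-identity manifest compilation might look like the following sketch, where the same `data_query` tool resolves to different datasets by role. Tool and dataset names are invented for illustration:

```python
# Hypothetical scoping table: tool -> role -> permitted datasets.
TOOL_SCOPES = {
    "data_query": {
        "radiologist": ["imaging_studies"],
        "billing_clerk": ["invoices"],
    },
    "imaging_viewer": {
        "radiologist": ["pacs"],
    },
}

def compile_manifest(role: str) -> dict:
    # Built at request time; the model sees only this manifest.
    manifest = {}
    for tool, scopes in TOOL_SCOPES.items():
        if role in scopes:
            manifest[tool] = {"datasets": scopes[role]}
    return manifest

compile_manifest("billing_clerk")
# A billing clerk gets data_query over invoices only; imaging tools never appear.
```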

Credential Starvation

The model never holds real API keys, database credentials, or service tokens. The proxy holds them. This is the credential starvation pattern from cllama: even if a model were to attempt unauthorized access, it lacks the credentials to succeed. Security at the infrastructure layer, not the trust layer.
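The pattern can be sketched in a few lines: the model emits an abstract tool call by name, and only the proxy attaches real credentials at execution time. The key store, tool name, and error handling here are illustrative assumptions:

```python
# Lives in the proxy process only; the model runtime never sees this dict.
REAL_KEYS = {"ehr_api": "sk-real-secret"}

def model_tool_call() -> dict:
    # The model can only *name* a tool; it holds no key material.
    return {"tool": "ehr_api", "args": {"patient": "1234"}}

def proxy_execute(call: dict, allowed_tools: list) -> str:
    # The proxy checks the manifest, then injects credentials at the boundary.
    if call["tool"] not in allowed_tools:
        raise PermissionError("tool not in this identity's manifest")
    key = REAL_KEYS[call["tool"]]
    return f"called {call['tool']} with key ending {key[-4:]}"
```

Even a model that fabricates a tool call fails at `proxy_execute`: without proxy cooperation there is simply no credential to use.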

Ambient Memory Plane

Context that persists across sessions — organizational knowledge, prior interaction summaries, user preferences — is managed by the governance layer, not the model. The model receives curated context injections. It cannot modify, exfiltrate, or selectively forget them. Memory is infrastructure-owned.
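A minimal sketch of infrastructure-owned memory: writes go through the governance layer, and the model receives only a read-only, curated slice. The class and method names are hypothetical:

```python
from types import MappingProxyType

class MemoryPlane:
    # Hypothetical governance-owned store; not the cllama implementation.
    def __init__(self):
        self._store = {}

    def remember(self, key, value):
        # Proxy-side write path; the model has no handle to this method.
        self._store[key] = value

    def curated_view(self, allowed_keys):
        # What the model receives: a read-only slice of permitted context.
        slice_ = {k: self._store[k] for k in allowed_keys if k in self._store}
        return MappingProxyType(slice_)

plane = MemoryPlane()
plane.remember("dept_glossary", {"CT": "computed tomography"})
view = plane.curated_view(["dept_glossary"])
# view is readable, but any attempt to assign into it raises TypeError.
```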

Compliance Retry Loop

When a response violates a MUST-priority rule, the MGL retries automatically with a targeted corrective prompt explaining the specific failure. This loop runs up to five times before escalating. The user sees a compliant response or a clear escalation — never a raw governance failure.
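The loop described above can be sketched as follows; the corrective-prompt wording and the `check` callback are illustrative assumptions, with only the five-attempt limit taken from the text:

```python
MAX_RETRIES = 5  # per the MGL's compliance retry policy

def governed_response(prompt: str, call_model, check) -> str:
    # `check` returns None when compliant, or a description of the violation.
    context = prompt
    for _ in range(MAX_RETRIES):
        response = call_model(context)
        violation = check(response)
        if violation is None:
            return response
        # Retry with a targeted corrective prompt naming the specific failure.
        context = f"{prompt}\n[CORRECTION] Your last answer violated: {violation}"
    raise RuntimeError("escalated: no compliant response after 5 attempts")
```

The user-facing guarantee falls out of the control flow: the function either returns a response that passed `check` or raises for escalation; a raw failing response is never returned.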

Built on Open Standards

The MGL is Flux Inc.'s production implementation of the cllama governance proxy standard, developed in collaboration with the clawdapus open-source project. These are MIT-licensed frameworks built for exactly this problem: deploying AI as an untrusted workload inside environments that require trust.

Why Open Standards Matter in Regulated Environments

Proprietary AI governance is a liability. When an auditor asks how your AI enforces HIPAA-covered data boundaries, "it's in the prompt" is not an acceptable answer. The MGL's governance architecture is based on open, auditable standards — behavioral contracts that can be reviewed, credential isolation that can be verified, and audit logs that can be independently inspected.

Flux contributes to these open standards and implements them for organizations that need production-grade deployment expertise.

Deployment Options

On-Premises

The MGL runs entirely within your infrastructure. No data leaves your environment. The governance proxy runs as a sidecar to your existing LLM deployment — whether that's a self-hosted model, a private Azure OpenAI endpoint, or an Ollama instance. Compatible with OpenAI and Ollama API formats.
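Because the sidecar speaks the OpenAI API format, pointing an application at it can be as simple as swapping the base URL. The URL and model name below are placeholder assumptions; no request is actually sent here:

```python
import json

# Assumed sidecar address; substitute your deployment's endpoint.
PROXY_URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI-format chat payload; the model name depends on what
# your self-hosted or Ollama deployment actually serves.
payload = {
    "model": "llama3",
    "messages": [
        {"role": "user", "content": "Summarize today's imaging queue"},
    ],
}
body = json.dumps(payload)
# An application POSTs `body` to PROXY_URL with its usual HTTP client;
# the sidecar resolves identity, applies the contract, and forwards upstream.
```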

Healthcare-Specific Deployment

Pre-configured behavioral contracts for HIPAA-adjacent contexts: patient data access controls, PHI boundary enforcement, role-based clinical tool scoping, and audit logging formatted for compliance review. Integrates with existing LDAP/Active Directory for identity resolution.

Enterprise Deployment

Role-based governance for finance, legal, HR, and operations contexts. Conversational rule management lets authorized administrators modify behavioral contracts through natural language — no developer required. Conflict detection prevents contradictory rules from reaching production.

Hybrid / Cloud

For organizations using cloud LLM providers, the MGL proxy sits at your perimeter — data is governed before it leaves, and responses are governed before they arrive. Real API keys stay in your infrastructure.

Ideal For

  • Radiology and clinical departments needing AI assistants that understand patient data access boundaries
  • Healthcare IT teams deploying LLMs under HIPAA with audit and compliance requirements
  • Legal and compliance teams that need AI operating under firm behavioral constraints
  • Financial institutions with role-based information barriers
  • Government and defense with clearance-based access requirements
  • Any organization that wants AI capability without surrendering governance

Deploy AI You Can Actually Trust

Ready to put governance at the infrastructure layer instead of hoping the prompt holds?