
Phi Gateway

Many models. One API.

The AI gateway for protected health data. For developers, one OpenAI-compatible base URL — no SDK rewrite, no provider keys in the application. For IT and compliance, a control point at the wire where every request is identified, policy-checked, credential-isolated, and recorded before it leaves your boundary.

Drop in and route

Phi Gateway speaks the OpenAI API. The application changes one line. The gateway handles identity, policy, provider keys, and audit.

from openai import OpenAI

# Same SDK. New base URL. No provider keys in the app.
client = OpenAI(
    base_url="https://phi.your-hospital.internal/v1",
    api_key="agent:abc123...",  # identity token issued by Phi Gateway
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",  # provider/model — any route policy allows
    messages=[
        {"role": "user", "content": "Summarize the findings in this report."}
    ],
)

Address any model as provider/model: openai/gpt-4o, anthropic/claude-sonnet-4, google/gemini-2.0-pro, or local/llama-3-70b. Phi Gateway resolves the route, swaps in the real provider credential, forwards the request, and records the transaction. The application never holds a provider key.
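A minimal sketch of how a provider/model string might be split before a route is resolved. The `Route` type and `parse_model_string` helper are hypothetical names used only for illustration, and splitting on the first slash is an assumption; Phi Gateway's actual resolver may behave differently.

```python
from typing import NamedTuple


class Route(NamedTuple):
    provider: str
    model: str


def parse_model_string(model: str) -> Route:
    """Split 'provider/model' on the first slash only (assumed convention)."""
    provider, separator, name = model.partition("/")
    if not separator or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return Route(provider, name)


print(parse_model_string("anthropic/claude-sonnet-4"))
print(parse_model_string("local/llama-3-70b"))
```

Splitting on the first slash only leaves any further slashes in the model name intact, which may matter for local model identifiers.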

How a request flows

01 Identify
Resolve caller identity from the bearer token. Establish which organization, role, and policy apply to this request.

02 Police
Apply per-role policy from Model Governance Layer: which prompts, tools, data classes, and routes are allowed.

03 Route
Swap the identity token for the real provider credential. Forward to the allowed upstream: BAA cloud, private cloud, or local LLM.

04 Record
Log the transaction at the boundary: caller, route, model, decision, latency, tokens, cost. Independent of the model. Suitable for compliance review.

The four mechanisms — policy, routing, credential isolation, audit — are not features layered on. They are the four phases of every request, enforced at the infrastructure boundary on the way out and on the way back.
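The four phases can be sketched as a single function. This is an illustration under stated assumptions, not the gateway's implementation: the in-memory tables, the `handle_request` function, and the `fake_upstream` stand-in are all hypothetical.

```python
import time

# Hypothetical in-memory stand-ins for gateway state (illustration only).
IDENTITIES = {"agent:abc123": {"id": "radiology-bot", "role": "radiology-assistant"}}
ALLOWED_ROUTES = {"radiology-assistant": {"anthropic/claude-sonnet-4", "local/llama-3-70b"}}
PROVIDER_KEYS = {"anthropic": "provider-key-held-by-gateway", "local": ""}
AUDIT_LOG = []


def handle_request(token, model, send_upstream):
    """Run one request through the four phases; return the reply, or None if denied."""
    started = time.monotonic()

    # 01 Identify: resolve the caller from the bearer token.
    caller = IDENTITIES.get(token)

    # 02 Police: check the requested route against the caller's role.
    role = caller["role"] if caller else None
    allowed = model in ALLOWED_ROUTES.get(role, set())

    # 03 Route: only here does the real provider credential enter the request.
    reply = None
    if allowed:
        provider = model.split("/", 1)[0]
        reply = send_upstream(model, PROVIDER_KEYS[provider])

    # 04 Record: allowed and denied requests are logged alike.
    AUDIT_LOG.append({
        "caller": caller["id"] if caller else "unknown",
        "route": model,
        "decision": "allow" if allowed else "deny",
        "latency_s": time.monotonic() - started,
    })
    return reply


def fake_upstream(model, provider_key):
    """Stand-in for the real provider call; no network traffic."""
    return f"reply from {model}"


print(handle_request("agent:abc123", "anthropic/claude-sonnet-4", fake_upstream))
print(handle_request("agent:abc123", "openai/gpt-4o", fake_upstream))
print([entry["decision"] for entry in AUDIT_LOG])
```

The denied request never reaches the provider-key lookup, and it still produces an audit entry.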

Cloud or on-prem

Same API surface, same SDK, two deployment shapes.

Phi Cloud

Flux-hosted. One endpoint, BAA with Flux, customer or Flux-pooled provider keys.

Built for evaluation, early deployments, and mid-size practices that want the control plane without operating it.

Phi On-Prem

Customer-deployed. Docker image, OVA, or Windows installer. Customer keys, customer perimeter, customer-chosen routes including local LLMs.

Built for hospitals, IDNs, regulated enterprises, and air-gapped sites where the trust boundary cannot move outside the network.

Phi Vault

The local store inside your perimeter — provider keys, identity tokens, the audit ledger, and optional de-identification mappings for workloads where reversible tokenization is sound. Ships with Phi On-Prem.

Not a magical PHI scrubber. Reversible tokenization works for structured extraction, classification, and controlled-schema report transforms where the model preserves tokens. Free-form chat that paraphrases or invents identifiers is out of scope. The vault is credible because its scope is explicit.
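A toy sketch of reversible tokenization, to make the scope concrete. The `TokenVault` class, the `[[PHI_0001]]` token format, and the sample text are assumptions for illustration; Phi Vault's real mapping format is not described here. Identifiers are passed in explicitly, since detecting them is a separate problem.

```python
import re


class TokenVault:
    """Toy reversible tokenizer; the mapping never leaves this object."""

    TOKEN_PATTERN = re.compile(r"\[\[PHI_\d+\]\]")

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def _token_for(self, value):
        token = self._token_by_value.get(value)
        if token is None:
            token = f"[[PHI_{len(self._token_by_value) + 1:04d}]]"
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return token

    def tokenize(self, text, identifiers):
        # Longest first so overlapping identifiers match greedily, in one pass.
        ordered = sorted((v for v in identifiers if v), key=len, reverse=True)
        if not ordered:
            return text
        pattern = re.compile("|".join(re.escape(value) for value in ordered))
        return pattern.sub(lambda m: self._token_for(m.group(0)), text)

    def detokenize(self, text):
        # Unknown tokens are left untouched rather than guessed at.
        return self.TOKEN_PATTERN.sub(
            lambda m: self._value_by_token.get(m.group(0), m.group(0)), text
        )


vault = TokenVault()
safe = vault.tokenize("Patient Jane Doe, MRN 12345, has a 4 mm nodule.", ["Jane Doe", "12345"])
print(safe)

# A controlled-schema reply that preserves the token can be restored.
structured_reply = '{"patient": "[[PHI_0001]]", "finding": "4 mm nodule"}'
print(vault.detokenize(structured_reply))
```

If the model paraphrases the token away, `detokenize` has nothing to match and the mapping cannot be restored, which is why free-form chat is out of scope.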

For developers

Drop-in compatibility

The OpenAI SDK works unchanged. Existing clients, agent frameworks, and internal tools point at one new base URL. Model strings move from gpt-4o to openai/gpt-4o so the gateway can route.
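For codebases with many bare model names, a small helper can ease the move to prefixed strings. This is a sketch only: `to_gateway_model` is a hypothetical helper, and defaulting bare names to the `openai` provider is an assumption that suits clients which previously called OpenAI directly.

```python
def to_gateway_model(model: str, default_provider: str = "openai") -> str:
    """Prefix a bare model name; leave already-prefixed strings unchanged."""
    if "/" in model:
        return model
    return f"{default_provider}/{model}"


print(to_gateway_model("gpt-4o"))
print(to_gateway_model("anthropic/claude-sonnet-4"))
```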

No provider keys in the application

Phi Gateway issues identity tokens scoped to your agent or service. Real provider keys live at the gateway. A compromised client has no provider key to leak; its identity token is useful only inside the boundary.
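A sketch of the credential swap at the gateway, assuming an upstream that accepts a standard `Authorization: Bearer` header, as the OpenAI API does. The `PROVIDER_KEYS` table and `upstream_headers` function are hypothetical, and other providers may expect a different header.

```python
# Hypothetical key store; in this sketch it exists only on the gateway side.
PROVIDER_KEYS = {"openai": "real-provider-key-held-by-gateway"}


def upstream_headers(incoming: dict, provider: str) -> dict:
    """Drop the caller's identity token and attach the provider credential."""
    headers = {
        name: value
        for name, value in incoming.items()
        if name.lower() != "authorization"
    }
    headers["Authorization"] = f"Bearer {PROVIDER_KEYS[provider]}"
    return headers


client_headers = {
    "authorization": "Bearer agent:abc123",
    "Content-Type": "application/json",
}
print(upstream_headers(client_headers, "openai"))
```

The identity token never travels upstream, and the provider key never travels back to the client.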

Audit and cost for free

Every request is logged with caller, model, tokens, latency, and cost. Operator dashboards surface spend per agent and per provider. The same JSON feeds external SIEMs, billing systems, and the Model Governance Layer (MGL) fleet view.
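A sketch of how spend per agent and per provider could be derived from JSON audit records. The field names (`caller`, `model`, `tokens`, `latency_ms`, `cost_usd`) and the sample values are invented for illustration; the actual log schema may differ.

```python
import json
from collections import defaultdict

# Made-up audit lines; field names and values are assumptions.
LOG_LINES = [
    '{"caller": "billing-agent", "model": "openai/gpt-4o", "tokens": 1200, "latency_ms": 840, "cost_usd": 0.25}',
    '{"caller": "radiology-assistant", "model": "anthropic/claude-sonnet-4", "tokens": 3100, "latency_ms": 1910, "cost_usd": 0.5}',
    '{"caller": "billing-agent", "model": "anthropic/claude-sonnet-4", "tokens": 800, "latency_ms": 620, "cost_usd": 0.125}',
]


def spend_by(records, key):
    """Sum cost_usd over records, grouped by key(record)."""
    totals = defaultdict(float)
    for record in records:
        totals[key(record)] += record["cost_usd"]
    return dict(totals)


records = [json.loads(line) for line in LOG_LINES]
per_agent = spend_by(records, lambda r: r["caller"])
per_provider = spend_by(records, lambda r: r["model"].split("/", 1)[0])

print(per_agent)
print(per_provider)
```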

Same shape as the open standard

Phi Gateway is built on the cllama credential-starvation proxy. The wire protocol, identity model, and audit log shape are the public cllama specification — applications built against Phi Gateway run against any cllama-compatible proxy.

For IT and compliance

Policy at the infrastructure layer

Per-organization and per-role rules authored in Model Governance Layer. Policy is enforced by infrastructure, not prompt — a billing workflow and a radiology assistant do not share permissions.
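A sketch of a per-role policy check, using the billing and radiology example. The `RolePolicy` shape, the role names, and the data-class labels are hypothetical; Model Governance Layer's actual policy format is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePolicy:
    routes: frozenset        # provider/model strings this role may call
    data_classes: frozenset  # data classes this role may send


# Hypothetical policies for two roles that must not share permissions.
POLICIES = {
    "billing-workflow": RolePolicy(
        routes=frozenset({"openai/gpt-4o"}),
        data_classes=frozenset({"billing"}),
    ),
    "radiology-assistant": RolePolicy(
        routes=frozenset({"local/llama-3-70b"}),
        data_classes=frozenset({"imaging-report"}),
    ),
}


def is_allowed(role: str, route: str, data_class: str) -> bool:
    """Deny by default: unknown roles, routes, or data classes are rejected."""
    policy = POLICIES.get(role)
    if policy is None:
        return False
    return route in policy.routes and data_class in policy.data_classes


print(is_allowed("radiology-assistant", "local/llama-3-70b", "imaging-report"))
print(is_allowed("billing-workflow", "local/llama-3-70b", "imaging-report"))
```

Because the check runs at the gateway, a prompt cannot talk its way into a route or data class its role does not hold.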

Audit-ready evidence

Caller, route, model, decision, latency, tokens, and cost recorded for every request. Append-only, independent of the model, suitable for HIPAA, PHIPA, or GDPR special-category review.

BAA path or your perimeter

Phi Cloud carries a BAA with Flux. Phi On-Prem keeps Flux out of the data path entirely — the gateway, the keys, the ledger, and the de-identification mappings all live inside your network.

Deployment forms you already operate

Docker container, OVA appliance, or Windows installer — the same packaging pattern used by the rest of the Flux medical fleet. No bespoke runtime, no proprietary orchestrator.