Provider-Portable AI Architecture

Your path to AI-powered compliance without vendor lock-in. Use Amazon Bedrock today, Google Vertex AI tomorrow, Azure AI Foundry next week, or all at once.

5 Provider Adapters
1 Unified API
3 Routing Strategies
Zero Vendor Lock-in

The Strategic Challenge

You face a fundamental tension in AI adoption for regulatory compliance.

Immediate Value Delivery

You need to leverage cutting-edge AI capabilities for compliance automation, risk assessment, and regulatory intelligence today. Waiting means falling behind competitors and increasing regulatory exposure.

Strategic Flexibility

You require the freedom to choose the best provider for each workload, switch providers as pricing and capabilities evolve, and avoid lock-in that constrains future technology decisions.

Governance Requirements

Your organization demands consistent security, privacy, and audit controls regardless of which AI provider processes requests. Compliance cannot be an afterthought.

Cost Optimization

You want to route workloads to the most cost-effective provider without sacrificing quality. Different tasks demand different models, and your architecture should support intelligent routing.

The Model Gateway Architecture

Your applications never communicate directly with foundation model providers. All AI interactions flow through a Model Gateway you own and control.

Your Applications: Compliance Automation, Risk Assessment, Regulatory Monitor, RAG Copilot
    ↓
Model Gateway: Unified API Contract • Intent-Based Routing • Policy Engine • Prompt Registry • Safety Controls • Observability
    ↓
Providers: Amazon Bedrock, Google Vertex AI, Azure AI Foundry, OpenAI Direct, Private / Ollama

Why This Pattern Works

The Model Gateway acts as an anti-corruption layer between your business logic and external AI providers. This separation delivers three strategic advantages:

  • Provider Independence: Switch providers through configuration changes, not application rewrites
  • Centralized Governance: Apply consistent security, privacy, and audit controls across all AI interactions
  • Optimized Economics: Route each workload to the most cost-effective option automatically

Without Gateway → With Gateway:
  • Provider-specific code in apps → One API, any provider
  • Scattered governance → Centralized controls
  • Expensive provider switching → Configuration-based routing
  • Manual cost optimization → Intelligent auto-routing
  • Fragmented observability → Unified tracing and logs

Key Capabilities

RegRiskIQ delivers enterprise-grade AI governance through these integrated components.

Intent-Based Routing

Your applications specify what they need (regulatory analysis, risk scoring, document extraction) rather than which model to use. The gateway selects the optimal provider based on cost, latency, quality, and policy requirements.
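
To make the intent contract concrete, here is a minimal application-side sketch. The `gateway.complete()` client and its parameters are illustrative assumptions, not the actual RegRiskIQ API; the point is that the caller names an intent and constraints, never a provider or model.

```python
# Hypothetical gateway client: the method name and parameters are assumptions, not the real API.
async def analyze_filing(gateway, filing_text: str) -> str:
    response = await gateway.complete(
        intent="regulatory_analysis",           # what the application needs
        input_text=filing_text,                 # payload to analyze
        constraints={"max_latency_ms": 2000},   # optional routing hints
    )
    # The gateway, not the application, chose the provider and model.
    print(response.provider, response.model)
    return response.text
```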

Prompt Registry

Prompts become versioned, deployable artifacts stored in a central registry. Update prompts without changing application code. Test new versions before production rollout. Maintain audit trails of prompt changes.

Policy Engine

Enforce tenant isolation, data residency requirements, guardrails, and rate limits at the gateway level. Policies apply consistently across all AI interactions regardless of provider.
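
As a sketch of what gateway-level enforcement can look like, a policy check runs before any provider call. The `TenantPolicy` shape and `evaluate_policy` function below are assumptions for illustration, not RegRiskIQ's actual policy engine.

```python
# Minimal policy-evaluation sketch; names and fields are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class TenantPolicy:
    tenant_id: str
    requests_per_minute: int = 100
    allowed_providers: list[str] = field(default_factory=lambda: ["bedrock", "azure", "vertex"])
    blocked_terms: list[str] = field(default_factory=list)


def evaluate_policy(policy: TenantPolicy, provider: str, prompt: str, recent_requests: int) -> None:
    """Raise before any provider call if the request violates tenant policy."""
    if recent_requests >= policy.requests_per_minute:
        raise PermissionError("rate limit exceeded")            # would map to HTTP 429
    if provider not in policy.allowed_providers:
        raise PermissionError(f"provider {provider} not allowed for this tenant")
    if any(term in prompt.lower() for term in policy.blocked_terms):
        raise PermissionError("guardrail violation: blocked content")
```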

Provider Adapters

Each foundation model provider integrates through a dedicated adapter that normalizes request formats, response structures, error codes, and authentication patterns. Adding new providers requires only a new adapter.
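
The implementation status later in this document names a BaseModelProvider abstract interface; the sketch below assumes one plausible shape for that contract and shows where a concrete adapter would normalize a provider SDK response. The field names and the example adapter are illustrative.

```python
# Sketch of the adapter seam; this interface is an assumed shape, not the actual code.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class UnifiedResponse:
    text: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    provider: str


class BaseModelProvider(ABC):
    """Every provider adapter normalizes to this contract."""

    @abstractmethod
    async def complete(self, prompt: str, **kwargs) -> UnifiedResponse: ...


class ExampleProvider(BaseModelProvider):
    """Stand-in adapter showing where the SDK call and normalization would happen."""

    async def complete(self, prompt: str, **kwargs) -> UnifiedResponse:
        # In a real adapter this dict comes from the provider SDK (boto3, google-cloud-aiplatform, ...).
        raw = {"output": "(model output)", "usage": {"input_tokens": 120, "output_tokens": 250}}
        return UnifiedResponse(
            text=raw["output"],
            input_tokens=raw["usage"]["input_tokens"],
            output_tokens=raw["usage"]["output_tokens"],
            cost_usd=0.0,   # per-provider, per-model pricing lookup goes here
            provider="example",
        )
```

Because every adapter returns the same unified shape, routing, metrics, and policy code never see provider-specific payloads.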

Observability

OpenTelemetry instrumentation provides end-to-end visibility across gateway, adapters, and providers. Track token usage, costs, latencies, and error rates per tenant, per provider, and per use case.
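
A minimal sketch of how the gateway might attach these attributes with the OpenTelemetry Python SDK; the attribute names mirror the span attributes listed in the backlog (ai.intent, ai.provider, ai.model, ai.tokens.input, ai.tokens.output), while the helper function itself is illustrative.

```python
# Illustrative OpenTelemetry instrumentation; attribute names follow the backlog items below.
from opentelemetry import trace

tracer = trace.get_tracer("regriskiq.model_gateway")


def record_ai_call(intent: str, provider: str, model: str,
                   tokens_in: int, tokens_out: int, cost_usd: float) -> None:
    with tracer.start_as_current_span("ai.request") as span:
        span.set_attribute("ai.intent", intent)
        span.set_attribute("ai.provider", provider)
        span.set_attribute("ai.model", model)
        span.set_attribute("ai.tokens.input", tokens_in)
        span.set_attribute("ai.tokens.output", tokens_out)
        span.set_attribute("ai.cost_usd", cost_usd)
        # The actual provider call would run inside this span so latency is captured automatically.
```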

RAG Independence

Your retrieval pipeline operates independently from model providers. Switch inference providers without re-indexing document stores or modifying retrieval logic. Your knowledge base stays portable.

Routing Strategies

  • Cost: minimize spend while meeting quality thresholds (high-volume, non-critical workloads)
  • Performance: minimize latency for interactive experiences (real-time compliance Q&A)
  • Quality: maximize output quality for critical decisions (regulatory filing review)
  • Hybrid: balance all factors dynamically (the default for most workloads)
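
One way to picture these strategies is as weight profiles over normalized cost, latency, and quality scores. The sketch below is purely illustrative: the weights and scoring function are assumptions, not the routing engine's actual configuration.

```python
# Illustrative weight profiles and scoring; not the real routing configuration.
ROUTING_PROFILES = {
    "cost":        {"cost": 0.7, "latency": 0.1, "quality": 0.2},
    "performance": {"cost": 0.1, "latency": 0.7, "quality": 0.2},
    "quality":     {"cost": 0.1, "latency": 0.1, "quality": 0.8},
    "hybrid":      {"cost": 0.34, "latency": 0.33, "quality": 0.33},
}


def score_provider(profile: dict[str, float], cost: float, latency: float, quality: float) -> float:
    """cost, latency, quality are normalized to [0, 1]; higher score wins the route."""
    return (
        profile["cost"] * (1 - cost)
        + profile["latency"] * (1 - latency)
        + profile["quality"] * quality
    )
```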

Architectural Value Proposition

What the provider-portable architecture enables for your organization.

  • Unified: a single API for all providers
  • Isolated: provider-specific code contained in adapters
  • Flexible: routing by cost, performance, or quality
  • Observable: OpenTelemetry tracing across all requests

Strategic Advantages

Freedom of Choice

Evaluate and adopt new providers without application changes. Your business logic stays stable while AI capabilities evolve.

Optimized Economics

Route each workload to the most cost-effective option. Use premium models where quality matters, economical models where speed is sufficient.

Consistent Governance

Apply uniform security, privacy, and audit controls across all AI interactions. Meet regulatory requirements once, regardless of provider.

Future-Proofing

Architectural readiness for emerging models and providers. When the next breakthrough arrives, you adopt it through configuration.

The Hyperscalers Agree

AWS, Azure, and Google each publish reference architectures for this exact pattern—because they're competing to be YOUR abstraction layer.

AWS Reference Architecture

"Multi-Provider Generative AI Gateway" — Official AWS guidance for routing to Azure, OpenAI, and other providers through an AWS-hosted LiteLLM gateway on ECS/EKS.

AWS Solutions Library

Azure API Management

"AI Gateway" with native Bedrock support — Microsoft's answer: use Azure APIM to govern AWS Bedrock and non-Microsoft AI providers from your Azure control plane.

Microsoft Learn

Google Vertex AI

Model Garden with multi-provider serving — Google's unified platform supporting Anthropic Claude, Meta Llama, and partner models alongside Gemini.

Google Cloud

The Strategic Takeaway

Each hyperscaler wants to be your gateway to all the others. Our architecture gives you this pattern without the platform lock-in—your gateway runs where YOU choose, not where your cloud vendor prefers.

RegRiskIQ Implementation Status

Core infrastructure complete. Phase 2 enhancements in the backlog.

What's Been Built

Core Model Abstraction Layer

  • BaseModelProvider abstract interface with unified contract
  • 8 model capabilities tracked (chat, embeddings, function calling, streaming, vision, code generation, structured output, audio)
  • Built-in metrics: cost, latency, token usage, error rates

5 Provider Implementations

  • OpenAI: GPT-4, GPT-3.5, embeddings with streaming
  • Azure OpenAI: enterprise deployments, Managed Identity
  • AWS Bedrock: Claude, Titan, Llama, Cohere (13+ models)
  • Google Vertex: Gemini, PaLM, Model Garden
  • Ollama: local models, zero API cost

Intelligent Routing Engine

  • 8 routing strategies with configurable weights
  • Quality scoring for 30+ models
  • Automatic fallback chains for high availability
  • Response caching (TTL-based)

Phase 2 Enhancements

Provider Equivalence Testing (Priority: HIGH)

  • Golden evaluation datasets (50+ queries per intent)
  • Batch evaluation runner with quality metrics
  • Provider comparison reports with go/no-go recommendations

Business Value: Enables confident provider switching with quality assurance

Multi-Tenant Gateway (Priority: MEDIUM)

  • Tenant ID extraction from JWT claims
  • Per-tenant rate limiting and provider preferences
  • Tenant isolation for compliance requirements

Business Value: Enterprise-ready multi-tenancy for SaaS deployment

Data Residency Routing (Priority: MEDIUM)

  • Region-based provider filtering
  • Per-tenant allowed_regions configuration
  • Audit logging of residency decisions

Business Value: GDPR/data sovereignty compliance for regulated industries

Architectural Trade-offs

Provider portability is real. But it requires intentional design and ongoing investment.

What We Navigate

Provider Differences Are Real

Tool calling semantics, JSON output reliability, token limits, streaming formats, and content safety features vary across providers. Our adapter layer handles this complexity so your applications stay clean.

Gateway Adds Latency

Every abstraction has overhead. The gateway layer adds processing time for routing, policy evaluation, and request normalization. For latency-sensitive workloads, this overhead must be measured and tuned against your specific use cases.

Testing Requires Investment

Proving portability demands evaluation harnesses, golden datasets, and quality metrics across providers. We build this infrastructure as a first-class capability.

How We Mitigate

Adapter Test Harness

Each adapter includes a compatibility test suite that validates behavior against provider-specific edge cases. New provider integrations pass this harness before production.

Performance Budgets

Gateway components are designed with latency budgets in mind. We instrument each stage (policy evaluation, prompt resolution, routing) with OpenTelemetry tracing to identify and address bottlenecks. Specific targets are established during implementation based on measured baselines.

Continuous Evaluation

Automated quality regression tests run against all providers weekly. You get scorecards that answer "can I switch to provider X?" based on real data.

Implementation Roadmap

A structured approach to implementing the Provider-Portable AI Architecture in RegRiskIQ.

Phase 1: Foundation (Weeks 1-3)
Phase 2: Quality Assurance (Weeks 4-6)
Phase 3: Governance (Weeks 7-10)

Provider-Portable AI Architecture Implementation

Enable RegRiskIQ to leverage multiple foundation model providers without vendor lock-in

#9114 Epic

This epic delivers a complete provider-portable architecture enabling RegRiskIQ to use AWS Bedrock, Google Vertex AI, Azure AI Foundry, OpenAI, and Ollama interchangeably. The architecture separates AI consumption from AI provision through a Model Gateway pattern with centralized governance, observability, and quality assurance.

Centralized Prompt Registry

Version-controlled prompt management with provider-specific variants

#9115 Phase 1

Establish a centralized registry for managing AI prompts as versioned, deployable artifacts. This enables A/B testing, rollback capability, and provider-specific optimizations.

Create prompt_registry database schema
#9122
Acceptance Criteria (BDD)

Given a new RegRiskIQ database migration

When the migration is applied

Then a prompt_registry table exists with columns: id, name, version, template, provider_variants, created_at, is_active

And the name and version columns share a composite unique constraint

And provider_variants is a JSONB column supporting arbitrary provider keys
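
A hedged sketch of the table these criteria describe, expressed as a SQLAlchemy model. The actual migration tooling and exact column types are implementation choices left open by the criteria.

```python
# Sketch of the prompt_registry table from the acceptance criteria; types are assumptions.
from sqlalchemy import Boolean, Column, DateTime, Integer, String, Text, UniqueConstraint, func
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class PromptRegistry(Base):
    __tablename__ = "prompt_registry"
    __table_args__ = (UniqueConstraint("name", "version", name="uq_prompt_name_version"),)

    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    version = Column(Integer, nullable=False)
    template = Column(Text, nullable=False)
    provider_variants = Column(JSONB, nullable=False, server_default="{}")  # arbitrary provider keys
    created_at = Column(DateTime(timezone=True), server_default=func.now())
    is_active = Column(Boolean, nullable=False, default=False)
```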

Implement Prompt Registry CRUD API
#9123
Acceptance Criteria (BDD)

Given an authenticated API request

When I POST to /api/prompts with a valid prompt template

Then a new prompt is created with version 1

And the response includes the prompt ID and version

Given an existing prompt "regulatory_analysis"

When I PUT to /api/prompts/regulatory_analysis with updated content

Then a new version is created (immutable versioning)

And the previous version remains accessible

Add provider-specific prompt variants
#9124
Acceptance Criteria (BDD)

Given a prompt with provider_variants for "openai" and "anthropic"

When the ModelManager requests the prompt for provider "anthropic"

Then the anthropic-specific variant is returned

And Claude-specific syntax (Human:/Assistant:) is applied

Given a prompt without a variant for provider "bedrock"

When the ModelManager requests the prompt for provider "bedrock"

Then the default template is returned
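
The fallback rule in these criteria reduces to a one-line lookup. The PromptRecord shape below is an assumption for illustration; only the resolution behavior comes from the acceptance criteria.

```python
# Variant resolution with default-template fallback; data shapes are illustrative.
from dataclasses import dataclass, field


@dataclass
class PromptRecord:
    template: str
    provider_variants: dict[str, str] = field(default_factory=dict)


def resolve_template(prompt: PromptRecord, provider: str) -> str:
    # Use a provider-specific variant when one exists (e.g. "anthropic"),
    # otherwise fall back to the default template (e.g. for "bedrock").
    return prompt.provider_variants.get(provider, prompt.template)


prompt = PromptRecord(
    template="Analyze the following regulation: {text}",
    provider_variants={"anthropic": "\n\nHuman: Analyze the following regulation: {text}\n\nAssistant:"},
)
assert resolve_template(prompt, "anthropic").startswith("\n\nHuman:")
assert resolve_template(prompt, "bedrock") == prompt.template
```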

Implement prompt version rollback
#9125
Acceptance Criteria (BDD)

Given a prompt "risk_scoring" with versions 1, 2, and 3 (active)

When I POST to /api/prompts/risk_scoring/rollback with version=2

Then version 2 becomes the active version

And an audit log entry is created with the rollback details

And subsequent AI requests use version 2

Task-Intent Routing

Route AI requests based on task intent, not just cost/performance

#9116 Phase 1

Extend the ModelManager to understand task intent (regulatory_analysis, risk_scoring, etc.) and route to the optimal provider/prompt combination based on intent-specific requirements.

Define TaskIntent enum
#9126
Acceptance Criteria (BDD)

Given the model_manager.py module

When I import TaskIntent

Then the enum contains: REGULATORY_ANALYSIS, RISK_SCORING, POLICY_SEARCH, COMPLIANCE_CHECK, DOCUMENT_SUMMARIZATION

And each intent has a string value matching its lowercase name
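
A minimal sketch of the enum these criteria describe; subclassing str is an implementation choice, not part of the requirement.

```python
# Sketch of TaskIntent as described in the acceptance criteria above.
from enum import Enum


class TaskIntent(str, Enum):
    REGULATORY_ANALYSIS = "regulatory_analysis"
    RISK_SCORING = "risk_scoring"
    POLICY_SEARCH = "policy_search"
    COMPLIANCE_CHECK = "compliance_check"
    DOCUMENT_SUMMARIZATION = "document_summarization"


assert TaskIntent.RISK_SCORING.value == "risk_scoring"  # string value matches lowercase name
```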

Extend TaskContext with intent fields
#9127
Acceptance Criteria (BDD)

Given a TaskContext dataclass

When I create a new TaskContext

Then I can specify intent, data_classification, compliance_framework, and tenant_id

And all new fields are optional with sensible defaults

And existing code using TaskContext continues to work without modification
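
A sketch of the extended dataclass. The new field names come from the criteria above; the existing field and the chosen defaults are assumptions for illustration.

```python
# Illustrative TaskContext extension; every new field is optional so existing callers keep working.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskContext:
    # Existing field (illustrative stand-in for whatever TaskContext already carries).
    task_description: str = ""
    # New optional fields with sensible defaults:
    intent: Optional["TaskIntent"] = None       # TaskIntent as sketched earlier
    data_classification: Optional[str] = None
    compliance_framework: Optional[str] = None
    tenant_id: Optional[str] = None
```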

Implement intent-to-prompt mapping
#9128
Acceptance Criteria (BDD)

Given a prompt registered with intent "regulatory_analysis"

When an AI request is made with TaskIntent.REGULATORY_ANALYSIS

Then the ModelManager automatically selects the correct prompt

And the provider-specific variant is applied based on the selected provider

Log routing decisions for audit
#9129
Acceptance Criteria (BDD)

Given any AI request processed by the ModelManager

When a provider and prompt are selected

Then an audit log entry is created with: timestamp, intent, selected_provider, selected_model, prompt_id, prompt_version, routing_strategy, tenant_id

And the audit log is queryable for compliance reporting

AI Observability Dashboard

Unified metrics visibility across all AI providers

#9117 Phase 1

Create a unified dashboard exposing AI metrics including provider performance, cost tracking, quality scores, and integration with existing Trust Architecture confidence metrics.

Aggregate ModelManager metrics
#9130
Acceptance Criteria (BDD)

Given the AI Models Service is running

When I GET /api/ai/metrics

Then I receive per-provider statistics: request_count, success_rate, avg_latency_ms, total_cost_usd, error_count

And metrics are available for all 6 providers: openai, azure, bedrock, vertex, ollama, anthropic

Integrate Trust Architecture confidence metrics
#9131
Acceptance Criteria (BDD)

Given the Trust Architecture service is processing requests

When I view the AI Observability Dashboard

Then I see TRAQ confidence scores aggregated by provider

And I see rejection rate due to low confidence thresholds

And I see citation accuracy metrics per intent

Create Grafana AI dashboard
#9132
Acceptance Criteria (BDD)

Given Grafana is deployed with the RegRiskIQ observability stack

When I navigate to the "AI Provider Performance" dashboard

Then I see panels for: latency (p50, p95, p99), cost per provider, success rate, requests per minute

And I can filter by time range, provider, and intent

Provider Equivalence Testing

Validate that switching providers maintains output quality

#9118 Phase 2

Build an evaluation harness with golden datasets to prove that switching from Provider A to Provider B does not degrade output quality beyond acceptable thresholds.

Curate golden evaluation datasets
#9133
Acceptance Criteria (BDD)

Given the evaluation_datasets table in the database

When I query for datasets by intent

Then I find at least 50 queries for each TaskIntent

And each query has a human-validated expected output

And queries are representative of production traffic patterns

Implement batch evaluation runner
#9134
Acceptance Criteria (BDD)

Given a golden dataset for "regulatory_analysis"

When I run: python -m evaluation.runner --intent regulatory_analysis --providers openai,bedrock

Then all queries are executed against both providers

And outputs are stored with provider, latency, cost, and raw response

And a comparison report is generated

Calculate quality metrics
#9135
Acceptance Criteria (BDD)

Given evaluation results from multiple providers

When quality metrics are calculated

Then semantic similarity score is computed using embedding cosine similarity

And citation accuracy is measured against expected sources

And factual consistency is evaluated using NLI models
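
As an illustration of the first metric, semantic similarity can be computed as the cosine similarity between embeddings of the candidate and expected outputs. The embed callable below is a placeholder; citation accuracy and NLI-based consistency checks are not shown.

```python
# Cosine-similarity sketch for the semantic-similarity metric; the embedder is a placeholder.
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_similarity(candidate: str, expected: str, embed) -> float:
    """`embed` is any callable returning an embedding vector (provider-agnostic on purpose)."""
    return cosine_similarity(embed(candidate), embed(expected))
```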

Generate provider comparison reports
#9136
Acceptance Criteria (BDD)

Given completed evaluation runs for Provider A and Provider B

When I generate a comparison report

Then the report shows per-intent quality scores for both providers

And a go/no-go recommendation is provided based on a >90% equivalence threshold

And specific failing cases are highlighted for review

End-to-End AI Tracing

OpenTelemetry-based trace propagation across all AI services

#9119 Phase 2

Implement distributed tracing that correlates requests from the UI through the API Gateway, ModelManager, and Provider Adapters, enabling full visibility into AI request lifecycles.

Add trace_id to ModelManager
#9137
Acceptance Criteria (BDD)

Given an AI request with an incoming trace context header

When the ModelManager processes the request

Then a child span is created under the parent trace

And the span includes attributes: ai.intent, ai.provider, ai.model, ai.tokens.input, ai.tokens.output

Correlate traces with audit logs
#9138
Acceptance Criteria (BDD)

Given a trace_id from an OpenTelemetry span

When I query the rag_query_audit table

Then I can find the corresponding audit entry by trace_id

And the audit entry includes the full request/response context

Add tracing to provider adapters
#9139
Acceptance Criteria (BDD)

Given a request routed to the Bedrock adapter

When the adapter calls the AWS Bedrock API

Then a child span is created with: provider.name, provider.region, provider.model, provider.latency_ms, provider.cost_usd

And errors are recorded as span events with full exception details

Tenant-Aware Gateway

Multi-tenant support with per-tenant rate limits and preferences

#9120 Phase 3

Enable multi-tenant operation of the Model Gateway with tenant isolation, per-tenant rate limiting, and tenant-specific provider preferences. Conditional on multi-tenant deployment requirements.

Add tenant_id to request context
#9140
Acceptance Criteria (BDD)

Given an authenticated request with a JWT containing tenant_id claim

When the request reaches the ModelManager

Then tenant_id is extracted and added to TaskContext

And all downstream operations are scoped to that tenant

Implement per-tenant rate limiting
#9141
Acceptance Criteria (BDD)

Given tenant "acme" has a rate limit of 100 requests/minute

When tenant "acme" sends their 101st request in a minute

Then a 429 Too Many Requests response is returned

And the response includes Retry-After header

And other tenants are not affected

Add tenant provider preferences
#9142
Acceptance Criteria (BDD)

Given tenant "acme" has preferred_providers: ["bedrock", "azure"]

When the routing engine selects a provider for tenant "acme"

Then only bedrock and azure are considered

And openai and ollama are excluded from routing decisions

Data Residency Routing

Route requests based on data residency and compliance requirements

#9121 Phase 3

Implement simple, configuration-driven routing rules to ensure data residency compliance. Providers in disallowed regions are automatically excluded from routing decisions.

Add allowed_regions to tenant config
#9143
Acceptance Criteria (BDD)

Given a tenant configuration schema

When I configure tenant "eu_bank" with allowed_regions: ["eu-west-1", "eu-central-1"]

Then the configuration is validated and stored

And the routing engine can query allowed regions for any tenant

Filter providers by region
#9144
Acceptance Criteria (BDD)

Given tenant "eu_bank" with allowed_regions: ["eu-west-1"]

And provider "bedrock" is configured for region "us-east-1"

When the routing engine selects providers

Then "bedrock" is excluded from eligible providers

And only EU-region providers are considered
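
The filtering rule is straightforward; a minimal sketch (with an assumed ProviderConfig shape) looks like this:

```python
# Region-based provider filtering; only the exclusion rule comes from the criteria above.
from dataclasses import dataclass


@dataclass
class ProviderConfig:
    name: str
    region: str


def filter_by_residency(providers: list[ProviderConfig], allowed_regions: list[str]) -> list[ProviderConfig]:
    # Providers outside the tenant's allowed regions are excluded before routing;
    # the exclusions and reason should also be written to the audit log (next work item).
    return [p for p in providers if p.region in allowed_regions]


providers = [ProviderConfig("bedrock", "us-east-1"), ProviderConfig("vertex", "eu-west-1")]
assert [p.name for p in filter_by_residency(providers, ["eu-west-1"])] == ["vertex"]
```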

Log data residency decisions
#9145
Acceptance Criteria (BDD)

Given a routing decision that excludes providers due to data residency

When the decision is logged

Then the audit log includes: tenant_id, allowed_regions, excluded_providers, selected_provider, reason

And compliance officers can query all residency-based routing decisions

Ready to Move Forward?

Your path to provider-portable AI compliance starts with a structured engagement.

1. Architecture Review with Technical Teams
2. Pilot Deployment Scope Definition
3. Provider & Policy Configuration
4. Production Deployment