Version: v0.25.0 (Latest)

Deployment Topology

The EDK enterprise images are split because the network surface, the access pattern, and the operational lifecycle of each role are genuinely different. The platform container is the central control service. Tenant-KMS is the internal cryptographic authority. DID, tenant-AS, issuer, and verifier are tenant runtime services. The admin console is a UI container served on the platform host.

Public Versus Internal Surface

Most containers have both a public and an internal surface, but the proportions differ widely, and the KMS has no public surface at all.

The platform exposes the operator authorization-server protocol surface and the /admin-console UI on platform.<base-domain>. Setup and platform-admin APIs are available only under the platform host and must be protected after first run. The platform service is also the central configuration/control service that other services call internally.

The KMS exposes no public surface at all. The provider admin REST and KMS command receiver are internal-only. Issuer, verifier, tenant-AS, and DID call into it across the cluster network. Operators reach KMS administration through platform administration and internal service policies, not by publishing KMS to the internet. This is a deliberate constraint: nothing about KMS belongs on the public internet.

The DID container has a small public surface (the Universal Resolver path) and a larger internal surface (did:web publishing, did:webvh log management, the manager admin REST, method enable/disable per tenant). Public DID resolution is its primary job, so the deployment template binds the resolver path to the public ingress and keeps everything else internal.

The tenant-AS, issuer, and verifier each have a wallet-facing public protocol surface and an internal admin surface for tenant administrators. The protocol surface is what wallets and external clients call (/oid4vci/..., /oid4vp/..., /authorize, /token, /.well-known/...). The admin surface lives under /api/v1/... and is meant for tenant operators and the platform admin only. The split is enforced at the ingress: the public ingress lets only the protocol paths and .well-known URLs through; the internal ingress carries /api/v1/... and requires a bearer JWT with the right scopes.
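The path split described above can be sketched as a small classifier. This is an illustration only, not EDK or ingress-controller code: the function names and the "deny everything else" default are assumptions, and JWT scope checks are reduced to a boolean.

```python
# Hypothetical sketch of the public/internal path split; not EDK code.
PUBLIC_PREFIXES = ("/oid4vci/", "/oid4vp/", "/.well-known/")
PUBLIC_EXACT = ("/authorize", "/token")
ADMIN_PREFIX = "/api/v1/"


def classify_path(path: str) -> str:
    """Return 'public', 'internal', or 'deny' for an already-normalised path."""
    if path.startswith(ADMIN_PREFIX):
        return "internal"
    if path in PUBLIC_EXACT or path.startswith(PUBLIC_PREFIXES):
        return "public"
    # Assumption: anything not explicitly listed is rejected by both ingresses.
    return "deny"


def allowed(ingress: str, path: str, has_bearer_jwt: bool = False) -> bool:
    """Decide whether the given ingress ('public' or 'internal') forwards the path."""
    kind = classify_path(path)
    if ingress == "public":
        return kind == "public"
    if ingress == "internal":
        # Scope validation on the JWT is omitted; only its presence is modelled.
        return kind == "internal" and has_bearer_jwt
    return False
```

A real gateway would normalise the path (dot segments, percent-encoding) before matching; the sketch assumes that has already happened.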

Service-to-Service Auth

Once a request lands on the issuer, the verifier, or the AS, those containers may call the platform for tenant/platform configuration, the KMS for signing, the DID resolver for verifying counterparty signatures, or each other for cross-protocol operations. Those calls cross trust boundaries inside the cluster, so they must be authenticated.

The Helm chart exposes service-to-service auth in one of two ways:

Service JWT. Each runtime container holds a service identity backed by a signing key on the KMS. Outbound peer calls carry a short-lived JWT with the service principal's claims, and the receiving container validates it against the KMS-published JWKS. This is the default for the published Helm chart.
mTLS. When the cluster already runs a mesh (Istio, Linkerd, Cilium) or has its own internal CA, the cluster's mTLS suffices and the service JWT becomes optional. The Helm chart has a switch for this.

Either way, the principle is the same: the KMS does not trust an unauthenticated caller, and the issuer/verifier/AS do not trust each other on the strength of being on the same cluster network.
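To make the service-JWT flow concrete, here is a minimal sketch of minting and validating a short-lived token. It is not the EDK implementation: the chart describes a KMS-backed signing key validated against a published JWKS, which implies an asymmetric algorithm, whereas this sketch substitutes HS256 with a shared secret so that it runs on the Python standard library alone. The claim set, the 60-second lifetime, and the function names are assumptions.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()


def mint_service_jwt(secret: bytes, service: str, audience: str,
                     now: float, ttl_seconds: int = 60) -> str:
    """Mint a short-lived token for an outbound peer call (HS256 stand-in)."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iss": service, "sub": service, "aud": audience,
              "iat": int(now), "exp": int(now) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims))
    return signing_input + "." + _b64url(_sign(secret, signing_input))


def verify_service_jwt(token: str, secret: bytes, audience: str, now: float) -> dict:
    """Return the claims if the token is authentic, addressed to us, and unexpired."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")
    # Pin the algorithm instead of trusting whatever the header claims.
    if header.get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    expected = _sign(secret, header_b64 + "." + claims_b64)
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("bad signature")
    if claims.get("aud") != audience:
        raise PermissionError("wrong audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise PermissionError("token expired")
    return claims
```

The receiver rejects a token that was minted for a different audience, so a token captured on a call to one peer cannot be replayed against another.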

Peer Transport

The enterprise images package the generated gRPC client stubs and routed -remote command implementations. Platform and tenant-KMS run the inbound gRPC receivers. The runtime services route platform-config commands to the platform service and KMS commands to tenant-KMS over the internal network.

grpc:
  enabled: true
  port: 9090
  authMode: service-jwt

The public gateway never routes gRPC. NetworkPolicy or mesh policy should allow gRPC only on the east-west paths that need it: runtime services to platform, runtime services to tenant-KMS, and tenant-KMS to platform.
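The east-west rule above amounts to a small allow-list of (source, destination) pairs. The sketch below models it in Python to show which edges a NetworkPolicy or mesh policy would need to permit. The service labels are hypothetical names chosen for illustration, and port 9090 is taken from the configuration shown earlier.

```python
# Hypothetical service labels; the chart's real selectors may differ.
RUNTIME_SERVICES = ("did", "tenant-as", "issuer", "verifier")
GRPC_PORT = 9090

# Edges named in the text: runtime -> platform, runtime -> tenant-KMS,
# and tenant-KMS -> platform. Everything else is denied.
ALLOWED_GRPC_EDGES = (
    {(svc, "platform") for svc in RUNTIME_SERVICES}
    | {(svc, "tenant-kms") for svc in RUNTIME_SERVICES}
    | {("tenant-kms", "platform")}
)


def grpc_allowed(source: str, destination: str, port: int = GRPC_PORT) -> bool:
    """Return True only for gRPC traffic on an explicitly allowed edge."""
    return port == GRPC_PORT and (source, destination) in ALLOWED_GRPC_EDGES
```

Because the public gateway never appears as a source, no gRPC path from outside the cluster is permitted under this model.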

Data Plane Databases

The enterprise deployment uses two PostgreSQL database roles by default:

Platform database. The platform service owns this database. It stores first-run setup state, license activation state, the application tenant, tenant registry, routing records, platform configuration, and the platform-only authorization server state.
Tenant workload database. Tenant-KMS, DID, tenant-AS, issuer, and verifier read and write tenant runtime data here. Isolation is schema-per-tenant: each tenant gets its own schema, for example tenant_acme and tenant_globex, and the database router sets the tenant schema on each request.

The platform service may connect to the tenant workload database for tenant onboarding and schema provisioning, but its own platform tables are not co-located with tenant runtime tables. Runtime services do not connect to the platform database.

When a customer requires stronger isolation than schema-per-tenant, the same database routing layer can route selected tenants to a dedicated database or PostgreSQL instance. Connection pooling is per-target via HikariCP. The container code is unchanged; only the routing configuration changes.
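The routing behaviour can be sketched as follows. This is an illustration under stated assumptions, not the EDK routing layer: the class and field names are hypothetical, the tenant_<id> naming follows the tenant_acme example above, and setting the schema through search_path is one common PostgreSQL technique that the source does not confirm.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

# Assumption: tenant ids are restricted to a safe identifier alphabet.
_TENANT_ID = re.compile(r"[a-z][a-z0-9_]{0,40}")


@dataclass(frozen=True)
class RouteTarget:
    database_url: str  # identifies the pool; one pool per distinct target
    schema: str


class TenantRouter:
    def __init__(self, shared_url: str,
                 dedicated: Optional[Dict[str, str]] = None) -> None:
        self._shared_url = shared_url
        # Tenants listed here are routed to their own database or instance.
        self._dedicated = dict(dedicated or {})

    def resolve(self, tenant_id: str) -> RouteTarget:
        """Map a tenant to its database target and schema."""
        if not _TENANT_ID.fullmatch(tenant_id):
            raise ValueError(f"invalid tenant id: {tenant_id!r}")
        url = self._dedicated.get(tenant_id, self._shared_url)
        return RouteTarget(database_url=url, schema=f"tenant_{tenant_id}")

    def search_path_sql(self, tenant_id: str) -> str:
        """Statement a router could run on a pooled connection per request."""
        # Safe to interpolate: the id was validated against _TENANT_ID.
        return f'SET search_path TO "{self.resolve(tenant_id).schema}"'
```

Moving a tenant to stronger isolation then means adding one entry to the dedicated map; callers of resolve are unchanged.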

Per-tenant configuration that runtime services read through the TenantConfigPropertySource lives in the tenant workload database under the resolved tenant schema. Tenant admins write configuration through the admin REST; runtime services pick it up on the next resolver cache miss and through Postgres LISTEN/NOTIFY invalidation.

Cross-Replica Cache Invalidation

Each runtime container caches the tenant routing table, the per-tenant config, and the public-endpoint bindings in process. With multiple replicas behind a load balancer, a mutation on replica A must be visible on replica B without a restart, or behaviour diverges between replicas.

The EDK solves this through the shared event subsystem: admin commands emit domain events on mutation, a Postgres LISTEN/NOTIFY bridge fans them out to subscribers in every replica, and the local caches invalidate. A TTL fallback covers the case where a notification is missed. The mechanism is the same for tenant routing, public-endpoint bindings, and tenant_config_property updates.
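The combination of event-driven invalidation and a TTL fallback can be sketched with a small in-process cache. This is a simplified model, not the EDK event subsystem: the class name, the loader callback, and the on_event hook (which stands in for a handler wired to the LISTEN/NOTIFY bridge) are assumptions.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class InvalidatingCache:
    """In-process cache with explicit invalidation and a TTL safety net."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # reads the authoritative value, e.g. from the database
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        # Miss or expired: reload and restart the TTL window.
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def on_event(self, key: Hashable) -> None:
        """Called when a mutation event for this key arrives from another replica."""
        self._entries.pop(key, None)
```

If a notification is lost, the stale entry survives only until its TTL expires, which bounds how long replicas can disagree.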

Tenant-Aware Public Endpoint Bindings

A tenant typically owns at least one host at which wallets reach it: in the hosted EDK model that is <tenant>.<base-domain>, for example acme.example.com; in a custom-domain model it might be wallet.acme.com. The tenant_public_endpoint row binds a tenant and service type (OID4VCI issuer, OID4VP verifier, OAuth AS, DID resolver) to the host and optional path layout (.well-known path layout, pathPrefix) under which that service is reachable for that tenant.

The metadata that runtime services advertise (the credential_issuer value in OID4VCI metadata, the issuer in OAuth AS metadata, the request_uri_base in an OID4VP authorization request, the status_uri in a credential offer) comes from the tenant public-endpoint binding rather than from the bare request host. This matters in production: when the tenant is reached via a CDN, a reverse proxy, or a custom domain that does not match the cluster's hostname, the metadata still advertises URLs the wallet can actually reach.

The Helm chart's default behaviour is fail-closed: if no tenant_public_endpoint binding exists for the resolved tenant and service type, the runtime service refuses to advertise anything rather than falling back to the request host. The fallback is configurable per environment.
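The fail-closed lookup can be illustrated with a short sketch. The data shapes here are hypothetical: the PublicEndpoint type, the service-type strings, and the allow_request_host_fallback flag are stand-ins for whatever the tenant_public_endpoint row and the chart setting actually look like.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PublicEndpoint:
    host: str
    path_prefix: str = ""


class NoBindingError(LookupError):
    """Raised when no binding exists and fallback is disabled."""


def advertised_base_url(bindings: Dict[Tuple[str, str], PublicEndpoint],
                        tenant: str, service_type: str, request_host: str,
                        allow_request_host_fallback: bool = False) -> str:
    """Base URL to put into advertised metadata for this tenant and service."""
    binding = bindings.get((tenant, service_type))
    if binding is None:
        if not allow_request_host_fallback:
            # Fail closed: advertise nothing rather than an unreachable URL.
            raise NoBindingError(f"no binding for {tenant}/{service_type}")
        binding = PublicEndpoint(host=request_host)
    return f"https://{binding.host}{binding.path_prefix}"
```

With a binding in place, the advertised URL ignores the request host entirely, which is what keeps metadata correct behind a CDN or reverse proxy.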

Putting It Together

A canonical small-to-medium deployment runs:

One platform replica for setup, license activation, operator AS, platform admin, and platform config.
One replica of each tenant runtime container (tenant-KMS, DID, tenant-AS, issuer, verifier) sized to the workload, on Kubernetes through the edk-enterprise Helm chart.
One admin-console replica routed at platform.<base-domain>/admin-console.
A platform PostgreSQL database for platform state, plus a tenant workload PostgreSQL database using one schema per tenant. Use separate managed instances when operational isolation requires it.
A public gateway with TLS termination, routing platform.<base-domain> to the platform and admin-console paths, and <tenant>.<base-domain> to DID, tenant-AS, issuer, and verifier by path. Use a wildcard certificate for *.<base-domain> plus the operator host, or automate individual certificates for every tenant host and the operator host.
An internal ingress (or a service-mesh policy) through which only the admin paths are reachable, with bearer-JWT auth enforced at the gateway or in-process. The KMS admin REST sits behind this same internal ingress.
A monitoring stack subscribed to /metrics on each container and to the OpenTelemetry collector wired through the EDK telemetry module.

The chart defaults to Nexus image coordinates under nexus.sphereon.com/edk-docker, with the delivered enterprise tag supplied through global.imageTag and registry credentials supplied through global.imagePullSecrets. The default KMS service has no public ingress.

For a high-traffic deployment, the issuer, verifier, AS, and DID containers scale horizontally. The KMS scales as well, but more conservatively: most deployments find that the bottleneck is the provider backend (AWS KMS rate limits, HSM throughput) rather than the container itself.
