Proposal 028: Service Schema Catalog
Status
Draft / Under Discussion.
Problem
service_type in an offer's payload is currently an opaque string. The daemon
routes it and Arca executes it, but neither the UI nor any workflow author has
a machine-readable description of what a particular service_type expects in
request/input.
The request/input field is intentionally an open JsonValue — no inner
validation by the daemon. This openness is a strength: it lets providers evolve
their contracts without a central registry. But without any opt-in way to
advertise what a type accepts, every integration becomes tribal knowledge.
Consequences today:
- Workflow authors must read Arca source code to know which fields are valid.
- The step-builder UI in workflow_definitions hardcodes a request_input textarea with placeholder="{}": no field hints, no examples.
- Two providers offering text/redaction may accept entirely different shapes, with no way for a consumer to distinguish them before making a request.
Context
plan_schema_path (Proposal V2) solved a related problem one layer up: the
daemon can proxy a module's per-workflow_kind step-field schema so that the
UI renders a structured builder instead of a raw JSON textarea.
But plan_schema_path describes the step envelope (which step fields exist, what
types they have) — not the domain content inside request/input. The two
concerns are complementary:
- step-field schema → how to build a plan in the UI
- service schema catalog → what to put in request/input for each service_type
Idea
Each middleware module that provides offer-services may expose a schema catalog endpoint:
GET /v1/schema-catalog
The response is a JSON array of schema descriptors:
    [
      {
        "service_type": "text/redaction",
        "schema_id": "text-redaction.v1",
        "description": "Redacts personally identifiable information from plain text.",
        "json_schema": {
          "type": "object",
          "properties": {
            "text": { "type": "string" },
            "redact_kinds": { "type": "array", "items": { "type": "string" } }
          },
          "required": ["text"]
        },
        "examples": [
          {
            "label": "Basic redaction",
            "value": { "text": "Call me at +48 123 456 789.", "redact_kinds": ["phone"] }
          }
        ]
      }
    ]
This endpoint is purely descriptive. No code in the daemon or the calling client is expected to branch on the schema — it exists to guide human authors and tool-assisted form builders.
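As a rough illustration of the descriptor shape, the sketch below models one catalog entry in Rust and looks entries up by service_type. The field names follow the JSON example; the Rust types, the choice to keep json_schema and example values as raw JSON text, and the helpers descriptors_for and sample_catalog are assumptions made for this sketch, not part of the proposal.

```rust
/// One entry of the array returned by `GET /v1/schema-catalog`.
/// Field names mirror the JSON example; the Rust types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct SchemaDescriptor {
    pub service_type: String,
    pub schema_id: String,
    pub description: String,
    /// Raw JSON Schema text, kept opaque so that nothing branches on it.
    pub json_schema: String,
    pub examples: Vec<SchemaExample>,
}

/// A labelled starter value for `request/input`.
#[derive(Debug, Clone, PartialEq)]
pub struct SchemaExample {
    pub label: String,
    /// Raw JSON text of the example input.
    pub value: String,
}

/// Returns every descriptor in `catalog` advertised for `service_type`.
pub fn descriptors_for<'a>(
    catalog: &'a [SchemaDescriptor],
    service_type: &str,
) -> Vec<&'a SchemaDescriptor> {
    catalog
        .iter()
        .filter(|d| d.service_type == service_type)
        .collect()
}

/// Hypothetical helper: the `text/redaction` descriptor from the example
/// response, with the JSON Schema abbreviated.
pub fn sample_catalog() -> Vec<SchemaDescriptor> {
    vec![SchemaDescriptor {
        service_type: "text/redaction".to_string(),
        schema_id: "text-redaction.v1".to_string(),
        description: "Redacts personally identifiable information from plain text."
            .to_string(),
        json_schema: r#"{"type":"object","required":["text"]}"#.to_string(),
        examples: vec![SchemaExample {
            label: "Basic redaction".to_string(),
            value: r#"{"text":"Call me at +48 123 456 789.","redact_kinds":["phone"]}"#
                .to_string(),
        }],
    }]
}
```

Keeping json_schema as opaque text is one way to make the "purely descriptive" rule hard to violate by accident: the lookup only ever compares service_type.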
Registration
The daemon already maintains several proxy registries
(WorkflowKindRegistry, HostCapabilityRegistry, LocalRouteRegistry,
ModuleDispatchRegistry). Each is a specialised BTreeMap rebuilt from the
module report on startup. Adding a per-kind dedicated registry for every new
catalog would mean copy-pasting the same rebuild() / proxy pattern each time.
Instead, the module report gains one generic array:

    "catalog_endpoints": [
      { "catalog_kind": "schema", "path": "/v1/schema-catalog" }
    ]
The daemon builds a single CatalogEndpointRegistry
(BTreeMap<catalog_kind, Vec<CatalogEndpointEntry>>), rebuilt from all active
module reports, and exposes one aggregated proxy per kind:
GET /v1/catalog/{catalog_kind}?…
For this proposal, the concrete endpoint becomes:
GET /v1/catalog/schema?service_type=text/redaction
The aggregated endpoint makes it trivial for the Node UI (or an external tool)
to fetch the descriptor for whatever service_type appears in an offer without
knowing which module provides it.
Adding a new catalog kind in the future (e.g. template from Proposal 029)
requires only a new entry in catalog_endpoints — no new registry struct, no
new daemon handler.
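A minimal sketch of the registry described in this section, assuming a trimmed-down module report. CatalogEndpointDecl and CatalogEndpointRegistry are named in the proposal; ModuleReportSketch, its module_id, base_url and active fields, and the fields of CatalogEndpointEntry are hypothetical and only illustrate the rebuild-from-reports idea.

```rust
use std::collections::BTreeMap;

/// One element of the `catalog_endpoints` array in a module report.
#[derive(Debug, Clone, PartialEq)]
pub struct CatalogEndpointDecl {
    pub catalog_kind: String,
    pub path: String,
}

/// Hypothetical, trimmed-down module report: only what this sketch needs.
#[derive(Debug, Clone)]
pub struct ModuleReportSketch {
    pub module_id: String, // assumption
    pub base_url: String,  // assumption
    pub active: bool,      // assumption
    pub catalog_endpoints: Vec<CatalogEndpointDecl>,
}

/// Where the daemon could fetch one module's catalog of a given kind.
#[derive(Debug, Clone, PartialEq)]
pub struct CatalogEndpointEntry {
    pub module_id: String,
    pub url: String,
}

#[derive(Debug, Default)]
pub struct CatalogEndpointRegistry {
    by_kind: BTreeMap<String, Vec<CatalogEndpointEntry>>,
}

impl CatalogEndpointRegistry {
    /// Replaces the registry contents from the current module reports,
    /// skipping modules that are not active.
    pub fn rebuild(&mut self, reports: &[ModuleReportSketch]) {
        self.by_kind.clear();
        for report in reports.iter().filter(|r| r.active) {
            for decl in &report.catalog_endpoints {
                let url = format!(
                    "{}{}",
                    report.base_url.trim_end_matches('/'),
                    decl.path
                );
                self.by_kind
                    .entry(decl.catalog_kind.clone())
                    .or_default()
                    .push(CatalogEndpointEntry {
                        module_id: report.module_id.clone(),
                        url,
                    });
            }
        }
    }

    /// Endpoints to fan out to for `GET /v1/catalog/{catalog_kind}`.
    /// Unknown kinds yield an empty slice.
    pub fn endpoints(&self, catalog_kind: &str) -> &[CatalogEndpointEntry] {
        self.by_kind
            .get(catalog_kind)
            .map(Vec::as_slice)
            .unwrap_or_default()
    }
}
```

Because the map is keyed by a plain string, a report that declares a template endpoint lands in the same structure with no extra code, which is the property this section relies on.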
Offer-level Schema Reference
Optionally, individual offers may carry a schema_ref field pointing directly
to a descriptor — either a URL to the module's schema catalog or an entry
within a published community collection:
    {
      "service_type": "text/redaction",
      "schema_ref": "http://127.0.0.1:7702/v1/schema-catalog#text-redaction.v1"
    }
This is a hint, not an authority. Consumers may ignore it. It enables point-to-point schema exchange without a central registry while still being compatible with one.
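A consumer that chooses to follow the hint might split schema_ref into a catalog URL and a descriptor id. The sketch below assumes, based only on the example above, that the URL fragment carries the schema_id; the proposal does not fix that convention, and split_schema_ref is a hypothetical helper name.

```rust
/// Splits a `schema_ref` into the catalog URL and an optional fragment.
/// Assumption: the fragment, when present, names a `schema_id`.
pub fn split_schema_ref(schema_ref: &str) -> (&str, Option<&str>) {
    match schema_ref.split_once('#') {
        // A non-empty fragment is treated as the descriptor id.
        Some((url, fragment)) if !fragment.is_empty() => (url, Some(fragment)),
        // A trailing `#` with nothing after it carries no id.
        Some((url, _)) => (url, None),
        // No fragment: the reference points at the whole catalog.
        None => (schema_ref, None),
    }
}
```

A caller would fetch the URL part, then pick the entry whose schema_id equals the fragment, and fall back to ignoring the hint if nothing matches.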
Stratification
The schema catalog idea fits naturally into a four-layer stack:
| Layer | Name | What it contains |
|---|---|---|
| 0 | Open contract | request/input: JsonValue — any shape accepted |
| 1 | Named schemas | per-provider GET /v1/schema-catalog — opt-in description |
| 2 | Domain ontologies | community-curated service_type namespaces (e.g. text/*, image/*, audio/*) with agreed base schemas |
| 3 | Workflow templates | pre-built plan definitions referencing known service_type values with example inputs |
A provider can stay at Layer 0 indefinitely with no breaking change. A provider that wants to participate in a shared ontology moves to Layer 1 voluntarily. Layers 2 and 3 emerge from convention among participants, not from daemon-enforced constraints.
This mirrors how the web grew: open TCP → HTTP → agreed media types → REST conventions. The daemon stays thin; the market builds the ontology.
Relationship to Existing plan_schema_path
plan_schema_path (V2/V3) is orthogonal:
- plan_schema_path → UI knows how to build a workflow plan step-by-step.
- catalog_endpoints[schema] → UI (and humans) know what to put inside each step's request/input.
In practice both endpoints could be served by the same module process.
plan_schema_path remains a dedicated field in workflow_kind_handlers because
it is tightly scoped to WorkflowKindRegistry; the schema catalog is a
module-level, service_type-keyed resource and fits the generic
catalog_endpoints slot instead.
The Node UI step-builder could load GET /v1/catalog/schema?service_type=… for
the selected service type and render inline field hints inside the
request_input textarea or, eventually, replace it with a structured sub-form.
UI Integration (Future)
Once the daemon exposes GET /v1/catalog/schema?service_type=…, the
workflow-definition form could:
1. Fetch the descriptor when a user types a service_type value.
2. Show description as a tooltip or inline help text.
3. Render examples as a dropdown of starter templates for the request_input field.
4. Eventually: replace the raw JSON textarea with a generated sub-form driven by json_schema.properties.
Steps 1–3 are low-cost and immediately useful; step 4 can follow the same HTMX server-side pattern as the step-builder.
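For step 1, the form needs to turn whatever the user typed into a request path. The sketch below is one way to do that with the standard library only; encode_query_value and schema_catalog_path are hypothetical helper names, and leaving `/` unencoded is a choice made here so that the result matches the GET /v1/catalog/schema?service_type=text/redaction form used earlier in this proposal.

```rust
/// Percent-encodes a query value, leaving RFC 3986 unreserved characters
/// and `/` untouched. Every other byte becomes `%XX`.
pub fn encode_query_value(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z'
            | b'a'..=b'z'
            | b'0'..=b'9'
            | b'-'
            | b'.'
            | b'_'
            | b'~'
            | b'/' => out.push(byte as char),
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

/// Builds the aggregated schema-catalog path for one `service_type`.
pub fn schema_catalog_path(service_type: &str) -> String {
    format!(
        "/v1/catalog/schema?service_type={}",
        encode_query_value(service_type)
    )
}
```

Encoding matters because service_type is an open string: a value containing `&` or a space would otherwise change the meaning of the query.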
Risks and Non-Goals
Schema should be purely descriptive. The daemon must not branch on schema content. If a request does not match a schema, that is a business-level concern handled inside the provider module — not a transport-level rejection by the daemon. Introducing validation at the daemon layer would couple the daemon to domain knowledge and break the open-contract principle.
No mandatory registry. Providers that do not declare a schema entry in
catalog_endpoints continue to work exactly as today. The feature is entirely
opt-in.
No schema versioning protocol (yet). schema_id carries a version suffix
by convention (e.g. text-redaction.v1), but there is no negotiation mechanism.
Consumers should treat the schema as advisory documentation, not as a
compatibility guarantee.
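Since the version suffix is only a convention, a tool that wants to display or group schemas by version has to parse it leniently. A minimal sketch, assuming the suffix always has the form `.v` followed by decimal digits as in text-redaction.v1; split_schema_id is a hypothetical helper, and its result should be used for display, not for compatibility decisions.

```rust
/// Splits `text-redaction.v1` into `("text-redaction", 1)`.
/// Returns `None` when the id has no `.v<digits>` suffix, in which case
/// the caller should treat the whole id as an unversioned name.
pub fn split_schema_id(schema_id: &str) -> Option<(&str, u32)> {
    let (name, suffix) = schema_id.rsplit_once(".v")?;
    if name.is_empty() || suffix.is_empty() {
        return None;
    }
    // Accept plain decimal digits only (no sign, no whitespace).
    if !suffix.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let version = suffix.parse::<u32>().ok()?;
    Some((name, version))
}
```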
Community ontologies are out of scope for the daemon. The daemon proxies
schemas; it does not curate them. A separate orbidocs or community project is
the right home for agreed service_type namespaces.
Consequences
Positive:
- Workflow authors gain machine-readable documentation for request/input shapes without changing the open-contract daemon architecture.
- Node UI can progressively enhance the step-builder without a big rewrite.
- Market participants can build a shared ontology through convention, not through central enforcement.
- The catalog_endpoints registration pattern reuses the existing rebuild()-from-module-report idiom already present in all daemon registries: low implementation cost, zero new structural patterns.
Trade-offs:
- One more optional endpoint per provider module to maintain.
- Schema drift (provider changes shape without updating schema) is invisible to the daemon — consumers must handle unexpected shapes defensively regardless.
Minimal Viable Step
- Add catalog_endpoints: Vec<CatalogEndpointDecl> to MiddlewareModuleReport, where CatalogEndpointDecl is { catalog_kind: String, path: String }.
- Daemon builds CatalogEndpointRegistry from active module reports on startup and exposes GET /v1/catalog/{catalog_kind} as a generic proxy.
- Arca serves GET /v1/schema-catalog listing its known service_type values and declares { "catalog_kind": "schema", "path": "/v1/schema-catalog" } in its module report.
- No UI change required for MVP.
Proposal 029 reuses the same catalog_endpoints slot with
{ "catalog_kind": "template", "path": "/v1/template-catalog" } — the daemon
machinery is identical.