Proposal 064: Inquirium Implementation Recommendations

Based on:

  • doc/project/40-proposals/063-inquirium-model-inquiry-organ.md
  • node/DEV-GUIDELINES.md

Status

Draft

Date

2026-05-20

Executive Summary

This proposal refines Proposal 063 into implementation recommendations for Inquirium. The goal is to keep Inquirium as a small, host-owned policy boundary for model-backed inquiry while allowing many replaceable runtime adapters, protocol gates, evaluators, tool bridges, agent bridges, and model adaptation artifacts to evolve around it.

The core rule is:

Inquirium Core owns semantic operations and policy.
Adapters translate execution details.
model-runtime owns lifecycle, health, supervision, and transport mapping.
Provider workers execute inference but hold no Orbiplex authority.

Adapters may be middleware-hosted, but middleware hosting is an implementation axis, not an authority grant. A middleware-hosted adapter remains an Inquirium runtime adapter with a narrow role.

Model output remains candidate evidence, not authority. Every rejection, failure, degraded result, fallback, continuation, retry, plan, tool call, artifact, and adaptation result must be represented as a typed outcome.

Design Commitments

These recommendations are filtered through the Node development guidelines:

  • keep the trusted core small;
  • keep data contracts explicit;
  • authorize before effects;
  • fail closed when authority or configuration is missing;
  • avoid ambient filesystem, network, tool, model, or agent authority;
  • prefer local-first operation;
  • preserve exportable traces without leaking protected inputs;
  • keep provider-specific mechanics out of workflow-facing contracts.

Primary Rule: Adapters Translate Execution, Not Semantics

Inquirium Core is a host-owned policy boundary. An adapter is an execution translator, not the owner of operation semantics. Operations such as generate, classify, embed, rerank, transform, train.adapt, and related verbs belong to Inquirium Core and its schemas. Provider-specific payload mapping belongs inside adapters.

Runtime lifecycle, health, supervision, process management, and transport mapping remain in model-runtime. Adapters do not receive ambient authority. Model output remains candidate evidence. Every denial, failure, degradation, fallback, continuation, and retry is a typed outcome. The preferred adapter is replaceable, auditable, and boring.

Adapter Vocabulary

A runtime adapter translates request/result data between Inquirium, model-runtime, and one concrete model execution surface.

A protocol gate filters and shapes model communication according to a protocol profile, role, and purpose.

A tool bridge adapter maps external tools and resources to Orbiplex refs and capability refs.

An agent bridge adapter maps an external agent or agent protocol into candidate evidence and typed outcomes.

A model adaptation artifact is a LoRA/QLoRA adapter, prompt program, checkpoint, or other result of adaptation. It is not a runtime adapter.

A provider worker is a process, local server, remote endpoint, or API that executes inference. It has no Orbiplex authority.

A middleware-hosted runtime adapter is a runtime adapter implemented through the Orbiplex middleware hosting fabric. Its executor may be command_stdio, local_http_json, supervised http_local_json, an in-process handler, or a later compatible executor. This says how the adapter runs; it does not widen what the adapter may do.

Adapter Hosting and Authority Axes

Keep adapter hosting separate from adapter authority. A runtime adapter may be delivered as an operator-installed package, a bundled module, an in-process component, a one-shot command, an unmanaged local HTTP endpoint, or a supervised local HTTP service. Those shapes answer "how does it run?".

The semantic role answers a different question: "what may this component do?". An Inquirium runtime adapter may translate requests, invoke a selected provider worker, map provider responses into Inquirium outcomes, report health, expose conformance metadata, and use explicitly granted lease/artifact/status capabilities. It must not claim arbitrary middleware hooks, own workflow dispatch, choose model policy outside the host decision, mutate Orbiplex state, or turn provider-specific behavior into public Inquirium semantics.

Practical classification:

| Axis | Field or decision | Meaning |
| --- | --- | --- |
| Hosting | hosting/kind and optional middleware executor config | Process, transport, lifecycle, and packaging shape. |
| Adapter role | adapter/kind, operations[], protocol family | Which Inquirium execution surface is translated. |
| Authority | effects/allowed, leases, egress, sandbox, conformance | What side effects and data paths are allowed for this adapter instance. |

Use the least powerful executor that can express the behavior. A pure mapping can be declarative. A one-shot local command can use command_stdio. A long-lived local model server usually belongs behind local_http_json if supervised elsewhere, or supervised http_local_json when the Node host should own readiness, restart, shutdown, and module reporting.

Adapter Implementation, Instance, and Runtime Candidate

Do not use "adapter" for every layer. The implementation should distinguish:

| Layer | Meaning | Typical cardinality |
| --- | --- | --- |
| adapter/ref | Reusable code/package for one protocol family or execution interface. | One implementation can support many instances. |
| adapter.instance/ref | Configured and optionally supervised use of that implementation: endpoints, credentials, pools, queues, lifecycle, health, rate limits. | One instance can expose many runtime candidates. |
| runtime/ref | Host-visible routable execution candidate: adapter instance plus model binding, operation support, policies, health, and conformance. | One runtime candidate should point to one model binding. |
| runtime.instance/ref | Materialized live process, loaded model, session, or worker created by host/model-runtime supervision. | Zero or more live instances may back one runtime candidate. |
| model.binding/ref | Provider-facing model handle plus model/ref, digest/hash where available, default parameters, and constraints. | One binding may be reused by compatible runtimes, but policy may still split them. |
| profile/ref | Caller-facing policy class for model use. | One profile can select among many runtime candidates. |

The rule is: adapter per interface, adapter instance per lifecycle/trust boundary, runtime candidate per model-configuration. Do not create one adapter implementation per model unless protocol, trust boundary, isolation, or response normalization differs. Do not hide two routable models inside one runtime candidate merely because they share one server process or API standard.

Examples:

adapter/openai-compatible-http
  adapter.instance/remote-provider-a
    runtime/remote-provider-a-small-chat
      model.binding/provider-a-small-chat
    runtime/remote-provider-a-large-chat
      model.binding/provider-a-large-chat

adapter/local-http-model-server
  adapter.instance/local-gpu-server
    runtime/local-gpu-server-llm-a
      model.binding/llm-a
    runtime/local-gpu-server-llm-b
      model.binding/llm-b

The adapter instance may maintain shared HTTP clients, connection pools, queues, backpressure state, or a supervised server process. Those are performance and lifecycle mechanisms. The host and model-runtime still select a runtime/ref, not a hidden model inside the adapter.

Spawning follows the same rule. An adapter may implement load, spawn, unload, or pool mechanics, but the decision to materialize a routable runtime belongs to host-owned policy and model-runtime. A successful spawn or load produces an explicit runtime.instance/ref or updated runtime candidate state with health, model identity, resource budget, and conformance visible to Inquirium.
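The layering above can be sketched as plain data records. This is only an illustration, not the proposal's schema: the class names, field names, and the `candidates_for` helper are assumptions made for the example. It shows one adapter instance exposing two runtime candidates, each naming exactly one model binding.

```python
from dataclasses import dataclass

# Hypothetical records mirroring the ref layers; names are illustrative only.
@dataclass(frozen=True)
class AdapterInstance:
    ref: str          # adapter.instance/ref
    adapter_ref: str  # adapter/ref of the reusable implementation

@dataclass(frozen=True)
class RuntimeCandidate:
    ref: str                   # runtime/ref, the unit the host routes to
    adapter_instance_ref: str  # adapter.instance/ref
    model_binding_ref: str     # exactly one model.binding/ref

def candidates_for(instance, candidates):
    """List the runtime/ref values exposed by one adapter instance."""
    return [c.ref for c in candidates if c.adapter_instance_ref == instance.ref]

instance = AdapterInstance("adapter.instance/local-gpu-server",
                           "adapter/local-http-model-server")
registry = [
    RuntimeCandidate("runtime/local-gpu-server-llm-a", instance.ref, "model.binding/llm-a"),
    RuntimeCandidate("runtime/local-gpu-server-llm-b", instance.ref, "model.binding/llm-b"),
]
```

The host selects among the entries of `registry`; the shared server process behind the instance never appears as a routing target.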

Correspondence With Common Agent-Orchestrator Layering

Existing agent orchestrators commonly separate provider configuration, model selection, execution backend, and interaction channel. Inquirium should treat that as a useful implementation pattern, not as its domain model. The mapping is:

| Common orchestrator layer | Inquirium / Orbiplex layer |
| --- | --- |
| Provider configuration, auth, discovery, endpoint defaults | adapter/ref plus adapter.instance/ref |
| Provider-facing model id and per-model defaults | model.binding/ref |
| Execution backend, local service, CLI wrapper, remote API client, worker pool | runtime/ref and runtime.instance/ref |
| Conversation channel, chat ingress, UI, messaging bridge | Host, Flow, Arca, Sensorium, or middleware outside Inquirium |

This correspondence is intentionally not one-to-one. A provider-style adapter instance may expose many runtime candidates. A runtime candidate should still name one model binding and one policy bundle. A channel should not become the place where model selection, data leases, or inference policy are hidden.

When importing ideas from orchestrator designs, keep the invariant:

provider mechanics are adapter concerns;
model identity is a model-binding concern;
routable execution is a runtime-candidate concern;
conversation orchestration is not an Inquirium adapter concern.

The practical benefit is reuse without flattening. One adapter implementation can support multiple configured instances, one instance can reuse clients, process supervision, queues, and caches, and the host can still audit and select each routable model configuration explicitly.

Runtime Candidate Registry as Data

Inquirium should separate the registry of execution candidates from the semantics of operations. A runtime candidate may have a canonical runtime/ref, aliases for configuration migration, readiness metadata, configuration, and an explicit selection priority. It must not define what generate, embed, classify, or train.adapt means.

The selection contract should be data-driven:

  • profile/ref;
  • operation;
  • operation requirements;
  • candidate runtime list;
  • selected runtime;
  • typed rejection.

Aliases are useful for compatibility, but they must not change operation meaning. Automatic priority should be an explicit field such as selection/weight or selection/order, not an artifact of file loading order.
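A minimal sketch of a registry where priority is an explicit field and aliases only map names, assuming hypothetical field names (`selection_order`, `aliases`) and a lower-wins ordering that the proposal does not fix:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    runtime_ref: str
    selection_order: int           # explicit priority; lower wins (assumption)
    operations: frozenset
    aliases: tuple = ()            # names kept only for configuration migration

def canonical_ref(name, candidates):
    """Resolve an alias to its canonical runtime/ref; unknown names yield None."""
    for c in candidates:
        if name == c.runtime_ref or name in c.aliases:
            return c.runtime_ref
    return None

def ordered_candidates(operation, candidates):
    """Order by the explicit field, never by the order records were loaded."""
    matching = [c for c in candidates if operation in c.operations]
    matching.sort(key=lambda c: (c.selection_order, c.runtime_ref))
    return [c.runtime_ref for c in matching]

# Deliberately listed out of priority order.
REGISTRY = [
    Candidate("runtime/b", 20, frozenset({"embed"})),
    Candidate("runtime/a", 10, frozenset({"embed", "generate"}), ("runtime/old-a",)),
]
```

An empty result from `ordered_candidates` should become a typed rejection in the caller, not a fallback to some default runtime.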

Capability Metadata for Model and Runtime Selection

A runtime should declare its capabilities as data, not as hidden branches in an adapter. The minimal capability record should include:

  • supported operation classes;
  • input and output modalities;
  • context limits;
  • input and output byte limits;
  • locality class;
  • egress policy;
  • trace policy;
  • retention profile;
  • resource budgets;
  • cost class;
  • result class.

Model selection remains host policy, but host policy should operate over explicit capability records. The host can then compare operation requirements against runtime capabilities without knowing a concrete provider protocol. This preserves the what and how separation: the caller describes the act of inquiry, and lower layers choose how to execute it.
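One way to compare operation requirements against a capability record without touching any provider protocol is sketched here. The dictionary keys are illustrative assumptions, not schema fields from this proposal; a capability that is not declared is treated as absent, so the comparison fails closed.

```python
def unmet_requirements(requirements, capabilities):
    """Return the names of capability fields that do not satisfy the request."""
    unmet = []
    if requirements["operation"] not in capabilities.get("operations", ()):
        unmet.append("operations")
    # Undeclared limits default to 0, so any positive requirement is unmet.
    if requirements.get("context_tokens", 0) > capabilities.get("context_max_tokens", 0):
        unmet.append("context_max_tokens")
    if requirements.get("input_bytes", 0) > capabilities.get("input_max_bytes", 0):
        unmet.append("input_max_bytes")
    allowed = requirements.get("locality_allowed")
    if allowed is not None and capabilities.get("locality_class") not in allowed:
        unmet.append("locality_class")
    return unmet

LOCAL_SMALL = {
    "operations": ("generate", "embed"),
    "context_max_tokens": 8192,
    "input_max_bytes": 1_000_000,
    "locality_class": "local",
}
```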

supports-seed? declares whether a runtime accepts a seed and whether that seed influences output. Determinism remains best-effort, not a contract. The same prompt, seed, parameters, and model may still yield different outputs because of floating-point batch effects, parallel GPU reduction, or provider-side nondeterminism outside the node's control. Replay equivalence for generative outputs is limited to local runtimes with an explicit deterministic mode. Remote runtime conformance should rely on schema checking, refusal mapping, and metadata accounting, not bit-identical output.

Inquirium Adapter Manifest

Every adapter should have a manifest as data. The manifest is an auditable description of declared adapter capabilities, not a promise hidden in code. The host uses the manifest for selection, configuration validation, conformance test generation, and operator diagnostics.

Minimal shape:

InquiriumAdapterManifestV1 {
  adapter/ref   # implementation/package identity
  adapter/kind:
    local-http
    | remote-http
    | command-stdio
    | openai-compatible
    | open-inference-protocol
    | mcp-bridge
    | a2a-bridge
    | evaluator
    | artifact-transformer

  version
  hosting/kind:
    in-process
    | middleware-hosted
    | external-endpoint
  middleware/executor?:
    in_process
    | command_stdio
    | local_http_json
    | http_local_json
    | json_e_flow
  implementation/ref?

  operations[]:
    generate
    | classify
    | embed
    | summarize
    | rerank
    | transform
    | image.generate
    | image.edit
    | audio.transcribe
    | audio.synthesize
    | train.adapt
    | batch.embed

  modalities/input[]
  modalities/output[]
  request/schema-refs[]
  result/schema-refs[]

  streaming/support
  batch/support
  tool-call/support
  structured-output/support

  capabilities {
    context/max-tokens?
    input/max-bytes?
    output/max-bytes?
    embedding/dimensions?
    image/max-pixels?
    audio/max-duration-ms?
    supports-seed?
    supports-json-schema?
    supports-logprobs?
    supports-cache-prefix?
    returns/plan?
  }

  policy {
    locality/class
    egress/class
    retention/profile
    trace/default-level
    sandbox/profile
    data-lease/required
    tool-policy/default
  }

  budgets {
    timeout/default-ms
    timeout/max-ms
    retry/max-attempts
    concurrency/max
    cost/class?
    rate-limit/ref?
  }

  health {
    probe/kind
    probe/interval-ms
    readiness/criteria
    liveness/criteria
  }

  security {
    requires-secret-ref?
    secret/scope?
    supply-chain/provenance-ref?
    allowed-model-hashes[]?
    disallowed-model-hashes[]?
  }

  conformance {
    fixture-set/ref
    last-report/ref?
    required-tests[]
  }
}

The adapter manifest is implementation-level unless it explicitly states otherwise. Adapter instance configuration and runtime candidate configuration should be separate records owned by host/model-runtime configuration:

InquiriumAdapterInstanceV1 {
  adapter.instance/ref
  adapter/ref
  hosting/kind
  lifecycle/config
  endpoint/config?
  auth/ref?
  egress/policy-ref?
  resource/policy-ref?
  health/config?
  pools/config?
}

InquiriumRuntimeCandidateV1 {
  runtime/ref
  adapter.instance/ref
  model.binding/ref
  profile/refs[]
  operations[]
  defaults/params
  policy/refs
  conformance/ref?
  health/status
}

InquiriumModelBindingV1 {
  model.binding/ref
  model/ref
  provider/model-name
  model/hash?
  defaults/params
  constraints
}

The manifest should be loaded from the host-owned module store or an equivalent controlled space, not from an incidental directory. An adapter without a manifest is not routable.

A protocol gate is not a normal runtime adapter. It is a host-owned policy component driven by ProtocolProfileV1. It may have its own policy/profile manifest, but it should not be treated as an interchangeable provider worker. An evaluator adapter is allowed, but only as a separate class with bootstrap conformance and without authority to make itself routable.

Adapter Lifecycle and Routability

An adapter should not be treated as binary state: present or missing. The host should distinguish installation, configuration, health, conformance, and routability.

Minimal lifecycle:

AdapterState =
  Discovered
  | Installed
  | Configured
  | ConformancePending
  | Healthy
  | Routable
  | Degraded { reason }
  | Disabled { reason }
  | Revoked { reason }
  | Deprecated { replacement/ref? }

Healthy does not imply Routable. An adapter may answer a health probe and still fail egress, retention, sandbox, conformance, secret, model hash, or profile policy.

Route decision:

AdapterRouteDecision =
  Routable { adapter/ref, runtime/ref, profile/ref, constraints }
  | NotRoutable {
      code:
        missing-manifest
        | invalid-manifest
        | conformance-missing
        | conformance-failed
        | runtime-unhealthy
        | policy-denied
        | egress-denied
        | sandbox-denied
        | secret-missing
        | model-hash-denied
        | deprecated
        | revoked
      remediation?
    }

The route decision should be visible through inquirium.runtime.status, without leaking secrets or protected prompts.
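The distinction between Healthy and Routable can be sketched as an ordered list of checks. The `facts` dictionary and its flag names are hypothetical inputs assumed for this example; the rejection codes are taken from the NotRoutable enumeration above, and the sketch collapses conformance-missing and conformance-failed into one check for brevity.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Routable:
    adapter_ref: str
    runtime_ref: str

@dataclass(frozen=True)
class NotRoutable:
    code: str
    remediation: Optional[str] = None

# (flag in the facts dict, rejection code when the flag is not exactly True)
CHECKS = [
    ("manifest_present", "missing-manifest"),
    ("manifest_valid", "invalid-manifest"),
    ("conformance_passed", "conformance-failed"),
    ("healthy", "runtime-unhealthy"),
    ("policy_allowed", "policy-denied"),
    ("egress_allowed", "egress-denied"),
    ("sandbox_allowed", "sandbox-denied"),
    ("secret_available", "secret-missing"),
    ("model_hash_allowed", "model-hash-denied"),
]

def route_decision(adapter_ref, runtime_ref, facts):
    for flag, code in CHECKS:
        if facts.get(flag) is not True:  # absent facts fail closed
            return NotRoutable(code)
    return Routable(adapter_ref, runtime_ref)

ALL_OK = {flag: True for flag, _ in CHECKS}
```

A status endpoint could return the `NotRoutable.code` directly, since the code carries no secret or prompt material.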

Provider Protocol Families

Adapters should be grouped by protocol family, not by provider brand.

openai-compatible means a family of runtimes that implement some subset of common chat, completions, responses, or embeddings endpoints. Adapters in this family must declare exactly which endpoints and fields are supported, because compatibility is often partial.

open-inference-protocol is a family for a classical inference data plane, especially where health, metadata, and inference endpoints matter across multiple serving frameworks.

command-stdio means a local worker. It requires strict limits for process lifecycle, stdout, stderr, timeout, cwd, env, lease paths, and sandbox.

local-http means a local model server without full compatibility with the openai-compatible family.

remote-http means a remote API, always requiring an egress decision and secret scope.

Two providers in the same family must not be assumed to support the same fields, streaming behavior, tool calls, multimodal input, structured output, or usage accounting. The adapter manifest must say this explicitly.

Embedding Operation Contracts

embed and batch.embed should be modeled as separate operation contracts, not as hidden variants of chat generation. A direct embed request accepts bounded inline text inputs and returns bounded inline vectors:

{
  "schema": "inquirium.embed.request.v1",
  "operation": "embed",
  "model": "provider-facing-model-name",
  "input": { "texts": ["alpha", "beta"] },
  "parameters": {
    "dimensions": 384,
    "normalize": true,
    "encoding_format": "float"
  },
  "metadata": {}
}

The matching inquirium.embed.response.v1 carries vectors[] with stable source indexes, a required dimensions field, provider-neutral usage, and redacted diagnostics. Implementations must reject zero dimensions, empty inputs, dimension mismatches, and non-finite vector values. Embeddings inherit the retention and egress boundary of their source material; they are derived content, not neutral telemetry.
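The rejection rules for embed responses can be sketched as a small validator. The per-vector shape (`index`, `values`) and the rejection code strings are assumptions made for this example; only the four rejection conditions come from the text above.

```python
import math

def validate_embed_response(texts, response):
    """Return None when the response is acceptable, else a rejection code."""
    if not texts:
        return "empty-input"
    dimensions = response.get("dimensions")
    if not isinstance(dimensions, int) or dimensions <= 0:
        return "zero-dimensions"
    vectors = response.get("vectors", [])
    if len(vectors) != len(texts):
        return "vector-count-mismatch"
    for item in vectors:
        values = item["values"]
        if len(values) != dimensions:
            return "dimension-mismatch"
        if not all(math.isfinite(v) for v in values):
            return "non-finite-value"  # NaN or infinity
    return None

GOOD = {
    "dimensions": 2,
    "vectors": [
        {"index": 0, "values": [0.1, 0.2]},
        {"index": 1, "values": [0.3, 0.4]},
    ],
}
```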

batch.embed uses direct data-plane leases. The request carries source_lease_refs[] and an optional output_lease_ref; the response returns inquirium.batch-embed.response.v1 with an artifact_ref, dimensions, optional item_count, optional digest, usage, and diagnostics. The host remains responsible for issuing leases, validating the produced artifact, writing it through the object store, and binding provenance to runtime/ref, model.binding/ref, and the operation id.

Fail Closed When Resolving Profile and Runtime

Missing profile, missing candidate, missing configuration, unhealthy runtime, policy denial, egress denial, or missing lease should all be separate rejection classes. They should not degrade into "try the default model", because that mixes convenience with authorization.

Runtime selection result:

RuntimeSelection =
  Selected {
    runtime/ref
    profile/ref
    config/ref
    capabilities
  }
  | Rejected {
    code: missing-profile
        | no-runtime-candidate
        | runtime-not-configured
        | runtime-unhealthy
        | policy-denied
        | egress-denied
        | lease-denied
    candidate/ref?
    remediation?
  }

This type forces callers to handle rejection as part of the contract. None, an empty string, or a generic exception is too weak for an authorization boundary.
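A minimal sketch of fail-closed selection, assuming hypothetical lookup tables `profiles` (profile/ref to candidate list) and `runtimes` (runtime/ref to state flags). There is deliberately no default-model branch.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Selected:
    runtime_ref: str
    profile_ref: str

@dataclass(frozen=True)
class Rejected:
    code: str
    candidate_ref: Optional[str] = None

RuntimeSelection = Union[Selected, Rejected]

def select_runtime(profile_ref, profiles, runtimes) -> RuntimeSelection:
    if profile_ref not in profiles:
        return Rejected("missing-profile")
    candidate_refs = profiles[profile_ref]
    if not candidate_refs:
        return Rejected("no-runtime-candidate")
    last = None
    for ref in candidate_refs:
        state = runtimes.get(ref)
        if state is None or not state.get("configured"):
            last = Rejected("runtime-not-configured", ref)
        elif not state.get("healthy"):
            last = Rejected("runtime-unhealthy", ref)
        else:
            return Selected(ref, profile_ref)
    return last  # the most recent concrete reason, never a silent default

PROFILES = {"profile/local-small": ["runtime/a", "runtime/b"], "profile/empty": []}
RUNTIMES = {
    "runtime/a": {"configured": True, "healthy": False},
    "runtime/b": {"configured": True, "healthy": True},
}
```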

Quality Contract, Switch-On, Model Profile, and Dispatch Policy

Quality-driven dispatch requires three separate decisions. Entangling them creates most runtime confusion.

Quality Contract: done.must / should / score-min

Every request should be able to declare when it is done:

done:
  must[]         # closed required predicates: schema-valid, patch-applies, ...
  should[]       # soft predicates: tests-pass, no-hallucinated-refs, ...
  score-min?     # optional evaluator/judge threshold

must is fail-closed. An unmet must predicate means Rejected with a concrete failed-predicate, not a silent retry. should is a quality target. An unmet should predicate does not block acceptance, but it influences retry/repair decisions and is visible in trace. score-min activates an evaluator path. Falling below the threshold triggers retry, fallback, or repair according to dispatch policy.

Without this contract, "done" becomes adapter-specific and inconsistent.
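The three parts of the contract can be evaluated in a fixed order, as in this sketch. The outcome labels `Accepted` and `BelowScoreMin` and the `passed` set of predicate names are assumptions for the example; how a below-threshold result is handled stays with dispatch policy.

```python
def evaluate_done(done, passed, score=None):
    """done: {'must': [...], 'should': [...], 'score-min': float or absent}.
    passed: set of predicate names that held for this result."""
    for predicate in done.get("must", []):
        if predicate not in passed:
            # Fail closed on the first unmet required predicate.
            return {"outcome": "Rejected", "failed-predicate": predicate}
    missed_should = [p for p in done.get("should", []) if p not in passed]
    score_min = done.get("score-min")
    if score_min is not None and (score is None or score < score_min):
        # A missing score counts as below threshold.
        return {"outcome": "BelowScoreMin", "missed-should": missed_should}
    return {"outcome": "Accepted", "missed-should": missed_should}

DONE = {"must": ["schema-valid"], "should": ["tests-pass"], "score-min": 0.65}
```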

Switch-On: Configurable Failure Taxonomy

Each dispatch policy declares a closed list of failure classes that trigger fallback to another runtime:

dispatch/switch-on:
  - context-too-large
  - rate-limited
  - runtime-transient-unavailable
  - schema-invalid-repairable
  # provider/refusal intentionally absent: terminal by default

Classes outside this list must not automatically enter retry or fallback. Dispatch policy should separate three decisions:

  • retry/current-runtime;
  • fallback/other-runtime;
  • fail/request.

provider/refusal is terminal for the whole request unless policy explicitly defines another interpretation, such as retry after a protocol profile change or human review. This keeps "which failure should retry, fallback, or fail" in configuration, not in adapter-specific code.
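A sketch of the three-way decision driven purely by configuration. The `retry_on` list and the counter arguments are assumptions added for the example; the proposal itself only names the switch-on list and the three decision labels.

```python
def dispatch_decision(failure_class, policy, retries_used, hops_used):
    """Return one of the three dispatch decisions for a failure class."""
    if failure_class in policy.get("retry_on", ()) and retries_used < policy["max_retries"]:
        return "retry/current-runtime"
    if failure_class in policy.get("switch_on", ()) and hops_used < policy["max_fallback_hops"]:
        return "fallback/other-runtime"
    # Anything not listed, or any exhausted budget, ends the request.
    return "fail/request"

BALANCED = {
    "switch_on": ["context-too-large", "rate-limited",
                  "runtime-transient-unavailable", "schema-invalid-repairable"],
    "retry_on": ["rate-limited", "runtime-transient-unavailable"],
    "max_retries": 2,
    "max_fallback_hops": 2,
}
```

Because provider/refusal appears in neither list, it falls through to fail/request without any adapter-specific branch.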

Model Profile vs Dispatch Policy Profile

These are orthogonal axes.

| Axis | Decision |
| --- | --- |
| model/profile-ref | Selects model/runtime capability class, locality, cost tier, context window, and modalities. Chooses who executes. |
| dispatch/policy-ref | Selects retry, fallback, repair aggressiveness, score-min, hedging, and time budget. Chooses how hard to pursue quality. |

Three named dispatch policies cover most cases:

| Profile | Retries | Fallback hops | Score-min | Hedging |
| --- | --- | --- | --- | --- |
| low-latency | 0-1 | 1 | n/a | none |
| balanced | 2 | 2 | 0.65 | opt-in per intent |
| high-quality | 3 | 3 | 0.85 | hedging + race |

Callers may select model and dispatch policy independently:

request:
  model/profile-ref: "local-small-fast"
  dispatch/policy-ref: "high-quality"

Profiles inherit declaratively through configuration tables, not code. A new profile should be a config record, not a runtime branch.
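Declarative inheritance can be resolved from a configuration table alone, as in this sketch. The `extends` key, the `nightly-batch` record, and the choice to make `high-quality` extend `balanced` are hypothetical; the numeric values for `balanced` and `high-quality` are the ones from the table above.

```python
def resolve_policy(name, table):
    """Flatten one policy record by following its 'extends' chain."""
    chain, seen = [], set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name}")
        if name not in table:
            raise KeyError(f"unknown dispatch policy {name}")
        seen.add(name)
        chain.append(table[name])
        name = table[name].get("extends")
    resolved = {}
    for record in reversed(chain):  # base first, most specific last
        resolved.update({k: v for k, v in record.items() if k != "extends"})
    return resolved

POLICIES = {
    "balanced": {"retries": 2, "fallback-hops": 2, "score-min": 0.65},
    "high-quality": {"extends": "balanced", "retries": 3,
                     "fallback-hops": 3, "score-min": 0.85},
    "nightly-batch": {"extends": "high-quality", "time-budget-ms": 600000},
}
```

Adding a profile then means adding one record to the table; `resolve_policy` itself does not change.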

Request Pipeline as Explicit Stages

An Inquirium request should not flow directly from caller input to adapter. Each request should pass through an explicit pipeline. This keeps adapters simple and gives the host one control point for safety, retention, trace, and cost.

Minimal pipeline:

1. admit_request
   - schema gate
   - caller capability
   - operation class
   - idempotency key

2. resolve_policy
   - purpose
   - profile/ref or capability class
   - locality/egress/retention/trace
   - budget and autonomy

3. resolve_inputs
   - inline inputs under size limit
   - artifact refs
   - dataset refs
   - leases
   - declared/detected MIME

4. protocol_gate.input
   - protocol profile
   - input rails
   - prompt-injection classification
   - context visibility

5. shape_context
   - selected context refs
   - redaction
   - instruction hierarchy
   - prompt/runtime representation
   - token estimates

6. select_runtime
   - candidates
   - health
   - adapter route decision
   - model/profile decision

7. invoke_adapter
   - timeout
   - cancellation
   - retries if allowed
   - streaming/backpressure if applicable

8. protocol_gate.output
   - schema validation
   - output rails
   - repair-once if policy allows
   - final projection

9. persist_effects
   - artifact manifest
   - provenance
   - usage
   - feedback events
   - metadata trace

10. return_outcome
    - Completed
    - Degraded
    - Rejected
    - Deferred
    - Cancelled
    - Failed

Every stage returns a typed outcome. The adapter starts work only after authorization, policy, input resolution, protocol gate, and context shaping.
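The staged flow can be sketched as a list of named functions that each return a typed outcome. This simplified sketch stops at the first outcome that is not Completed, which is an assumption (a real host may let Degraded continue); the `Outcome` record and the three example stages are illustrative, not part of the proposal.

```python
from dataclasses import dataclass, field

@dataclass
class Outcome:
    kind: str  # Completed | Degraded | Rejected | Deferred | Cancelled | Failed
    stage: str = ""
    code: str = ""
    context: dict = field(default_factory=dict)

def run_pipeline(stages, request):
    context = {"request": request}
    for name, stage in stages:
        outcome = stage(context)
        outcome.stage = name
        if outcome.kind != "Completed":
            return outcome  # later stages, including the adapter, never run
        context.update(outcome.context)
    return Outcome("Completed", stage="return_outcome", context=context)

calls = []

def admit_request(context):
    calls.append("admit_request")
    return Outcome("Completed", context={"operation": context["request"]["operation"]})

def resolve_policy(context):
    calls.append("resolve_policy")
    return Outcome("Rejected", code="policy-denied")

def invoke_adapter(context):
    calls.append("invoke_adapter")
    return Outcome("Completed")

result = run_pipeline(
    [("admit_request", admit_request),
     ("resolve_policy", resolve_policy),
     ("invoke_adapter", invoke_adapter)],
    {"operation": "generate"},
)
```

In this run the policy stage rejects, so `invoke_adapter` is never reached and the returned outcome names the stage that stopped the request.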

Direct Data Plane Through Leases

The host does not need to proxy every large sample, image, audio file, dataset shard, or tensor. It must remain the control plane: authorize the operation, classify data, issue leases, select runtime, enforce sandbox, egress, budget, retention, and trace, then record manifest, provenance, and status.

A lease should describe read and write scope separately, deadline, data class, operation id, sandbox profile, egress policy, and allowed data resolver. The worker receives a handle or scoped path after validation, not ambient filesystem access. If the direct data plane becomes common, it should enter the Inquirium schema rather than living as an adapter exception.
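A minimal sketch of a lease check with separate read and write scopes, assuming POSIX-style scoped paths and a caller-supplied clock value. The `Lease` fields shown are a subset chosen for the example; data class, sandbox profile, and egress policy are omitted.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

@dataclass(frozen=True)
class Lease:
    operation_id: str
    read_scope: tuple   # directories the worker may read
    write_scope: tuple  # directories the worker may write
    deadline: float     # same clock as the `now` argument

def lease_allows(lease, operation_id, path, mode, now):
    if operation_id != lease.operation_id or now >= lease.deadline:
        return False
    target = PurePosixPath(path)
    if ".." in target.parts:
        return False  # no escaping the scope through parent segments
    if mode == "read":
        scopes = lease.read_scope
    elif mode == "write":
        scopes = lease.write_scope
    else:
        return False  # unknown modes are denied
    return any(
        target == PurePosixPath(s) or PurePosixPath(s) in target.parents
        for s in scopes
    )

LEASE = Lease("op-1", ("/leases/op-1/in",), ("/leases/op-1/out",), 100.0)
```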

Artifact References Instead of Raw Paths

Inquirium should prefer artifact/ref, dataset/ref, and model-artifact/ref over raw paths. A reference is typed and resolved by the host. The worker sees only the validated result: scoped path, object handle, query handle, or content-addressed manifest.

Resolution should depend on an explicit reference scheme:

  • artifact://... resolves through the artifact registry;
  • dataset://... resolves through the dataset registry;
  • file://... resolves through lease paths;
  • inline://... resolves through byte limits;
  • remote://... resolves through the egress gate.

There is no universal "open whatever" resolver. Each source class has its own safety rules.
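Scheme-based resolution can be sketched as a closed lookup with no fallback branch. The gate labels and the `unsupported-scheme` code are assumptions for this example; the five schemes are the ones listed above.

```python
from urllib.parse import urlsplit

# Closed mapping from reference scheme to the gate that must resolve it.
GATES = {
    "artifact": "artifact-registry",
    "dataset": "dataset-registry",
    "file": "lease-paths",
    "inline": "byte-limits",
    "remote": "egress-gate",
}

def gate_for(ref):
    """Pick the resolver gate for a reference, or reject it outright."""
    scheme = urlsplit(ref).scheme
    gate = GATES.get(scheme)
    if gate is None:
        return ("Rejected", "unsupported-scheme")
    return ("Resolve", gate)
```

A bare path such as `/etc/passwd` has no scheme and is therefore rejected rather than opened.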

Byte Limits and Typed Input Errors

Input and output limits are part of the operation and profile contract:

  • max/input-bytes;
  • max/output-bytes;
  • max/context-tokens;
  • read-idle-timeout-ms;
  • mime/allowed;
  • artifact/class.

The limit is not a transport detail, because violating it affects safety, cost, and determinism.

On overflow, the stream should be closed or cancelled and the whole operation rejected. Prefix truncation is allowed only as an explicit transformation policy recorded in trace; it must never be silent. The default semantics are: reject the whole input, destroy the source stream, and record a metadata-only incident.

read_with_limit(stream, max_bytes):
  total = 0
  buffer = empty byte buffer
  for chunk in stream:
    total += len(chunk)
    if total > max_bytes:
      cancel_or_destroy(stream)   # stop reading; nothing already buffered is kept
      return Rejected { code: too-large, size: total, max: max_bytes }   # size = bytes seen so far
    buffer.append(chunk)
  return Completed { bytes: buffer }

Sniffing and Input Classes Without Full Loading

Input type should be detected from a small byte prefix, declared header, or manifest, not by loading the whole artifact into memory. For large data, sniffing is an auxiliary classification, not proof. The final contract should include declared/mime, detected/mime, artifact/class, and the policy decision.

If declaration and detection conflict, the runtime should not decide which one to trust. The host should return a typed rejection or route the operation to a review path, depending on sensitivity class and profile.
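Prefix sniffing and conflict handling can be sketched as follows. The three magic-number signatures are the standard ones for PNG, PDF, and JPEG; the decision labels, the `sensitive` flag, and the choice to route sensitive conflicts to review are assumptions for the example, since the proposal leaves that choice to sensitivity class and profile.

```python
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"\xff\xd8\xff", "image/jpeg"),
]

def sniff_mime(prefix):
    """Classify from a small byte prefix; None means 'not recognized'."""
    for magic, mime in MAGIC:
        if prefix.startswith(magic):
            return mime
    return None

def classify_input(declared_mime, prefix, sensitive):
    detected = sniff_mime(prefix[:16])  # never needs the whole artifact
    record = {"declared/mime": declared_mime, "detected/mime": detected}
    if detected is None or detected == declared_mime:
        record["decision"] = "accept"
    elif sensitive:
        record["decision"] = "review"
    else:
        record["decision"] = "reject"
        record["code"] = "mime-mismatch"
    return record
```

The host makes the decision; the runtime only ever sees the resulting record.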

Atomic Artifact Writes

Inquirium outputs such as images, transcripts, embedding shards, model adaptation artifacts, checkpoints, and metrics should enter the artifact store through atomic writes. The pattern is: sanitize name, write a sibling temporary file in the target directory, require exclusive create, compute hash, atomically rename, then publish the manifest.

The manifest should include at least:

  • artifact/ref;
  • hash;
  • byte size;
  • MIME or artifact kind;
  • operation/id;
  • runtime/ref;
  • profile/ref;
  • lease/id;
  • sensitivity class;
  • retention profile.

TTL must not be a hidden cleanup rule. It should derive from retention policy and be visible in provenance.
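The atomic write pattern from this section can be sketched with the Python standard library. The function name, the temporary-file naming scheme, and the returned record are assumptions for the example. Exclusive create applies to the temporary file; `os.replace` then renames atomically on the same filesystem and would overwrite an existing final file, so a real store would also need to decide how to treat name collisions.

```python
import hashlib
import os

def atomic_write_artifact(target_dir, name, data):
    # Sanitize: accept only a plain file name, never a path.
    safe_name = os.path.basename(name)
    if not safe_name or safe_name in (".", "..") or safe_name != name:
        raise ValueError("unsafe artifact name")
    final_path = os.path.join(target_dir, safe_name)
    # Sibling temp file in the target directory keeps the rename on one filesystem.
    tmp_path = os.path.join(target_dir, f".{safe_name}.{os.getpid()}.tmp")
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        digest = hashlib.sha256(data).hexdigest()
        os.replace(tmp_path, final_path)  # atomic rename
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Only now is it safe to publish a manifest that points at final_path.
    return {"path": final_path, "hash": "sha256:" + digest, "byte-size": len(data)}
```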

Model Card, Training Job, and Eval Report as Required Adaptation Artifacts

If Inquirium creates a model adaptation artifact, checkpoint, LoRA/QLoRA artifact, prompt program, or other model artifact, the result should not be just a file. It should have a model card, training job record, and eval report.

Minimal relationships:

TrainingJob {
  job/id
  base-model/ref
  method: lora | qlora | prompt-optimization | other
  dataset/refs[]
  protocol/profile-ref?
  policy/profile
  operator/ref
  status
  eval/report-ref?
  output/model-artifact-ref?
}

ModelCard {
  model-artifact/ref
  base-model/ref
  adapter/id?
  intended-use[]
  out-of-scope[]
  limitations[]
  excluded-data-classes[]
  known-risks[]
  evaluation/ref
  provenance/refs[]
  policy/profile
  deployment/scope
  status
}

Protocol Gate may generate feedback/event and training-candidate records, but it must not adapt weights or publish model artifacts without a separate inquirium.train.adapt operation and policy approval.

inquirium.train.adapt is idempotent by a deterministic key computed from content-addressed inputs and the training environment, not loose refs. The minimal key includes:

  • base-model/ref as digest or verified model hash;
  • method;
  • dataset-manifest/ref and manifest hash;
  • canonical sample ordering;
  • hyperparameters;
  • seed;
  • training-code/ref;
  • training-code/hash;
  • runtime or container image digest;
  • training toolchain version;
  • policy profile;
  • relevant feature flags.

Retrying the same training job with an identical key returns the existing model-artifact/ref instead of producing a duplicate. The idempotency key is sha256 over canonical-json::JcsV1 of these fields. If any field changes, the request is a new training job, even when the caller reuses the same job/id; in that case the ledger returns an idempotency conflict instead of silently publishing a second model artifact under the same alias.
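The key derivation can be sketched as below, assuming that canonical-json::JcsV1 refers to a JCS-style canonical form. The serialization here (sorted keys, no insignificant whitespace, UTF-8) only approximates that and is not a full RFC 8785 implementation; number formatting in particular can differ. The field names and placeholder values are illustrative.

```python
import hashlib
import json

def training_idempotency_key(fields):
    """sha256 over an approximately canonical JSON encoding of the key fields."""
    canonical = json.dumps(
        fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Placeholder values; real inputs would be verified digests and hashes.
FIELDS = {
    "base-model/ref": "sha256:base-model-digest",
    "method": "lora",
    "dataset-manifest/hash": "sha256:manifest-digest",
    "hyperparameters": {"rank": 8, "epochs": 3},
    "seed": 42,
    "training-code/hash": "sha256:code-digest",
}
```

Two requests whose fields differ only in key order produce the same key, while changing the seed produces a different one.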

Dataset Manifest as Content-Addressed Artifact

A dataset used by train.adapt should have a manifest whose fields guarantee a reproducible build:

DatasetManifestV1 {
  manifest/ref
  dataset/refs[]
  split: train | valid | test
  counts {
    rows
    rows-by-label?
    rows-by-source-kind?
  }
  file-hashes[]
  filter-spec/ref
  time-window { from, to }
  redaction/audit {
    pii-replacements
    secret-strips
    rule-counters
  }
  config-snapshot-hash
  builder-version
  build-seed
}

Repeating the build with the same sources, filter spec, seed, and config snapshot should produce an identical manifest: counts, file hashes, and audit counters. Manifest mismatch for the same inputs is a pipeline bug, not noise. CI should fail closed when the config snapshot hash changes from the last accepted build unless the change is explicitly accepted.

Promotion Gate as a Separate Operation

inquirium.eval.gate is a separate operation from train.adapt. It is never implicit.

PromotionGateRequestV1 {
  candidate/model-artifact-ref
  baseline/model-artifact-ref?
  eval/suite-refs[]
  thresholds {
    metric-name -> { min?, max?, regression-tolerance? }
  }
  blocking[]
  non-blocking[]
}

PromotionGateResultV1 {
  decision: accepted | rejected
  per-metric[] { name, value, baseline?, threshold, verdict }
  rejection-reasons[]
  report/ref
}

The report is deterministic. Replay should produce identical output. CLI or automation wrappers should return a non-zero exit code for rejected, which enables CI integration. rejection-reasons is a closed enum, not a free-form string, so operator dashboards can map it to actionable remediation.
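A sketch of gate evaluation with a deterministic report and a CI-friendly exit code. The reason code `blocking-metric-failed` is a hypothetical enum member, and the regression check assumes higher-is-better metrics; neither is specified by the proposal.

```python
def promotion_gate(metrics, thresholds, blocking, baseline=None):
    baseline = baseline or {}
    per_metric, reasons = [], set()
    for name, rule in sorted(thresholds.items()):  # sorted for a stable report
        value = metrics.get(name)
        ok = value is not None  # a missing metric fails its check
        if ok and "min" in rule:
            ok = value >= rule["min"]
        if ok and "max" in rule:
            ok = value <= rule["max"]
        if ok and "regression-tolerance" in rule and name in baseline:
            ok = value >= baseline[name] - rule["regression-tolerance"]
        per_metric.append({"name": name, "value": value,
                           "verdict": "pass" if ok else "fail"})
        if not ok and name in blocking:
            reasons.add("blocking-metric-failed")
    return {
        "decision": "rejected" if reasons else "accepted",
        "per-metric": per_metric,
        "rejection-reasons": sorted(reasons),
    }

def exit_code(result):
    """Non-zero for rejected so CI jobs fail."""
    return 0 if result["decision"] == "accepted" else 1

THRESHOLDS = {"accuracy": {"min": 0.85}, "toxicity": {"max": 0.1}}
METRICS = {"accuracy": 0.9, "toxicity": 0.2}
```

With these example values the toxicity check fails; whether that rejects the candidate depends only on whether toxicity is listed as blocking.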

Event Stream with Correlation and Metadata

Inquirium should emit stable events:

  • request.started;
  • runtime.selected;
  • lease.issued;
  • operation.deferred;
  • usage.metrics;
  • artifact.produced;
  • request.failed;
  • request.completed.

Each event should include session/id when relevant, operation/id, attempt/seq, timestamp, operation class, and a metadata-bounded payload.

Trace should answer: what happened, why, which input refs were used, which policy applied, and which effects were produced. By default it must not contain prompts, samples, audio, images, secrets, or full model responses. Richer trace requires a separate grant, retention profile, and data classification.

Trace Metadata and Trace Details

Trace should be structurally split into trace/metadata and trace/details. Metadata is default, bounded, and exportable: duration, byte length, token usage, selected runtime, denial code, lease refs, artifact refs. Details is optional and may contain protected material only when policy, operator grant, or caller grant allows it.

This is stronger than a list of fields not to log. The privacy boundary is a data contract. Runtime adapters do not need to infer which fields are private; they receive a decision: metadata-only or details-allowed with a concrete retention profile.
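The structural split can be sketched as an export function that receives the decision rather than inferring privacy from field names. The `mode` and `retention-profile` keys of the decision record, and the example trace content, are assumptions for this example.

```python
def export_trace(trace, decision):
    """Always export metadata; export details only under an explicit decision."""
    exported = {"trace/metadata": dict(trace["trace/metadata"])}
    if decision.get("mode") == "details-allowed" and decision.get("retention-profile"):
        exported["trace/details"] = dict(trace.get("trace/details", {}))
        exported["retention-profile"] = decision["retention-profile"]
    return exported

TRACE = {
    "trace/metadata": {"duration-ms": 120, "runtime/ref": "runtime/a"},
    "trace/details": {"prompt": "protected text"},
}
```

A details-allowed decision without a concrete retention profile is treated as metadata-only, which keeps the default fail-closed.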

Event, Span, and Provenance Are Different Views

Inquirium should distinguish three causal accounting views.

Event is a discrete domain occurrence such as request.started, runtime.selected, or artifact.produced.

Span is execution observability: duration, errors, retries, provider family, model, token usage, and streaming status.

Provenance is the relation between entities: a request used artifact refs, model, adapter, and dataset; an activity generated a result or artifact.

Minimal mapping:

Inquirium request/operation = PROV Activity
Input artifact/context/dataset/model artifact = PROV Entity used
Output artifact/result/eval report = PROV Entity generated
Runtime adapter/model profile/operator/host policy = PROV Agent associated
Fine-tuned adapter/model-artifact = Entity derivedFrom base model + dataset

Trace metadata may contain refs and hashes. Trace details may contain protected material only with a separate grant and retention profile.

Tool Delegation as Policy

Model-backed inquiry may use tools, but tools are not ambient model capabilities. Every operation should carry explicit allowed/tools, allowed/resources, disallowed/resources, cost, deadline, and autonomy level. Tool output is an input to inference or an evidence artifact, not authorization to mutate host state.

The tool list should be an enumeration of names or capability refs, not a regex or "everything except" category. Example levels: none, safe-read-only, owner-approved. A new tool class is denied until explicitly added to policy.

Effects Are Orthogonal to Tools

allowed/tools answers which capability provider the model may call. It does not answer which host effects are allowed. These are separate axes:

| Axis | Question | Scope |
| --- | --- | --- |
| allowed/tools | What may the model call? | Enumeration of capability refs. |
| effects/allowed | What host effect may happen? | Scoped grants such as EffectGrant { kind, scope/ref, lease/ref?, destination/ref?, deadline, budget }. |

EffectGrant.kind should be a closed enum, such as fs/read, fs/write, process/run, net/egress, identity/sign, agora/publish, ad/dispatch, or none. The kind is never sufficient by itself. Every grant needs a scope, and where relevant a lease, destination, deadline, and budget.

Default effect allowlists are narrow:

profile/default:
  effects/allowed: []

profile/safe-inference:
  effects/allowed:
    - { kind: fs/read, lease/ref: input-lease, scope/ref: request-inputs }

profile/training:
  effects/allowed:
    - { kind: fs/read, lease/ref: dataset-lease, scope/ref: training-inputs }
    - { kind: fs/write, lease/ref: artifact-output-lease, scope/ref: artifact-store }

profile/agent-bounded:
  effects/allowed:
    - { kind: fs/read, lease/ref: input-lease, scope/ref: request-inputs }
    - { kind: net/egress, destination/ref: approved-endpoint, budget: bounded }

Side effects require both a tool grant and an effect grant. A tool that can write files requires fs/write, and that effect is valid only within its lease and scope. This avoids granting a tool while accidentally allowing out-of-scope effects.
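
The two-axis check can be sketched as follows. This is a simplified Python illustration: authorize_call and its result codes are hypothetical, and the grant is reduced to kind plus scope (a real grant would also check lease, destination, deadline, and budget).

```python
from dataclasses import dataclass
from typing import Optional

# Closed enum of effect kinds, taken from the examples in the text.
EFFECT_KINDS = frozenset({
    "fs/read", "fs/write", "process/run", "net/egress",
    "identity/sign", "agora/publish", "ad/dispatch", "none",
})


@dataclass(frozen=True)
class EffectGrant:
    kind: str
    scope_ref: str
    lease_ref: Optional[str] = None


def authorize_call(tool, effect_kind, scope_ref, allowed_tools, effects_allowed):
    """Return (allowed, code). Both axes must grant independently."""
    if effect_kind not in EFFECT_KINDS:
        return False, "unknown-effect-kind"   # new kinds are denied by default
    if tool not in allowed_tools:
        return False, "tool-not-allowed"      # axis 1: what may the model call?
    if effect_kind == "none":
        return True, "ok"                     # pure tool, no host effect requested
    for grant in effects_allowed:             # axis 2: what host effect may happen?
        if grant.kind == effect_kind and grant.scope_ref == scope_ref:
            return True, "ok"
    return False, "effect-not-granted"
```

An allowed tool asking for fs/write outside its granted scope is denied on the effect axis even though the tool itself is permitted.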

MCP and A2A as Bridges, Not Core Semantics

Inquirium may use agent protocols, but it should not import their semantics into core. MCP and A2A are edge bridge adapters.

MCP bridge maps external resources, prompts, and tools to Orbiplex resource/ref, prompt-template/ref, and tool/ref. It requires capability negotiation before use; respects cancellation, progress, errors, and logging; does not give the model ambient tool access; and sends every tool call through allowed/tools, budget, deadline, trace, and output validation.

A2A bridge treats an external agent as an untrusted provider worker or remote agent endpoint. It declares capability metadata, skills, and routability through its manifest. It does not assume shared memory, shared secrets, or shared policy. Its answer is candidate evidence. Each delegation carries an egress decision, retention profile, data classification, and typed outcome.

MCP = agent/tool bridge.
A2A = agent/agent bridge.
Inquirium Core = local host policy boundary.

The bridge may speak an external protocol. Core remains stratified, local, and auditable.

Autonomy Modes for Tool Results

Not every tool result should automatically trigger another model response. Inquirium should separate:

  • tool-only: tool result is an artifact and the operation ends;
  • single-turn: one tool -> model -> result round is allowed;
  • multi-turn: a controlled loop is allowed under max steps, cost, and time.

The model does not decide on its own that it may continue after each tool. The host issues budget and step limits before effects.

Protocol Gate

Protocol Gate is a host-owned policy boundary for communication with the model. It is not a model, agent, or post-training process. If the component also collects feedback for later adaptation, the name Protocol Tutor may be used for the feedback/evaluation part only.

Protocol Gate performs four functions:

  • context injection: protocol, role, session purpose, bootstrap instructions, and instruction hierarchy;
  • input rail: accepts, rejects, or transforms input before the model;
  • output rail: checks model response against protocol, schema, safety, and allowed actions;
  • feedback recorder: records metadata about violations, repairs, and decisions as evaluation material or, with a separate grant, as training candidates.

Minimal profile:

ProtocolProfileV1 {
  protocol/ref
  purpose
  role/ref
  instruction/hierarchy
  session/bootstrap
  turn/prefix-template?
  allowed/outputs
  disallowed/outputs
  response/schema-ref?
  refusal/schema-ref?
  tool-policy/ref?
  safety/profile
  retention/profile
  trace/profile
  revision/policy:
    none | repair-once | ask-model-to-revise-once | human-review
  feedback/policy:
    metadata-only | details-with-grant | training-candidate-with-grant
}

Input decision:

ProtocolInputDecision =
  Accepted {
    shaped_request
    protocol/ref
    protocol/epoch
    instruction/hash
    cache/key?
  }
  | Rejected { code, reason, remediation? }
  | NeedsReview { review/ref, reason }

Output decision:

ProtocolOutputDecision =
  Accepted { projected_result, protocol/score?, violations[] }
  | ReviseOnce { critique, revised_request }
  | Rejected { code, violations[], remediation? }
  | NeedsReview { review/ref, violations[] }
  | Degraded { fallback, reason }

The gate may start a session with a bootstrap prompt describing protocol, purpose, roles, output format, and boundaries. It may add turn-level instructions, but only through context/shaping, not through a private string builder in the adapter.

For sessions with cache or KV state, include:

protocol/epoch
instruction/hash
cache/policy
cache/key
kv-cache/ref?
kv-cache/lease?

Changing protocol, role, instruction, or material policy increments protocol/epoch and forces a new bootstrap. This lets prompt-prefix caching, session bootstrap, and later KV mechanisms coexist without pretending the model has been trained.
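
One way to realize the epoch rule is to hash the instruction material and bump the epoch whenever the hash changes. The Python sketch below assumes SHA-256 over canonical JSON; instruction_hash and next_epoch are hypothetical names, and the exact hashed material is an assumption rather than something the proposal fixes.

```python
import hashlib
import json


def instruction_hash(protocol_ref, role_ref, instructions):
    """Hash the material that defines the session's protocol conditioning."""
    material = json.dumps(
        {"protocol/ref": protocol_ref, "role/ref": role_ref,
         "instructions": instructions},
        sort_keys=True, separators=(",", ":"),  # canonical form for stable hashes
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def next_epoch(state, protocol_ref, role_ref, instructions):
    """Return (new_state, needs_bootstrap).

    A changed hash increments protocol/epoch and forces a new bootstrap;
    an unchanged hash keeps the epoch so cached prefixes stay valid.
    """
    new_hash = instruction_hash(protocol_ref, role_ref, instructions)
    if state is None:
        return {"protocol/epoch": 1, "instruction/hash": new_hash}, True
    if state["instruction/hash"] != new_hash:
        return {"protocol/epoch": state["protocol/epoch"] + 1,
                "instruction/hash": new_hash}, True
    return state, False
```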

Do not call ordinary accept/reject behavior by the gate "training" unless an approved dataset is produced and a separate inquirium.train.adapt operation is started. Runtime behavior here is protocol conditioning, context shaping, output critique, or feedback recording. Training/adaptation begins only with a dataset handle, model-artifact/ref, eval report, and separate grant.

Instruction Hierarchy and Instruction Conflict

Inquirium should have its own explicit instruction hierarchy independent of any provider format. The adapter maps that hierarchy to a provider representation, but it must not flatten it without trace.

Proposed order:

host-root policy
> organ policy
> operation policy
> protocol profile
> model profile
> caller purpose
> session bootstrap
> user/input messages
> retrieved context
> tool results

Conflicts resolve upward in the hierarchy: when two layers disagree, the higher layer wins. Tool results and retrieved context are data, not instructions, unless the host explicitly transforms them into instruction-bearing content.

context/shaping should emit:

InstructionAssembly {
  instruction/hash
  hierarchy/version
  accepted_layers[]
  rejected_layers[]
  conflict/resolutions[]
  provider/rendered-form/ref?
}
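
A minimal Python sketch of how the assembly could resolve conflicts, assuming each directive is a (layer, key, value) triple. The function name resolve, the triple shape, and the rejection codes are hypothetical; only the layer order comes from the proposed hierarchy.

```python
# Layer order copied from the proposed hierarchy, highest authority first.
HIERARCHY = [
    "host-root policy", "organ policy", "operation policy", "protocol profile",
    "model profile", "caller purpose", "session bootstrap",
    "user/input messages", "retrieved context", "tool results",
]
# These layers carry data, never instructions, unless the host transforms them.
DATA_ONLY = {"retrieved context", "tool results"}


def resolve(directives):
    """Resolve (layer, key, value) directives; the higher layer wins."""
    rank = {layer: i for i, layer in enumerate(HIERARCHY)}
    accepted, rejected, resolutions = {}, [], []
    # An unknown layer raises KeyError here, which fails closed.
    for layer, key, value in sorted(directives, key=lambda d: rank[d[0]]):
        if layer in DATA_ONLY:
            rejected.append((layer, key, "data-not-instruction"))
        elif key in accepted:
            if accepted[key][1] != value:
                resolutions.append(
                    {"key": key, "winner": accepted[key][0], "loser": layer}
                )
            rejected.append((layer, key, "overridden"))
        else:
            accepted[key] = (layer, value)
    return accepted, rejected, resolutions
```

The three return values correspond loosely to accepted_layers[], rejected_layers[], and conflict/resolutions[] in InstructionAssembly.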

Prompt and Context Shaping as a Separate Stage

Prompt or runtime input construction should not be a private string builder in the adapter. Inquirium should have a context/shaping layer that accepts the question, messages, refs, context window, redaction profiles, style hints, limits, and visibility policy, then returns admitted context, rejected refs, token estimates, and runtime representation.

The contract separates:

input data -> context selection -> redaction -> template -> runtime input -> result projection

Different runtimes may receive different formats, but callers see stable semantics: which context was admitted and which part of the result may leave Inquirium.

Structured Output and Schema-Constrained Results

For classify, transform, rerank, tool planning, and protocol conformance, Inquirium should prefer output contracts based on schema, not prompt wording. JSON Schema or a local equivalent should be part of the request, profile, or capability.

Model output passes through projection:

OutputProjection =
  Parsed { value, schema/ref }
  | Plan { flow/ref, bindings? }
  | Stream { stream/ref, schema/ref? }
  | Refusal { reason, provider/refusal? }
  | Invalid { schema/ref, violations[] }
  | Repairable { violations[], repair-policy }
  | Unsafe { policy/ref, reason }

If an adapter/provider supports native structured outputs, the manifest should declare that. If not, Inquirium may use parsing and repair-once, but repaired output must remain marked in trace.
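
The projection step might look like the Python sketch below. It uses a deliberately tiny stand-in for schema validation (a map from field name to expected Python type) instead of real JSON Schema; project_output and the violation strings are hypothetical.

```python
import json


def project_output(raw, required_fields):
    """Project raw model text into a Parsed or Invalid variant.

    required_fields maps field name -> expected Python type. This is an
    assumption standing in for a JSON Schema referenced by schema/ref.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"kind": "Invalid", "violations": [f"not-json: {exc.msg}"]}
    if not isinstance(value, dict):
        return {"kind": "Invalid", "violations": ["not-an-object"]}
    violations = []
    for name, expected in sorted(required_fields.items()):
        if name not in value:
            violations.append(f"missing:{name}")
        elif not isinstance(value[name], expected):
            violations.append(f"wrong-type:{name}")
    if violations:
        return {"kind": "Invalid", "violations": violations}
    return {"kind": "Parsed", "value": value}
```

A host that allows repair-once would route an Invalid result into a single repair attempt and mark the repaired output in trace, rather than silently accepting it.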

Capability as Compiler: Plan as Candidate Return

Parsed and Stream are straightforward. Plan is less obvious and must be treated as candidate Flow IR, not an executable command. An adapter may propose a plan for a composite operation, but the host-owned Flow IR compiler/gate decides whether the plan fits the original request contract.

Examples:

  • inquirium.transform proposes classify -> reformat per branch;
  • inquirium.train.adapt proposes redact -> split -> train -> eval -> gate;
  • compound inquirium.summarize proposes chunk -> summarize each -> aggregate.

The host validates Plan through Flow IR schema gate, policy/budget check, effect/tool grants, and plan limits: max-depth, max-nodes, max-tool-calls, max-cost, max-time, and no-implicit-recursion. Sub-operations must not exceed the original operation, purpose, allowed/tools, effects/allowed, egress, retention, and policy. Callers do not need to know sub-steps, but adapter authority must not widen.

Invariant: Plan as return requires the adapter manifest to declare returns/plan: true. Without that, the host rejects a Plan response as Invalid. Even with that capability, the plan remains CandidatePlan for host acceptance, not a self-executing program.
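
A host-side check of a candidate plan could be sketched as below. The plan is modeled as a nested dict with optional tools and children keys, which is an assumption for illustration; check_plan and the violation codes are hypothetical, and only max-nodes, max-depth, and tool widening are covered.

```python
def check_plan(plan, manifest, limits, allowed_tools):
    """Return a list of violations; an empty list means the plan may proceed
    to the remaining host gates (budget, effects, and so on)."""
    if not manifest.get("returns/plan", False):
        # Without the declared capability, a Plan response is rejected outright.
        return ["plan-not-declared"]
    violations = []
    nodes, max_depth = 0, 0
    stack = [(plan, 1)]
    while stack:  # iterative walk, so a deep plan cannot exhaust recursion
        node, depth = stack.pop()
        nodes += 1
        max_depth = max(max_depth, depth)
        for tool in node.get("tools", []):
            if tool not in allowed_tools:
                # Sub-operations must not widen the original allowed/tools.
                violations.append(f"tool-widened:{tool}")
        for child in node.get("children", []):
            stack.append((child, depth + 1))
    if nodes > limits["max-nodes"]:
        violations.append("max-nodes")
    if max_depth > limits["max-depth"]:
        violations.append("max-depth")
    return violations
```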

Token Estimation Before and After Context Assembly

Token limits should not be checked only on the final prompt. In long sessions, the final prompt may be shorter than source state, while the runtime or session layer still carries a larger context. Distinguish:

  • input context space limit;
  • session/transcript limit;
  • tool result limit;
  • final runtime input limit.

context/shaping should return multiple estimates and identify which estimate is authoritative for rejection:

TokenCheck =
  Accepted { assembled_tokens, source_tokens, tool_tokens }
  | Rejected {
      code: context-too-large
      max_tokens
      assembled_tokens
      source_tokens
      tool_tokens
    }
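
A small Python sketch of the check, assuming the host supplies a limit per estimate and wants to know which estimate triggered rejection. The authoritative field and the token_check signature are illustrative additions, not part of the TokenCheck shape above.

```python
def token_check(estimates, limits):
    """Compare each estimate with its limit and report the first one exceeded.

    estimates: dict with source_tokens, tool_tokens, assembled_tokens.
    limits: dict with an optional limit per estimate name (assumption).
    """
    for name in ("source_tokens", "tool_tokens", "assembled_tokens"):
        limit = limits.get(name)
        if limit is not None and estimates[name] > limit:
            return {
                "kind": "Rejected",
                "code": "context-too-large",
                "authoritative": name,  # hypothetical: which estimate decided
                "max_tokens": limit,
                **estimates,
            }
    return {"kind": "Accepted", **estimates}
```

In the long-session case described above, the assembled prompt can fit while the source state does not; the sketch rejects on source_tokens in that situation.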

Projection Mode for Long-Lived Sessions

Long-lived sessions need an explicit context projection mode. per-turn means each operation receives a fresh complete prompt or runtime input. session-bootstrap means the host injects context once for an epoch and later steps send only deltas. Epoch change forces context reload.

This supports runtimes with session memory, prompt cache, or conversation state without exposing those mechanics to callers. The caller sees session/id, context/epoch, and retention policy, not backend mechanics.

Prompt Cache and KV Cache as Controlled Resources

Prompt prefix cache, session state, and KV cache should not be treated as hidden model memory. In Inquirium, they are controlled session resources.

Minimal fields:

CachePolicy {
  cache/class: none | prompt-prefix | kv | provider-managed
  retention/max-age-ms
  scope: operation | session | protocol-epoch | profile
  key/material:
    instruction/hash
    protocol/epoch
    context/hash
    model/ref
    adapter/ref
  privacy/class
  evict/on-policy-change: true
}

Cache hits and misses should be recorded in trace metadata, not hidden as an adapter detail. If the provider maintains its own cache, the adapter must declare that in the manifest and return usage metadata when available.

Streaming and Sessions as a Separate Operation Class

Streaming audio, realtime transcription, live multimodal interactions, and interactive tool use should be modeled as session operations, not blocking requests. Minimal lifecycle:

created -> ready -> streaming -> completed | cancelled | failed

Every transition should be a trace event with a typed reason.
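
The lifecycle can be sketched as a small state machine that records every transition as an event. This is a Python illustration; the Session class and event layout are hypothetical, and allowing cancelled or failed from the created and ready states is an assumption beyond the one-line lifecycle above.

```python
# Allowed transitions. Terminal states have no outgoing edges.
# Assumption: a session may be cancelled or fail before it starts streaming.
TRANSITIONS = {
    "created": {"ready", "cancelled", "failed"},
    "ready": {"streaming", "cancelled", "failed"},
    "streaming": {"completed", "cancelled", "failed"},
    "completed": set(),
    "cancelled": set(),
    "failed": set(),
}


class Session:
    def __init__(self):
        self.state = "created"
        self.events = []  # each transition becomes a trace event

    def transition(self, new_state, reason):
        """Move to new_state or raise; the typed reason is recorded."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.events.append({"from": self.state, "to": new_state, "reason": reason})
        self.state = new_state
```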

Minimal session contract:

  • session/id;
  • operation/id;
  • transport/class;
  • input-sink policy;
  • output-sink policy;
  • cancel semantics;
  • usage metrics;
  • close reason;
  • retention profile.

Cancellation has at least two levels:

  • interrupt-output, which stops current output;
  • terminate-session, which ends the whole operation.

Backpressure and Output Sink Availability

Streaming operations should check whether the output sink still exists and can accept data. If the sink is closed, the runtime should not continue producing expensive output without a policy decision. This matters for audio, video, and multimodal streams, where missing backpressure quickly becomes a cost and privacy risk.

Session contracts should distinguish:

  • sink-closed;
  • transport-closed;
  • provider-closed;
  • policy-cancelled;
  • operator-cancelled.

These classes matter for retry, accounting, and diagnostics.

Inquirium Flow IR: Flow as Data

Inquirium should not become a general workflow engine, but it should have a small flow representation for multi-step operations: guard -> context selection -> model -> tool -> validation -> artifact -> review. Such a representation helps with debugging, replay, testing, and moving execution into bounded deferred operations.

Minimal model:

InquiryFlowV1 {
  flow/id
  operation/id
  purpose
  nodes[]
  edges[]
  budgets
  policy/ref
  trace/ref?
}

FlowNode =
  InputNode
  | PolicyDecisionNode
  | ProtocolGateNode
  | ContextShapingNode
  | RuntimeSelectionNode
  | ModelInvocationNode
  | ToolInvocationNode
  | HumanReviewNode
  | ArtifactWriteNode
  | EvaluationNode
  | OutputProjectionNode

FlowEdge =
  depends-on
  | consumes
  | produces
  | guards
  | delegates-to
  | resumes-from
  | retries-after
  | derives-from

NodeState =
  Pending
  | Running
  | Completed
  | Rejected
  | Degraded
  | Deferred
  | Cancelled
  | Failed

By default, a flow is a DAG. Controlled agent loops are allowed only as a bounded controller with max_steps, max_cost, max_time, allowed_tools, termination_condition, and an explicit review policy. Anything with a dependency, cost, retry, tool call, or output artifact should exist as a node or edge in Flow IR rather than in a hidden call stack.
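
The DAG requirement can be checked before execution with a topological sort. The Python sketch below uses Kahn's algorithm; topological_order is a hypothetical helper, node ids are assumed to be unique, and every edge (src, dst) is read as "dst depends on src" regardless of edge kind.

```python
from collections import deque


def topological_order(nodes, edges):
    """Return an execution order for the flow, or raise if it is not a DAG."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        if src not in indegree or dst not in indegree:
            raise ValueError("edge references an unknown node")
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        # Some nodes never reached indegree zero, so a cycle exists.
        raise ValueError("flow is not a DAG")
    return order
```

A bounded agent loop would not appear as a cycle in this graph; it would be a single controller node carrying its own step, cost, and time limits.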

Adapter Conformance Before Routability

An adapter may be installed without being routable. It should not become routable without a minimal conformance report. Conformance tests should be generated from the manifest and fixture set.

Minimal report:

AdapterConformanceReportV1 {
  report/ref
  adapter/ref
  adapter/version
  runtime/ref?
  fixture-set/ref
  generated-at
  outcome: passed | failed | partial
  tests[] {
    name
    operation
    outcome
    failure/class?
    trace/ref?
  }
}

Minimal test classes:

  • schema gate;
  • provider mapping;
  • refusal mapping;
  • timeout and cancellation;
  • retryable vs terminal failures;
  • streaming chunks;
  • backpressure and sink-closed;
  • structured output;
  • tool call denial;
  • allowed tool call;
  • trace redaction;
  • egress denied;
  • lease denied;
  • model hash allowed/denied;
  • idempotency/replay;
  • cost/token usage accounting when declared.

Eval Before Route

A new adapter, protocol profile, or model profile should not automatically enter the production routing pool. It should pass minimal evals first.

For an adapter:

  • conformance fixtures;
  • mapping errors;
  • refusal mapping;
  • redaction tests;
  • latency/budget smoke test;
  • cancellation/retry test.

For a protocol profile:

  • protocol-following fixtures;
  • prompt injection fixtures;
  • refusal fixtures;
  • schema-valid output fixtures;
  • over-answering and under-answering fixtures;
  • tool-use denial fixtures.

For a model adaptation artifact:

  • base eval report;
  • protocol conformance eval;
  • regression against safety fixtures;
  • data provenance check;
  • model card update.

Routing policy may admit a profile in modes such as canary, owner-only, local-only, metadata-trace-only, or disabled.

An evaluator adapter is itself model-backed and must not evaluate itself. Evaluator adapters bootstrap through deterministic offline fixtures: rule-based, schema-only, regex/AST-level fixtures that can be verified without calling another model. Only this fixed fixture set may grant an evaluator the Routable state. Model-evaluates-model becomes allowed only after this step and with an explicit evaluator/provenance-ref pointing at the evaluator conformance report.

Constitution and Critic as Data

An evaluator needs two data shapes whose contracts are stable across implementations:

ConstitutionV1 {
  constitution/ref
  rules[] {
    rule/id
    kind: schema | predicate | rubric
    severity: must | should | nice
    description
    spec/ref
  }
  scoring/policy?
}

EvaluationResultV1 {
  evaluation/id
  evaluator/ref
  constitution/ref
  target/ref
  pass: true | false
  score?
  per-rule[] {
    rule/id
    verdict: pass | fail | skipped
    reason?
    evidence/ref?
  }
  reject-or-repair?
}

Constitution is data, not code. It can be changed without rebuilding the evaluator. If a rule points to a predicate ref, that predicate must come from a signed, host-owned registry with versioning, conformance, and explicitly declared effects. Adapters and protocol profiles must not provide arbitrary code as predicates.

Critic is an evaluator invoked automatically after inference. It returns EvaluationResultV1, which the host uses for retry, repair, or fallback under the quality contract. Per-rule audit trails go through trace details with a separate grant, or through trace metadata as verdict summary without evidence.

This makes "the evaluator applies some rules" a declarative contract. Operators can see which rules failed and change constitution data without rewriting evaluator code.
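
To illustrate "constitution as data", here is a Python sketch in which rules are plain dicts and predicates are looked up by ref in a fixed table. The PREDICATES table stands in for the signed, host-owned registry; its entries, the evaluate function, and the unknown-predicate reason are hypothetical. Only predicate rules are handled; schema and rubric rules are reported as skipped.

```python
# Stand-in for a signed, host-owned predicate registry (assumption).
PREDICATES = {
    "predicate/non-empty": lambda target: bool(target.get("text")),
    "predicate/max-200-chars": lambda target: len(target.get("text", "")) <= 200,
}


def evaluate(constitution, target):
    """Apply predicate rules to target and return an EvaluationResultV1-like dict."""
    per_rule, passed = [], True
    for rule in constitution["rules"]:
        entry = {"rule/id": rule["rule/id"]}
        if rule["kind"] != "predicate":
            entry["verdict"] = "skipped"          # not covered by this sketch
        else:
            predicate = PREDICATES.get(rule["spec/ref"])
            if predicate is None:
                entry["verdict"] = "fail"         # fail closed on unknown refs
                entry["reason"] = "unknown-predicate"
            else:
                entry["verdict"] = "pass" if predicate(target) else "fail"
        if entry["verdict"] == "fail" and rule["severity"] == "must":
            passed = False                        # only must-rules block
        per_rule.append(entry)
    return {"constitution/ref": constitution["constitution/ref"],
            "pass": passed, "per-rule": per_rule}
```

Changing which rules apply, or their severity, means editing the constitution data; the evaluate function itself stays untouched.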

Threat Matrix for Model-Backed Inquiry

Every new operation and adapter should pass through a small threat matrix. This is not a full formal security analysis of Orbiplex, but a minimum hygiene check for model-backed inquiry.

Prompt injection
  -> context separation, instruction hierarchy, input rails, tool isolation

Insecure output handling
  -> structured output, output projection, sandboxed side effects, no authority

Training/data poisoning
  -> dataset provenance, model card, eval report, operator grant

Model denial of service
  -> token/byte/time/cost budgets, concurrency caps, cancellation

Supply chain risk
  -> adapter manifest, module provenance, model hash allow/deny, model cards

Sensitive information disclosure
  -> metadata-only trace, redaction, retention profile, egress gate

Insecure plugin/tool design
  -> allowed/tools enumeration, tool call policy, no ambient capabilities

Excessive agency
  -> autonomy level, max steps, human review, no self-authorizing agents

Overreliance
  -> model output as candidate evidence, downstream policy decides

Model theft / model artifact leakage
  -> model-artifact refs, scoped leases, egress control, artifact retention

Every threat should have at least one negative fixture in conformance or acceptance tests.

Fallbacks as Degraded Results

Fallback is not successful inference. If the model produced no result, the operation was interrupted, the runtime was unavailable, or output could not be projected, the outcome should be degraded, rejected, cancelled, or failed, not completed.

Practical shape:

InquiriumResult =
  Completed { payload, usage, model/used }
  | Degraded { fallback, reason, failure/class?, runtime/used? }
  | Rejected { code, message, remediation? }
  | Deferred { operation/id, status/ref }
  | Cancelled { reason }
  | Failed { error/class, retryable }

Downstream can then distinguish candidate evidence, fallback text, policy denial, asynchronous operation, and failure. This is required for auditability and reliable automation.
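
The practical shape above can be mirrored as a closed set of Python dataclasses. This is a sketch with a reduced field set; next_action and its return labels are hypothetical and only show how downstream code can branch on the variant instead of parsing error text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Completed:
    payload: str
    model_used: str


@dataclass(frozen=True)
class Degraded:
    fallback: str
    reason: str


@dataclass(frozen=True)
class Rejected:
    code: str
    message: str


@dataclass(frozen=True)
class Deferred:
    operation_id: str


@dataclass(frozen=True)
class Cancelled:
    reason: str


@dataclass(frozen=True)
class Failed:
    error_class: str
    retryable: bool


InquiriumResult = Union[Completed, Degraded, Rejected, Deferred, Cancelled, Failed]


def next_action(result):
    """Pick a downstream action per variant (labels are illustrative)."""
    if isinstance(result, Completed):
        return "use-as-candidate-evidence"
    if isinstance(result, Degraded):
        return "use-fallback-marked-degraded"
    if isinstance(result, Rejected):
        return "surface-policy-denial"
    if isinstance(result, Deferred):
        return "poll-status"
    if isinstance(result, Cancelled):
        return "stop"
    if isinstance(result, Failed):
        return "retry" if result.retryable else "stop"
    raise TypeError(f"not an InquiriumResult: {type(result).__name__}")
```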

Typed Outcomes for Every Boundary

The same sum-type result pattern should apply to runtime selection, lease resolution, input validation, context shaping, inference execution, artifact write, and trace emission. Each boundary returns a closed set of variants, not a loose text error.

BoundaryResult =
  Accepted { value, metadata }
  | Degraded { value?, reason, diagnostics/ref? }
  | Rejected { code, reason, remediation? }
  | Deferred { operation/id }
  | Cancelled { reason }
  | Failed { class, retryable, diagnostics/ref? }

Rejection, retry, and degraded mode are part of the contract, not exceptions hidden in control flow.

Appendix: Lease Path and Artifact Store Hardening

Structural Lease Path Validation

Lease path validation should not be a regex over raw strings. A safer pipeline is: normalize to a canonical form; reject empty values, NUL, root paths, drive roots, and special segments; split into segments; then compare against an explicit list of allowed patterns. A wildcard, if allowed, should operate at segment level, not as an arbitrary glob.

validate_lease_path(candidate, allowed_roots):
  path = normalize_absolute_path(candidate)
  reject if path is empty, root, contains NUL
  segments = split_path_segments(path)
  reject if segments contain "." or ".."
  return any(root_pattern_matches_segments(root, segments)
             for root in allowed_roots)

This style resists bypasses through //, .., ./, mixed separators, and encoded characters. Lease path validation failure is a terminal rejection, not a reason to expand roots dynamically.
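
A runnable Python variant of the pseudocode above, as a sketch under stated assumptions: POSIX-style absolute paths only, backslashes and percent-encoded input rejected rather than decoded, special segments rejected rather than resolved, and each allowed pattern given as a list of segments where "*" matches exactly one segment and the pattern is a prefix of the path. The helper names are hypothetical.

```python
def split_segments(candidate):
    """Return path segments, or None if the candidate must be rejected."""
    if not candidate or "\x00" in candidate:
        return None                      # empty value or NUL
    if "\\" in candidate or "%" in candidate:
        return None                      # mixed separators or encoded characters
    if not candidate.startswith("/"):
        return None                      # relative paths and drive roots
    segments = candidate.split("/")[1:]
    if segments and segments[-1] == "":
        segments = segments[:-1]         # tolerate a single trailing slash
    if not segments:
        return None                      # the root path itself
    if any(s in ("", ".", "..") for s in segments):
        return None                      # "//", "./", and "../" are never resolved
    return segments


def pattern_matches(pattern, segments):
    """Prefix match at segment level; "*" matches exactly one segment."""
    if len(segments) < len(pattern):
        return False
    return all(p == "*" or p == s for p, s in zip(pattern, segments))


def validate_lease_path(candidate, allowed_patterns):
    """Return True only if the path lies under an explicitly allowed pattern."""
    segments = split_segments(candidate)
    if segments is None:
        return False                     # terminal rejection, roots never expand
    return any(pattern_matches(p, segments) for p in allowed_patterns)
```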

Validate Identifiers Before Building Paths

Every external identifier used to build a path should be validated before it is joined with a base directory. This applies to artifact/id, dataset/id, model-artifact/id, lease/id, operation/id, and auxiliary names from user input or runtime output.

Minimum segment rules:

  • non-empty;
  • no /;
  • no \;
  • no NUL;
  • not .;
  • not ..;
  • bounded length;
  • stable character set or explicit sanitizer.

This is boundary validation, not cleanup after path construction.
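
The segment rules can be enforced with one allowlist check before any join. In this Python sketch the character set, the 128-character bound, and the helper names is_safe_segment and artifact_path are assumptions chosen for illustration.

```python
import re

# Assumed stable character set: starts alphanumeric, then alphanumerics,
# dot, underscore, or hyphen; at most 128 characters in total.
_SEGMENT = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,127}")


def is_safe_segment(identifier):
    """Check an external identifier before it is used as a path segment."""
    if not isinstance(identifier, str):
        return False
    if identifier in (".", ".."):
        return False  # explicit, although the pattern already excludes them
    # fullmatch rejects empty strings, "/", "\\", NUL, and over-long values.
    return _SEGMENT.fullmatch(identifier) is not None


def artifact_path(base_dir, artifact_id):
    """Join base_dir and artifact_id only after the identifier is validated."""
    if not is_safe_segment(artifact_id):
        raise ValueError("invalid artifact id")
    return f"{base_dir.rstrip('/')}/{artifact_id}"
```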

Temporary Files as Contract, Not Trash

The artifact store temporary directory should have explicit rules: a minimum file age before cleanup, no deletion of files that belong to active operations, idempotent cleanup, and scope limited to the store-controlled directory. If cleanup races a write, the write path may recreate the directory and retry once, but it must not mask errors such as out-of-space.

This improves post-failure reasoning. After restart, the host can distinguish complete artifacts from temporary leftovers, and the operator can see what is safe to remove.

Implementation Tracking

This section is a lightweight, manually maintained tracker for implementation work derived from this proposal. Add rows as implementation tasks start or land. Rows marked done should point to concrete evidence such as code paths, schema fixtures, tests, operator surfaces, or migration notes.

Status values:

  • todo — not started,
  • in-progress — design or implementation has started,
  • done — implemented and covered by tests, schema validation, or documented operator evidence,
  • deferred — intentionally postponed.

| ID | Work item | Status | Done criteria / evidence |
| --- | --- | --- | --- |
| inq-runtime-catalog-v02 | Move the lower model-runtime catalog to adapter implementations, adapter instances, model bindings, runtime candidates, runtime profiles, and conformance fixtures. | done | node/model-runtime contract v0.2 validates cross references and rejects missing adapter/model/conformance references. |
| inq-http-adapter-instance-handles | Key HTTP lifecycle handles by adapter instance while invoking by selected runtime candidate. | done | node/model-runtime-http accepts RuntimeInvocationContext, supports one HTTP adapter instance serving multiple runtime candidates, and rejects caller override of host-owned model keys. |
| inq-command-stdio-invocation-context | Apply the same host-built runtime invocation context to command-stdio adapter instances. | done | node/daemon merges runtime defaults, model binding parameters, and caller body before stdin serialization; caller override of model fails closed in daemon lifecycle coverage. |
| inq-daemon-runtime-routing | Supervise adapter instances and route by runtime/ref in the daemon. | done | Daemon status separates healthy from routable, reports adapter/model binding refs, and counts only routable candidates. Focused daemon runtime tests pass sequentially. |
| inq-nse-use-runtime | Make NSE choose runtime candidates instead of runtime/model pairs. | done | nse and nse-rhai use UseRuntime { runtime_id, reason }; Rhai scripts return decision: "use-runtime". |
| inq-python-remote-provider-adapters | Add first middleware-hosted remote provider adapters while preserving adapter-instance/runtime-candidate stratification. | done | node/middleware-modules/inquirium-openai-adapter maps neutral Inquirium text generation to OpenAI Responses; node/middleware-modules/inquirium-anthropic-adapter maps the same contract to Anthropic Messages; both share node/middleware-modules/lib/inquirium_adapter, expose neutral and chat-compatible endpoints, read secrets from env/file config, and have fake-provider unit coverage. node/model-runtime-http also starts the OpenAI adapter as a managed process and verifies one adapter instance serving two runtime bindings. |
| inq-embedding-contracts | Add explicit direct and batch embedding request/response contracts. | done | node/model-runtime exposes inquirium.embed.{request,response}.v1 and inquirium.batch-embed.{request,response}.v1 DTOs with validation for schema, operation, dimensions, leases, vector shape, and finite vector values. |
| inq-direct-data-plane | Add durable direct data-plane leases, artifact output persistence, conformance report storage, and deferred long operations. | in-progress | Contract placeholders and routability hooks exist; daemon lease/artifact/conformance/deferred APIs remain to be implemented. |