Node Transport and Discovery MVP¶
Based on:
- doc/project/40-proposals/002-comm-protocol.md
- doc/project/30-stories/story-001.md
- doc/project/60-solutions/node.md
Status¶
Proposed (Draft)
Date¶
2026-03-23
Executive Summary¶
This proposal defines the smallest practical networking seed on which the first useful Orbiplex Node can be built.
The key decisions are:
- MVP networking addresses nodes, not user nyms.
- A Node needs a stable local node identity and a stable participant role identity before it can participate in the network without later role collapse.
- Discovery should map
node-id -> current signed endpoint advertisement, notnym -> IP:port. - The default transport should be WSS over TCP 443, with seed peers or a seed directory for bootstrap.
- Direct TCP and UDP/NAT traversal are later optimizations, not day-one dependencies.
- The first end-to-end protocol slice should be:
- identity load/generation,
- handshake,
- capability advertisement,
- keepalive,
- and one simple signed application message.
- The first useful signed application message should be
signal-marker. - The first seed directory should support both advertisement fetch and advertisement publication.
- MVP capability advertisement should expose only the narrow core, not attached roles or plugin-process surfaces as required semantics.
- MVP capability IDs should be schematic, with at least one required core
placeholder:
core/messaging
- In the MVP baseline, one Node has one operator-participant by default:
node-idnames the infrastructure role,participant-idnames the participation role,- both MAY share the same base
did:key, - but the protocol must still treat them as distinct roles.
Context and Problem Statement¶
Orbiplex already has richer protocol ideas:
- question envelopes,
- answer rooms,
- procurement,
- Whisper,
- archival handoff,
- and attached roles.
What is still missing for practical Node implementation is a minimal networking baseline that says:
- how a node identifies itself,
- how another node finds it,
- how a session is established,
- what the first shared message shape looks like,
- and what can be postponed without blocking the first working Node.
Without that seed, Node implementation stays trapped in architecture talk.
Goals¶
- Define the smallest credible Node networking baseline.
- Separate node identity from user or contextual nyms.
- Separate infrastructure role from participation role from day zero.
- Keep discovery simple enough for the first implementation.
- Make transport compatible with difficult real-world networks.
- Provide a practical implementation seed that can be mapped into the Node repo.
Non-Goals¶
- This proposal does not define full multi-user
podtenancy, root identity, or PoP. - This proposal does not define a global public nym registry.
- This proposal does not require DHT, libp2p, or generalized peer-to-peer routing.
- This proposal does not define final room semantics.
- This proposal does not define final federation governance or reputation policy.
Decision¶
Orbiplex should adopt the following networking MVP baseline:
- each Node has a locally generated long-lived keypair,
node-idis derived from the node public key as a strictnode:did:keyidentifier,participant-idis derived and persisted as a distinct role identifier even in the one-operator-per-node MVP,- transport baseline is
WSS/443, - bootstrap uses:
- statically configured seed peers as the mandatory first bootstrap layer,
- with a minimal seed directory only as an optional extension after that,
- discovery exchanges signed endpoint advertisements,
- peers establish a signed handshake and exchange capability advertisements,
- the first always-supported maintenance flow is
ping/pongand reconnect.
This means that full nym systems, advanced relay graphs, and rich room protocols can come later without blocking the first useful networked Node.
Proposed Model¶
1. Minimal node identity¶
Each Node should have:
- a local long-lived keypair,
- a stable
node-idderived from the public key, - a stable
participant-idrole identifier, - and a persisted local identity file.
For Ed25519 in v1, the canonical node-id format should be:
node:did:key:z<base58btc(0xed01 || raw_ed25519_public_key)>
The canonical participant-id format should be:
participant:did:key:z<base58btc(0xed01 || raw_ed25519_public_key)>
This means:
- the underlying public-key fingerprint uses the
did:keyEd25519 shape, z...is the base58btc multibase fingerprint,node:is the Orbiplex-specific technical identity prefix,- parsers should be strict,
- and alternative textual forms should not be accepted in v1.
That persisted identity record should expose only public identity material plus one local secret reference:
participant/idkey/storage-ref
For the MVP baseline, that reference should use:
local-file:identity/node-signing-key.v1.json
This keeps the identity record separate from the secret backend from day one and avoids a later migration away from inline private-key material.
This is enough for:
- peer addressing,
- role-aware participant attribution,
- signing handshake material,
- signing advertisements,
- and replay-resistant session establishment.
This MVP identity is not the same thing as:
- multi-user
pod-useridentity, - root identity,
- anchor identity,
custodian_ref,- or a public contextual nym.
Those are higher-layer concerns.
For the MVP baseline, node-id and participant-id MAY share the same base
did:key fingerprint and therefore the same signing key material. The protocol
must nevertheless treat them as distinct role identifiers. Shared key material
is an operational simplification for the one-operator-per-node baseline, not a
protocol guarantee that later runtimes may rely on.
1.1. Role split in the MVP baseline¶
The first networking baseline should already distinguish:
node-idfor infrastructure-facing actions,participant-idfor participation-facing actions.
In practice this means:
- endpoint advertisements, peer handshakes, keepalive, and other transport or
discovery artifacts are
node-id-scoped, - first application-level messages such as
signal-markerareparticipant-id-scoped, - reputation and sanctions may later attach to different roles even when the same operator currently uses both.
This split is intentional. The handshake answers "do I trust this infrastructure endpoint enough to establish a channel?", while application messages answer "do I trust this participant acting over that channel?". Those are different questions and should remain in different protocol layers.
This role split is normative even when both identifiers are backed by the same keypair in MVP.
For boundary discipline, the networking MVP should know exactly two protocol-level identity handles:
node-idfor infrastructure trust, addressing, advertisements, handshake, and keepalive,participant-idfor participant-scoped message authorship and participation reputation carried over the established channel.
It should not depend on higher identity layers such as:
anchor-identity,pod-user-id,- public or contextual
nym, - or federation-local continuity bindings.
Those remain application-layer concerns. If a later networking implementation starts importing those concepts as hard dependencies, the abstraction boundary is leaking rather than improving.
The same is true for nym handling:
- networking MUST NOT require
nymresolution, nymverification belongs to the application layer,- transport sees only
node:did:key:...andparticipant:did:key:..., - and
nymsignatures are verified above the established encrypted session.
1.2. Thin clients in MVP¶
MVP does not require a multi-user pod model.
If a Node exposes a remote UI or thin-client surface in the MVP baseline, that client should be treated as a delegated session or remote screen of the same operator-participant, not as a separate hosted participant with its own independent continuity layer.
1.3. Future participant bind over the established channel¶
Post-MVP multi-user or hosted-user participation should not widen
peer-handshake.v1 into a transport-plus-participant artifact.
Instead, a later layer may introduce a participant-scoped session bind carried
over the already established encrypted node↔node channel. That later artifact
may be named something like:
participant-session.v1, orparticipant-bind.v1
and should bind at least:
participant-id,via node-id,- a live session reference or channel context,
- and participant-side proof material.
This keeps transport authentication and participant authentication stratified:
peer-handshake.v1answers whether the infrastructure endpoint is trusted,- a later participant bind answers which participant is speaking through that endpoint.
In the MVP baseline this layer is implicit because participant-id and node-id
may share the same base did:key, and participant-scoped application messages can
therefore be verified directly without an extra bind artifact.
The first concrete schema seed for that later layer now lives in:
doc/schemas/participant-bind.v1.schema.json
The first concrete post-MVP consumer of that bind is expected to be the thin-client or hosted-user attachment artifact:
doc/schemas/client-instance-attachment.v1.schema.json
Its smallest lifecycle companion should be the matching detach artifact:
doc/schemas/client-instance-detachment.v1.schema.json
The next post-MVP lifecycle companion may recover client access after loss or migration through:
doc/schemas/client-instance-recovery.v1.schema.json
Rotation is also deferred as a richer operational layer. In v1:
- a new Ed25519 key means a new
node:did:keyand therefore a newnode-id, - overlap and succession proofs are an operational procedure rather than a live runtime feature,
- and persisted identities used by the MVP runtime should remain
active.
2. Discovery baseline¶
The first discovery target should be:
node-id -> current endpoint advertisement
not:
nym -> IP:port
The healthier MVP is:
- the Node knows one or more seed peers,
- or the Node knows one minimal seed directory endpoint,
- peers publish to and fetch from signed endpoint advertisements,
- advertisements expire through TTL and must be refreshed.
The implementation baseline should therefore begin with a local static seed list, not with a mandatory remote directory.
That seed list may carry, for local operator convenience:
- stable
node-id, - bootstrap address,
- and an optional human-friendly local label or self-chosen name.
Such labels are operational hints only. They are not part of node-id, are not
signed network identity, and may change without affecting protocol semantics.
For node-advertisement.v1, the signed surface should use a sign-then-attach
pattern:
- sign the whole semantic advertisement payload,
- exclude only the
signaturefield itself, - and exclude only transport-mutable per-hop metadata if a later wire wrapper carries such fields outside the semantic payload.
That payload should include at least:
advertisement/id,node/id,sequence/no,advertised-at,expires-at,- key material,
- endpoint set,
- and transport support claims.
The signing input for v1 should be:
node-advertisement.v1\x00 || deterministic_cbor(payload_without_signature)
This gives domain separation, deterministic verification, and malleability resistance. If a later wire framing carries raw advertisement bytes, receivers should verify those exact bytes rather than re-canonicalizing them.
The first byte-exact advertisement conformance vector now lives in:
doc/project/20-memos/networking-signing-conformance-vectors.md
For discovery state, v1 should treat advertisements as:
- one current active advertisement per
node-id, - with the highest seen
sequence/noreplacing older advertisements, - and stale or equal sequence numbers rejected as non-current.
That current-advertisement view is not just an optimization. For the MVP done state it should already exist as a local persistence cache:
WSSclient/server alone is not enough,- the Node should durably remember the freshest known advertisement per
node-id, - and outbound dialing or reconnect should reuse that cache rather than depend on live rediscovery every time.
This keeps one coherent transport view per node identity and avoids conflicting
capability or endpoint state for the same node-id.
To prepare the later rotation layer without turning it into a live MVP feature,
node-advertisement.v1 may also carry an optional succession object:
successor/node-id- optional proof slots for the current and successor identities
In the MVP baseline this remains a future-facing contract seed only:
- peers do not yet treat it as authoritative runtime continuity,
- it does not change advertisement freshness or replacement rules,
- and it exists mainly to stabilize field names before operational overlap and succession flows are implemented.
If a minimal seed directory is deployed, its first useful API should remain narrow:
PUT /adv/{node-id}:- publish or replace one signed advertisement if
sequence/nois newer, GET /adv/{node-id}:- fetch the current signed advertisement for one node,
GET /adv?since={cursor}:- fetch advertisements incrementally in small batches,
- no explicit delete endpoint:
- expiry is TTL-driven.
This keeps the directory as a signed cache rather than a trust gate. Read access should remain open, and write access should remain open to any correctly signed publisher, but rate-limited and freshness-checked.
Explicit full-dump directory listing should be avoided in the first design, because it turns the seed directory into a trivial topology-scraping endpoint.
Minimal request and response examples for that HTTP surface now live in:
doc/project/20-memos/seed-directory-http-examples.md
3. Transport baseline¶
The default transport should be:
WSSover TCP443
because it is:
- proxy-friendly,
- firewall-friendly,
- viable in mixed enterprise and consumer networks,
- already aligned with the broader communication proposal corpus.
Later transport layers may add:
- direct TCP,
- UDP hole punching,
- relay optimization,
- or more specialized transports.
But those should not block MVP.
For the MVP baseline, WSS should be treated as a carrier transport layer, not
as the primary identity protocol.
This means:
- TLS server authentication proves control of the advertised endpoint and gives baseline confidentiality and integrity for the carrier connection,
- Node-to-Node identity authentication still happens at the signed
peer-handshake.v1layer, - mutual TLS and client-certificate identity are not required for the MVP networking baseline,
- and reconnect should assume a fresh transport plus a fresh signed handshake, not protocol continuity derived from TLS session resumption.
For public wss:// endpoints, the client should validate the presented server
certificate against the hostname carried in the advertised endpoint URL using
normal WebPKI rules.
For controlled or private deployments, additional local trust roots may be configured out of band by the operator, but this trust configuration should stay deployment-local:
- it should not become protocol semantics,
- and
node-advertisement.v1should not carry TLS certificate blobs, pinsets, or custom trust material in the MVP baseline.
When multiple endpoints are present in one advertisement, the receiver should:
- filter out unsupported transports,
- prefer the sender-advertised endpoint priority among compatible endpoints,
- still allow local runtime constraints to override that hint.
In v1 this mostly reduces to ordered preference among wss endpoints. Wider
transport ranking can be added later when more than one transport family is
actually standardized.
4. Session baseline¶
The first session lifecycle should be:
- connect to seed or known peer over WSS,
- send signed
peer-handshake, - receive signed
peer-handshake-ack, - exchange
capability-advertisement, - maintain liveness with
ping/pong, - reconnect and refresh advertisement when needed.
For the MVP baseline, this should not mean maintaining a full mesh of permanent connections to all known peers. A healthier default is:
- keep a small active working set of long-lived sessions,
- target
2activehotpeer sessions as the normal default, - allow temporary probes or overlap up to
4live sessions, - keep other known peers in discovery state (
coldorwarm) until they are actually needed.
This gives redundancy without turning discovery into an always-connected flood of idle TCP/WSS sessions.
The MVP does not yet standardize a full peer-health or peer-governor policy,
but it should leave room for one. A later runtime layer may classify peers as
cold, warm, hot, or blocked based on observed session quality,
repeated failures, explicit local policy, and operator decisions.
For abuse control at this layer, rate limiting and backpressure should remain per-node rather than per-participant or per-nym. If one node emits too much traffic through many participant-scoped or nym-scoped application artifacts, peers should degrade the node-level relationship instead of forcing nym resolution into the transport boundary.
This baseline should use a two-message handshake:
hello -> ack
not:
hello -> challenge -> ack
The ack is sufficient as the response-binding step in v1, provided that it
cryptographically binds to the original initiation attempt.
For peer-handshake.v1, the signed surface should also use sign-then-attach:
- sign the semantic handshake payload,
- exclude only the
signaturefield itself, - and keep framing-only transport metadata outside the signed payload.
The signed payload should bind at least:
handshake/id,handshake/mode,sender/node-id,- optional
recipient/node-idwhen directed, ts,nonce,ack/of-handshake-idwhen present,- any offered capability claims,
- and any negotiated terms.
The signing input for v1 should be:
peer-handshake.v1\x00 || deterministic_cbor(payload_without_signature)
peer-handshake.v1 remains node-scoped in v1. It should authenticate the
serving infrastructure endpoint, not the later participant role speaking over
that encrypted channel.
For v1 session establishment, the long-lived Node identity remains the
node:did:key:... Ed25519 signing identity. The corresponding static X25519
key-agreement public key may be deterministically derived from that identity key
for the DH terms that bind long-lived identity to the session attempt.
This means MVP does not need any additional static X25519 field in
node-identity.v1 or node-advertisement.v1.
It also means peer-handshake.v1 does not need to carry participant-id.
Participant authentication belongs at the application-message layer over the
already established node-to-node session, not inside the transport-session
handshake itself.
The same boundary applies to higher identity layers:
peer-handshake.v1should not depend onanchor-identity,- it should not depend on
pod-user-id, - and it should not route or negotiate on
nym.
Those concepts may later appear in application payloads or participant-bound artifacts, but they are outside the MVP transport-session contract.
peer-handshake.v1 SHOULD instead carry:
- a fresh per-handshake ephemeral X25519 public key in
session/pub, - encoded as raw unpadded base64url for the 32-byte X25519 public key,
- without
did:key, multicodec prefixes, or identity wrappers.
This keeps the layers distinct:
- identity layer:
node:did:key:...-> Ed25519 signing identity, with static X25519 derivable for DH,- session layer:
session/pub-> fresh ephemeral X25519 public key for the current handshake.
For the Noise-IK-like v1 baseline, the intended session derivation uses:
DH1 = X25519(EK_initiator, IK_responder_derived_x25519)DH2 = X25519(IK_initiator_derived_x25519, EK_responder)DH3 = X25519(EK_initiator, EK_responder)
Forward secrecy in v1 relies primarily on fresh ephemeral keys and the DH3
term, not on the long-lived identity-derived DH terms alone.
This is accepted as a pragmatic MVP compromise. It does not provide ideal key separation between signing and key agreement, and may later evolve into a richer continuity or rotation model.
ack/of-handshake-id must be signed, because it cryptographically binds the
response to one concrete initiation attempt. Protocol version belongs in the
interpretation context and domain separator, not in the core semantic payload.
Per-hop transport profile should remain framing metadata unless it is being
asserted as a capability claim.
In v1 the handshake family should stay symmetric at schema level:
- the same
peer-handshake.v1family covers bothhelloandack, ack/of-handshake-idis the binding field that turns the second message into the acknowledgment of the first,- and the existing
handshake/modefield may remain as an explicit discriminator for clarity and easier interoperability, even though the semantic asymmetry is already visible throughack/of-handshake-id.
The first positive ack fixture and byte-exact handshake signing vector now live
close to the schema layer:
doc/schemas/examples/bootstrap.ack.peer-handshake.jsondoc/project/20-memos/networking-signing-conformance-vectors.md
Replay resistance for the baseline handshake should be defined by:
- a clock-skew window of
+-30s, - a per-peer nonce cache with retention of roughly
120s, - and a local pending-handshake timeout of
30sfor outstandinghelloattempts.
This baseline is aligned with a Noise-IK-style two-message authenticated exchange, but the current MVP still treats transport framing and later session crypto details as separable from the semantic handshake contract.
Within that boundary, ping/pong is more than a transport convenience. In the
MVP it remains only a liveness primitive, but it is already the seed of a later
peer governor: the same signals can later feed health tracking, degradation,
promotion, reconnect policy, and eventual cold/warm/hot/blocked transitions.
Even though full peer-governor semantics remain deferred, the MVP should already expose the smallest peer-state transition traces:
peer/discoveredpeer/connectedpeer/disconnected
Without those, an operator cannot tell whether the node is isolated, flapping, or gradually building a live peer set.
The capability advertisement should not try to describe the whole implementation. It should publish a small set of schematic core capability identifiers, such as:
- required minimum:
core/messaging- optional additions:
core/discoverycore/relaycore/keepalive
Capabilities that are already implicit in the successful execution of the handshake, such as "can sign the handshake" or "is a protocol participant", should not be modeled as mandatory advertised core capabilities in v1.
5. First useful message slice¶
After handshake and capability exchange, the first application-level slice should be very small.
The first useful signed message should be:
- one simple signed
signal-markermessage scoped toparticipant-id.
The point is that Node can:
- sign it,
- validate it,
- send it,
- receive it,
- and trace it end to end.
Higher-layer publication such as reduced question-envelope may follow later, once
the seed transport and session slice is already proven end to end.
At this stage the proposal freezes the semantic signal-marker decision, not
and now also freezes the canonical participant-scoped publication envelope for
that marker. signal-marker-envelope.v1 is the minimal signed wire artifact,
while the separately visible signal-marker.v1 marker remains the semantic or
UI-facing artifact it may reference.
Operational and Implementation Boundaries¶
Required done state¶
For the networking MVP, required done state is broader than "a WSS client and server can talk".
It already includes:
- persisted local identity with import/export validation,
- advertisement persistence cache with monotonic replacement,
- signed advertisement publish/fetch,
- signed handshake plus encrypted session establishment,
- capability exchange,
- keepalive and reconnect,
- and the first participant-scoped
signal-markersend/receive slice.
Minimal trace surface¶
The MVP should freeze one small but explicit network trace vocabulary.
Required event families:
identity/loadedidentity/generatedadvertisement/publishedadvertisement/fetchedadvertisement/rejectedpeer/discoveredpeer/connectedpeer/disconnectedhandshake/startedhandshake/acceptedhandshake/rejectedsession/establishedcapability/exchangedkeepalive/lostkeepalive/restoredsignal-marker/sentsignal-marker/receivederror/occurred
Each event should share at least one bounded skeleton:
{
"trace/id": "tr:01JV...",
"event/type": "handshake/rejected",
"ts": "2026-03-28T14:12:00Z",
"node/self": "node:did:key:z6MkA...",
"peer": "node:did:key:z6MkB...",
"detail": {
"code": "E_HS_REPLAY"
}
}
node/self is part of the contract because operators may aggregate traces from
many nodes.
Frozen MVP boundary error classes¶
The networking MVP should freeze eleven stable machine-readable boundary codes:
E_TRANSPORT_UNAVAILABLEE_TRANSPORT_REJECTEDE_ADV_EXPIREDE_ADV_STALE_SEQE_SIG_INVALIDE_SIG_UNKNOWN_SIGNERE_HS_REPLAYE_HS_UNKNOWN_REFE_PROTO_VERSIONE_PROTO_INVALIDE_PROTO_CAP_MISSING
Those codes group into the following classes:
transport/unavailabletransport/rejectedadvertisement/expiredadvertisement/stale-sequencesignature/invalidsignature/unknown-signerhandshake/replay-suspectedhandshake/unknown-refprotocol/version-mismatchprotocol/invalid-contractprotocol/capability-missing
The class and code are for machines; context and peer details are for operators. Those concerns should stay separate.
Schema-gate boundary¶
schema-gate should validate not only JSON ingress/egress received from the
network, but also identity-file import/export as a required runtime boundary.
That means:
node-identity.v1import and export already belong in the required MVP gate,- network ingress/egress still covers
node-advertisement.v1,peer-handshake.v1, andcapability-advertisement.v1, - and post-networking participant binding may widen that gate later without collapsing transport and application layers.
MVP Seed Checklist¶
- Define local Node identity file format.
- Freeze
node-idderivation rule as strictnode:did:key. - Define signed endpoint advertisement shape.
- Define peer handshake shape.
- Define capability advertisement shape.
- Define keepalive and reconnect behavior.
- Define seed peer / seed directory configuration.
- Implement load-or-generate local identity.
- Implement WSS bootstrap connection to a seed peer.
- Implement signed handshake and handshake acknowledgment.
- Implement capability advertisement exchange.
- Implement
ping/pong. - Implement
signal-markerend to end as the first signed application message. - Add required network traces and diagnostics for:
- identity load/generation,
- advertisement publish/fetch/reject,
- peer discovered/connect/disconnect,
- handshake start/accept/reject,
- session established,
- capability exchange,
- keepalive lost/restored,
- signal-marker send/receive,
- and
error/occurred.
- Freeze the eleven MVP boundary error classes and their stable wire-visible codes.
- Treat advertisement persistence cache and identity import/export schema validation as part of required done state, not later polish.
Document Seed Needed Next¶
The next minimal schema set should likely be:
node-identity.v1node-advertisement.v1peer-handshake.v1capability-advertisement.v1
Optionally later:
peer-status.v1
This later layer may formalize peer-health tracking and governor-oriented
transitions such as cold -> warm -> hot -> blocked, but those semantics are
intentionally deferred beyond the MVP baseline.
Trade-offs¶
- WSS/443 vs custom raw transport:
- Benefit: works sooner in real networks.
- Cost: less transport purity and some framing overhead.
- Seed peers/directory vs global decentralized discovery:
- Benefit: much lower bootstrap cost.
- Cost: less ideological purity in the first version.
- Node identity first vs full nym model first:
- Benefit: practical implementability.
- Cost: higher-layer identity remains deferred.
- Minimal handshake first vs complete protocol family first:
- Benefit: one working slice sooner.
- Cost: more protocol surface still pending.
Remaining Open Questions¶
- Should MVP implementation stop at static seed peers first, or also include a minimal read-write seed directory?
Next Actions¶
- Add one Node requirement for networking baseline.
- Add the first schema quartet:
node-identity.v1node-advertisement.v1peer-handshake.v1capability-advertisement.v1- Extend
doc/project/60-solutions/node.mdwith the networking baseline as a first-class Node capability. - Extend the Node implementation ledger with the same baseline capability.
- Add the frozen trace-event and boundary-error contract to the first implementation slice.