Changes
All discovered entries under openspec/changes/. Sorted by status, then by most recently modified.
ship-adaptive-collection-rate-controller proved the SLVP-ideal adaptive rate controller LIVE on ChatGPT (19 → 32.7 conv/min): slow-start discovery, AIMD accelerate-under-success, a single owner-authored rate ceiling it never probes across, warm-start that compounds the learned rate across runs, and an operator-legible collection_rate readout.
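The control loop described above can be sketched roughly as follows. This is a minimal illustration, not the shipped controller: the class name, option names, the doubling rule for slow start, and the numeric values in the usage note are all assumptions made for the example.

```typescript
// Hypothetical sketch: slow start, then AIMD, under a fixed owner-set ceiling.
interface RateOptions {
  initialRate: number;    // starting rate when no warm-start value exists
  floor: number;          // lowest rate the controller will fall to
  ceiling: number;        // owner-authored maximum, never exceeded
  additiveStep: number;   // added per success once slow start has ended
  decreaseFactor: number; // multiplier applied on failure, e.g. 0.5
}

class AdaptiveRateController {
  private rate: number;
  private inSlowStart = true;
  private readonly opts: RateOptions;

  // warmStartRate is the rate learned by a previous run, if any.
  constructor(opts: RateOptions, warmStartRate?: number) {
    this.opts = opts;
    const start = warmStartRate ?? opts.initialRate;
    this.rate = Math.min(Math.max(start, opts.floor), opts.ceiling);
  }

  get currentRate(): number {
    return this.rate;
  }

  // Success: double during slow start, otherwise add a fixed step.
  onSuccess(): void {
    const next = this.inSlowStart
      ? this.rate * 2
      : this.rate + this.opts.additiveStep;
    this.rate = Math.min(next, this.opts.ceiling);
  }

  // Failure: leave slow start for good and back off multiplicatively.
  onFailure(): void {
    this.inSlowStart = false;
    this.rate = Math.max(this.rate * this.opts.decreaseFactor, this.opts.floor);
  }
}
```

Under these assumptions, a controller started at 4 with a ceiling of 30 climbs 8, 16, then clamps at 30; a failure halves it to 15, and later successes add the fixed step. Passing the last learned rate as `warmStartRate` is one way the next run could resume where the previous one ended.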
A trusted owner agent (Daisy/Simon-style) can already initiate a local-collector connection through POST /v1/owner/connections/intents, but a browser-bound connector such as a second Amazon account returns unsupported. The reason is honest: the reference has no enrollment primitive that lets a local collector drive a real browser session and ingest through the device-exporter path. The enroll route hardcodes sourcekind: "localdevice" and does no binding-aware validation, so there is no way to record that a collected binding is browser-collected rather than filesystem-read.
Railway has a proven one-click Core deploy and Fly.io has a proven one-command launch, but the Docker path still demands a repository clone, a large env file, and the development/owner compose stack — the owner's words: "right now I am overwhelmed looking at the link for docker." The standalone Core image already contains everything needed; what is missing is two small image gaps (no localhost origin default, no first-boot owner credential) and a deliberately small user-facing surface (one docker run quickstart, one minimal production compose) per design-notes/deploy-surface-parity-2026-06-10.md and docs/research/deploy-button-parity-prior-art-2026-06-10.md.
Owners expect "Google Maps" in Add source to mean a live Google-account authorization flow when Google exposes one. The current Google Maps work only imports owner-provided Timeline files, and presenting that as a Gmail-like connection would be dishonest.
SUPERSEDED (2026-06-10) by converge-provider-rate-governance. This change's rate-governance axes (per-provider pacing, ratio-based retry budget, circuit breaker, run-budget envelope, detail-gap drain loop) landed and are absorbed by the convergence change, which corrects the layer-ownership model: a provider request path has exactly ONE pre-flight send governor (the AIMD concurrency lane), and GCRA pacing is a signal folded into it, not a second independent pre-flight gate. To avoid two active changes both adding the same polyfill-runtime requirements, only converge-provider-rate-governance carries the rate-governance deltas to archive; this change is parked. Its still-independent work, commit-gated/opaque-cursor checkpoint durability (§2.5) and catch-up vs. steady-state bookmark separation (§2.6), is NOT absorbed and should move to a dedicated cursor-durability change if pursued. See converge-provider-rate-governance/design.md ("Disposition of add-provider-budget-run-control").
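The "one governor, pacing as a signal" model can be illustrated with a small sketch. Everything here is hypothetical: `GcraPacer`, `SendGovernor`, and their methods are names invented for the example, and the concurrency limit is held fixed where the described design would adapt it with AIMD.

```typescript
// GCRA pacer that only reports a delay hint; it never blocks on its own.
class GcraPacer {
  private tatMs = 0; // theoretical arrival time of the next conforming send
  private readonly intervalMs: number;
  private readonly toleranceMs: number;

  constructor(intervalMs: number, toleranceMs: number) {
    this.intervalMs = intervalMs;
    this.toleranceMs = toleranceMs;
  }

  // Milliseconds until a send at nowMs would conform (0 means "now is fine").
  delayHintMs(nowMs: number): number {
    return Math.max(0, this.tatMs - this.toleranceMs - nowMs);
  }

  // Advance the theoretical arrival time after a send actually happens.
  recordSend(sentAtMs: number): void {
    this.tatMs = Math.max(this.tatMs, sentAtMs) + this.intervalMs;
  }
}

interface AcquireResult {
  ok: boolean;
  reason?: "concurrency" | "pacing";
  retryInMs?: number;
}

// The single pre-flight gate: concurrency check plus the folded-in pacing hint.
class SendGovernor {
  private inFlight = 0;
  private readonly limit: number;
  private readonly pacer: GcraPacer;

  constructor(limit: number, pacer: GcraPacer) {
    this.limit = limit;
    this.pacer = pacer;
  }

  tryAcquire(nowMs: number): AcquireResult {
    if (this.inFlight >= this.limit) return { ok: false, reason: "concurrency" };
    const hint = this.pacer.delayHintMs(nowMs);
    if (hint > 0) return { ok: false, reason: "pacing", retryInMs: hint };
    this.inFlight += 1;
    this.pacer.recordSend(nowMs);
    return { ok: true };
  }

  release(): void {
    this.inFlight = Math.max(0, this.inFlight - 1);
  }
}
```

The point of the shape is that a caller waits in exactly one place: the pacer contributes a number to the governor's decision instead of imposing a second, independent wait.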
The chase/statements and usaa/statements retained histories churn on every re-download even when the owner-visible statement is unchanged. The statement PDFs are content-addressed by pdf_sha256 = sha256(raw bytes), but the raw bytes are not the content: Chase statement PDFs are RC4-encrypted and the source regenerates the per-download encryption key material and embedded generation timestamps on every fetch, so pdf_sha256 (and the pdfpath/documenturl that embed it) moves with zero change to the decrypted text or page count. Read-only evidence (tmp/workstreams/ri-version-rationality-evidence-v1-report.md) proved the decrypted text sha and page count are invariant across this churn for every comparable Chase blob pair, and that USAA's own PDF-derived transactions are byte-identical content across the same pdf_sha256 churn.
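The difference between hashing raw bytes and hashing stable content can be shown with a short sketch. It assumes the decrypted text and page count are already available (PDF decryption and text extraction are out of scope here), and `StatementBlob`, `rawSha256`, and `contentFingerprint` are illustrative names, not the change's actual design.

```typescript
import { createHash } from "node:crypto";

// Hypothetical view of one downloaded statement.
interface StatementBlob {
  rawBytes: Uint8Array;  // bytes as downloaded (encrypted, timestamped)
  decryptedText: string; // text extracted after decryption (assumed given)
  pageCount: number;
}

// Raw-byte address: changes whenever the source re-encrypts the file.
function rawSha256(blob: StatementBlob): string {
  return createHash("sha256").update(blob.rawBytes).digest("hex");
}

// Content fingerprint over fields that stay invariant across re-downloads.
function contentFingerprint(blob: StatementBlob): string {
  return createHash("sha256")
    .update(`${blob.pageCount}\n`, "utf8")
    .update(blob.decryptedText, "utf8")
    .digest("hex");
}
```

Two downloads of the same statement would then differ in `rawSha256` but agree on `contentFingerprint`, so a version history keyed on the fingerprint would not record a new version for them.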
The release train ran two channels: publishable work landed on main, but semantic-release published only when an owner advanced the beta branch and pushed it, cutting 0.1.0-beta.N prereleases to npm's beta dist-tag. The beta branch was a second moving part with no countervailing benefit:
Connector setup is still fragmented across local collector enrollment, browser collector proof gates, static-secret draft/capture routes, console catalog copy, and owner-agent intent responses. A self-hosted operator, including a Railway operator, should not need connector-specific per-connection environment variables or runbook archaeology to add supported connections.
The current hosted MCP surface exposes every read and event-subscription tool in one flat tools/list response. The first footprint tranche removes duplicated prose, but it intentionally preserves the 14-tool topology. That is still too broad for the normal read/query setup path.
Two confirmed vulnerabilities (program audit wave 2, S-1 + S-2) let an internet-facing reference deployment expose its owner control plane to anyone:
The Postgres semantic-search path stores embeddings in semanticsearchblob.embedding as JSONB (384-dim float arrays, roughly 4.8 KB/row versus roughly 1.5 KB as a pgvector vector) and answers queries by SELECTing candidate rows and brute-force cosine-scoring them in JavaScript (postgres-search.js postgresSemanticSearch). The live deployment already runs the pgvector/pgvector:pg16 image, so the vector extension is available but unused. At the live table size (~1.85M rows / ~10 GB) the JSONB representation wastes roughly 3× the storage and the brute-force read path ships every candidate embedding over the wire to score it in JS — worse, the candidate SELECT carries a bare LIMIT with no ordering, so on scopes larger than the per-connector overscan the JS pass scores an arbitrary candidate subset rather than the true nearest neighbors.
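Two of the claims above are easy to check in a few lines: the storage estimate, and why an unordered LIMIT before scoring can miss the true nearest neighbor. The sketch below is illustrative only; the function names are invented for the example, and the `4 * dimensions + 8` figure is pgvector's documented per-vector storage size.

```typescript
interface Candidate {
  id: string;
  embedding: number[];
}

// pgvector documents 4 bytes per dimension plus 8 bytes of overhead.
function pgvectorBytes(dimensions: number): number {
  return 4 * dimensions + 8;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every candidate, then keep the k best: the true nearest neighbors.
function topK(query: number[], candidates: Candidate[], k: number): string[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((r) => r.id);
}

// Mimics "SELECT ... LIMIT n" with no ORDER BY: truncate first, score second.
function topKAfterUnorderedLimit(
  query: number[],
  candidates: Candidate[],
  limit: number,
  k: number,
): string[] {
  return topK(query, candidates.slice(0, limit), k);
}
```

For 384 dimensions `pgvectorBytes` gives 1544 bytes, in line with the "roughly 1.5 KB" figure. With a query of `[1, 0]` and candidates `a = [0, 1]`, `b = [0.5, 0.5]`, `c = [1, 0]` in that order, the full scan returns `c`, while a limit of 2 applied before scoring returns `b`, because `c` was never scored.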
The PDPP grant model exposes a single_use access mode that is central to the protocol's safety story — it bounds the blast radius of a grant to one retrieval session and prevents silent reuse. The reference implementation has enforced this atomically since the initial grant implementation, but the enforcement had no HTTP-boundary proof:
@pdpp/mcp-server is the canonical MCP adapter for grant-scoped PDPP reads. It is advertised as an npx -y @pdpp/mcp-server command in:
The June-6 image-slimming change made the default reference image browser-free (the browsers Dockerfile stage was retained but its output was never wired into CI publication). Deployments that run browser-backed connectors (ChatGPT, USAA, ...) inside the reference container hit a silent hard failure at Patchright launch: "Executable doesn't exist at /opt/patchright-browsers/...". The fix took four days to diagnose because the image advertised no build-time signal that browsers were absent.
The hosted MCP surface currently repeats the same cross-cutting guidance across many tool descriptions, producing a ~49.6 KB tools/list payload for 14 tools. That cost is paid by every MCP client session and sits in an unpredictable zone for chat-hosted clients whose exact tool-description and tool-result budgets are host-defined.
@pdpp/remote-surface is the extracted streaming and control substrate (geometry, pointer mapping, mobile IME, clipboard policy, n.eko/CDP adapters, diagnostics, leases, testing fixtures). It is reusable infrastructure for any remote-browser surface, and PDPP is one consumer among many we expect.
The ChatGPT connector's adaptive collection rate controller is dressed in AIMD machinery that a hand-tuned floor disables. converge-provider-rate-governance correctly collapsed the two pre-flight waits into one (the AdaptiveLane is the sole send governor; GCRA pacing rides as a launchDelayHint), but three incidental constants and one cap-era policy still defeat the loop:
Google Maps Timeline, WhatsApp exports, Apple Health exports, Takeout archives, device media folders, and similar sources expose a stale gap in the Collection Profile: collection can arrive through multiple acquisition methods, often partially and out of order, while still populating the same logical streams.
Grouped aggregate responses (groupby and groupby_time) return the top-N groups ordered by count. Without a rollup field, callers cannot tell whether the top-N is a complete picture of the data or a truncated subset. A model or agent that trusts a limited facet list without knowing the tail size can draw incorrect conclusions (e.g. "these are all the senders" when 40% of records fell into groups beyond the limit). A second aggregation call to compute the tail is extra latency and an unnecessary round trip.
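A rollup of this kind can be sketched as below. The response shape is an assumption made for illustration: `other.groupCount` and `other.recordCount` are hypothetical field names, not the reference's actual rollup field.

```typescript
interface GroupCount {
  key: string;
  count: number;
}

interface GroupedResult {
  groups: GroupCount[]; // top-N groups, ordered by count descending
  other: {
    groupCount: number;  // number of groups beyond the limit
    recordCount: number; // number of records that fell into those groups
  };
}

function groupWithRollup(values: string[], limit: number): GroupedResult {
  const counts = new Map<string, number>();
  for (const v of values) {
    counts.set(v, (counts.get(v) ?? 0) + 1);
  }
  // Sort by count descending; break ties by key so output is deterministic.
  const sorted = Array.from(counts, ([key, count]) => ({ key, count })).sort(
    (a, b) => b.count - a.count || a.key.localeCompare(b.key),
  );
  const tail = sorted.slice(limit);
  return {
    groups: sorted.slice(0, limit),
    other: {
      groupCount: tail.length,
      recordCount: tail.reduce((sum, g) => sum + g.count, 0),
    },
  };
}
```

With the rollup in the same response, a caller can see at a glance whether the list is complete (`other.groupCount === 0`) or how much of the data sits in the tail, without issuing a second aggregation.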
The connection lifecycle primitives revokeconnection (stop future collection, preserve records) and deleteconnection (erase exactly one connection's source-of-truth records/state) are shipped and audited — but only over the owner-agent bearer REST control plane (POST /v1/owner/connections/:id/revoke, DELETE /v1/owner/connections/:id). The operator console exposes neither. The records-list no-data copy directs the operator to "ask your owner agent to revoke it … or delete it."
Google Maps Timeline is a high-value owner data source, but the existing Google Takeout connector exposes location history only as one stream inside a broad archive connector. Current Google guidance also makes Timeline export a device/app file flow, so a browser scraper would be the wrong first implementation.
Headless or sandboxed MCP clients can fail the normal loopback OAuth callback flow by opening a browser the user cannot operate and then waiting indefinitely. Prior-art review shows that the SLVP path for browserless setup is an explicit, bounded device-authorization flow, but PDPP's existing RFC 8628 endpoint currently issues owner-agent credentials and /mcp must reject owner bearers.
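What "explicit and bounded" means on the client side can be sketched as a pure polling decision. The error codes and the 5-second increase on `slow_down` come from RFC 8628; the type names, the `deadline_exceeded` reason, and the way elapsed time is tracked are assumptions for this example, and the sketch does not address which credential type the endpoint issues.

```typescript
interface PollState {
  intervalSec: number;  // current polling interval
  elapsedSec: number;   // time spent waiting so far
  expiresInSec: number; // lifetime of the device code
}

interface TokenResponse {
  accessToken?: string;
  error?: string; // e.g. "authorization_pending", "slow_down", "access_denied"
}

interface PollDecision {
  action: "done" | "wait" | "fail";
  state: PollState;
  waitSec?: number;
  reason?: string;
}

// Decide what the client does after one token-endpoint response.
function decide(state: PollState, res: TokenResponse): PollDecision {
  if (res.accessToken) return { action: "done", state };

  let intervalSec = state.intervalSec;
  if (res.error === "slow_down") {
    intervalSec += 5; // RFC 8628: increase the interval by 5 seconds
  } else if (res.error !== "authorization_pending") {
    // access_denied, expired_token, or anything unexpected ends the flow.
    return { action: "fail", state, reason: res.error ?? "invalid_response" };
  }

  // Bounded: never schedule a wait that would outlive the device code.
  const elapsedSec = state.elapsedSec + intervalSec;
  if (elapsedSec >= state.expiresInSec) {
    return { action: "fail", state, reason: "deadline_exceeded" };
  }
  return {
    action: "wait",
    waitSec: intervalSec,
    state: { ...state, intervalSec, elapsedSec },
  };
}
```

Because every path either finishes, fails with a reason, or waits a known number of seconds, a headless client built on this rule cannot hang indefinitely the way the loopback callback flow can.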
The 2026-06-09 incident: four connections were migrated env→store and verified green via MANUAL runs, but the scheduler's launch path (runtime/scheduler.ts::launchRun → runConnector) never consulted the encrypted per-connection credential store — only controller.runNow resolved staticSecretEnv. When the reference container was recreated without the old secret exports, compose ${VAR:-} mappings left the credential env vars as EMPTY STRINGS, and every scheduled static-secret run raised credentialsrequired ("github needs: GITHUBP..."), auto-cancelled, and reported connectorreportedfailed while valid store rows sat unread.
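The failure mode hinges on two details: an env var that is set but empty must count as absent, and every launch path has to consult the store. A minimal sketch of a shared resolver follows; the function name, the store-before-env precedence, and the variable names in the usage note are assumptions for illustration, not the reference's actual API.

```typescript
interface ResolvedSecret {
  value: string;
  source: "store" | "env";
}

interface Resolution {
  resolved: Record<string, ResolvedSecret>;
  missing: string[];
}

// An empty or whitespace-only string is treated the same as "not set".
function isPresent(v: string | undefined): v is string {
  return v !== undefined && v.trim() !== "";
}

// Assumed precedence: the per-connection store first, then the environment.
function resolveStaticSecrets(
  required: string[],
  store: Record<string, string | undefined>,
  env: Record<string, string | undefined>,
): Resolution {
  const resolved: Record<string, ResolvedSecret> = {};
  const missing: string[] = [];
  for (const name of required) {
    const fromStore = store[name];
    const fromEnv = env[name];
    if (isPresent(fromStore)) {
      resolved[name] = { value: fromStore, source: "store" };
    } else if (isPresent(fromEnv)) {
      resolved[name] = { value: fromEnv, source: "env" };
    } else {
      missing.push(name);
    }
  }
  return { resolved, missing };
}
```

With a store holding `EXAMPLE_TOKEN` and an environment where compose has mapped `EXAMPLE_TOKEN` to an empty string, the resolver returns the store value and reports nothing missing. If both the manual and scheduled launch paths called one resolver like this, they could not diverge on where credentials come from.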
Google Maps now has both a file/import path and an API-backed Data Portability path, and agent hosts still vary in how much MCP tool output they expose. The reference needs a durable rule for reusing stream definitions across acquisition paths without erasing source identity, and MCP search needs to put a usable fetch handle where clipped previews still show it.
A live ChatGPT retest of the 5-tool MCP surface (2026-06-09) showed the search→fetch journey failing on multi-source hosted packages. Search hits carried id = stream:recordid plus a SEPARATE connection_id field, so a model had to carry TWO values between tools. fetch(id) without connection_id returned a typed 409 ambiguousconnection; ChatGPT's rendered envelope buried the second field and its model never completed a fetch (retrying with both fields was verified to work). OpenAI's search/fetch contract treats result ids as single opaque handles; ours leaked a join requirement into the model loop.
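One way to fold both values into a single handle is sketched below. The encoding is an assumption chosen for brevity (percent-encoded parts joined with colons, plus a version tag); `RecordRef`, `encodeHandle`, and `decodeHandle` are hypothetical names, and the reference's real handle format may differ.

```typescript
interface RecordRef {
  connectionId: string;
  stream: string;
  recordId: string;
}

const HANDLE_VERSION = "h1";

// Pack everything fetch needs into one string the model carries verbatim.
function encodeHandle(ref: RecordRef): string {
  return [HANDLE_VERSION, ref.connectionId, ref.stream, ref.recordId]
    .map((part) => encodeURIComponent(part)) // ":" inside a part becomes %3A
    .join(":");
}

// Returns null for anything that is not a well-formed handle.
function decodeHandle(handle: string): RecordRef | null {
  const parts = handle.split(":");
  if (parts.length !== 4) return null;
  try {
    const [version, connectionId, stream, recordId] = parts.map((p) =>
      decodeURIComponent(p),
    );
    if (version !== HANDLE_VERSION || !connectionId || !stream || !recordId) {
      return null;
    }
    return { connectionId, stream, recordId };
  } catch {
    return null; // malformed percent-encoding
  }
}
```

A search hit would then expose only `id: encodeHandle(ref)`, and fetch would call `decodeHandle(id)` to recover the connection, stream, and record without asking the model for a second field.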
Owner-agent REST callers can discover connection_id values from /v1/streams, but the polyfill owner read path still requires connector_id on stream record reads. This makes the public REST contract weaker than MCP and weaker than the advertised connection_id query shape.
The consent surface advertises a three-class trust model — protocol-enforced facts, manifest-authored descriptions, and client-authored claims — and the operator-ui consent card already renders each class distinctly (every element carries a data-authorship provenance hook). But the reference Authorization Server's hosted consent renderer (reference-implementation/server/routes/as-consent-ui-helpers.ts) did not honor that boundary:
The vanished-run diagnosis (tmp/workstreams/vanished-run-diagnosis-2026-06-10.md, run run1781118340000) showed that a run which started, persisted both lifecycle events, and failed terminally 452 ms after launch still looked vanished to its observer. The contract break: the runid returned by the run-now 202 ack stops being resolvable the moment the run settles.
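The contract being asked for is that a run id handed out in the ack stays resolvable after the run reaches a terminal state. A minimal in-memory sketch of that property follows; `RunRegistry` and its methods are hypothetical, and a real implementation would presumably back this with persisted run history instead of a map.

```typescript
type RunStatus = "running" | "succeeded" | "failed" | "cancelled";

interface RunRecord {
  runId: string;
  status: RunStatus;
  startedAtMs: number;
  settledAtMs?: number;
}

class RunRegistry {
  private readonly runs = new Map<string, RunRecord>();

  // Called when the run is launched and its id is returned to the caller.
  start(runId: string, nowMs: number): RunRecord {
    const record: RunRecord = { runId, status: "running", startedAtMs: nowMs };
    this.runs.set(runId, record);
    return record;
  }

  // Settling updates the record in place; it is never removed here.
  settle(runId: string, status: Exclude<RunStatus, "running">, nowMs: number): void {
    const record = this.runs.get(runId);
    if (!record) return;
    record.status = status;
    record.settledAtMs = nowMs;
  }

  // Lookup works the same for active and settled runs.
  get(runId: string): RunRecord | undefined {
    return this.runs.get(runId);
  }
}
```

Under this shape, an observer polling a run that failed 452 ms after launch would see `status: "failed"` instead of a missing run.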