Thinklio: Decision Log

Status: Living Document Audience: Internal -- development, product, architecture

Purpose

This document records significant architectural and product decisions, their context, the options considered, and the reasoning behind the choice made. It serves as institutional memory -- when revisiting a decision, this log explains why it was made.

Each decision is numbered sequentially and is immutable once recorded. Superseded decisions are marked as such but never deleted.

ADR-001: Platform Name

Date: 2026-03-13 Status: Accepted

Context: The platform needed a name that reflects its nature as an AI thinking/reasoning assistant while being distinctive, memorable, and available as a domain.

Decision: The platform is named Thinklio.

Reasoning: "Think" directly connects to the core function (agents that think on your behalf) and echoes the harness pattern (think, act, observe). The "-lio" suffix is modern and distinctive. The name works for both marketing and technical discussion, and it avoids boxing the product into a narrow use case. The domains thinklio.com, thinklio.ai, and thinklio.io are to be secured.

ADR-002: Agent Ownership Model -- Agents as First-Class Platform Entities

Date: 2026-03-13 Status: Accepted

Context: Three models were considered for how agents relate to the platform's organisational hierarchy:

Model A: Agents belong to organisations
Model B: Agents belong to teams
Model C: Agents are first-class platform entities, assigned to contexts

Decision: Model C -- agents are independent platform entities that can be created by any entity (user, team, org) and assigned to contexts via assignments.

Reasoning: Model C provides the most flexibility for the platform's long-term trajectory. A solo user can create a personal agent without needing an organisation. An organisation can create agents and assign them to teams. An agent can serve multiple contexts with isolated knowledge per context. This supports the widest range of deployment patterns without architectural changes. The complexity of the assignment model is manageable at the schema level.

Implications:

AgentAssignment table required (agent to context mapping with scope)
Knowledge scoping is per-assignment, not per-agent
Cost attribution flows through the assignment context
Templates enable standardised agent creation
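
The per-assignment scoping listed above can be pictured with a small sketch. The type and field names here (`AgentAssignmentSketch`, `ScopedKnowledgeItem`, `knowledgeForAssignment`) are hypothetical and chosen for illustration only; they are not the recorded schema.

```typescript
// Hypothetical shapes for illustration; field names are assumptions, not the Thinklio schema.
type AssignmentContextType = "user" | "team" | "org";

interface AgentAssignmentSketch {
  assignmentId: string;
  agentId: string;
  contextType: AssignmentContextType;
  contextId: string;
}

interface ScopedKnowledgeItem {
  assignmentId: string; // knowledge is keyed by assignment, not by agent
  text: string;
}

// Return only the knowledge visible to one assignment of an agent.
function knowledgeForAssignment(
  items: ScopedKnowledgeItem[],
  assignment: AgentAssignmentSketch,
): string[] {
  return items
    .filter((item) => item.assignmentId === assignment.assignmentId)
    .map((item) => item.text);
}
```

Because the lookup key is the assignment rather than the agent, the same agent assigned to two teams sees two disjoint knowledge sets.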

ADR-003: Four-Layer Knowledge Model

Date: 2026-03-13 Status: Accepted

Context: Agents need access to different types of knowledge with different ownership, privacy, and lifecycle characteristics.

Decision: Knowledge is structured in four layers:

Agent knowledge -- intrinsic to the agent (skills, domain expertise, learned workflows)
Account knowledge -- curated by account admins (policies, procedures), mostly static
Team knowledge -- collective, grows organically from team interactions
User knowledge -- personal, private to the individual user

Reasoning: This model cleanly separates concerns: agent knowledge travels with the agent, account knowledge ensures governance, team knowledge compounds value from collaboration, user knowledge enables personalisation with privacy.

Precedence: Account > Agent > Team > User (for conflicts).
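
The precedence rule can be read as an ordered lookup. A minimal sketch, assuming a hypothetical `KnowledgeEntry` shape and `resolveConflict` helper, neither of which is part of the recorded decision:

```typescript
// Hypothetical types for illustration; not the Thinklio schema.
type KnowledgeLayer = "account" | "agent" | "team" | "user";

interface KnowledgeEntry {
  layer: KnowledgeLayer;
  key: string;
  value: string;
}

// Precedence from ADR-003: Account > Agent > Team > User.
const LAYER_PRECEDENCE: readonly KnowledgeLayer[] = ["account", "agent", "team", "user"];

// Given conflicting entries for the same key, return the entry from the
// highest-precedence layer, or undefined when there are no entries.
function resolveConflict(entries: KnowledgeEntry[]): KnowledgeEntry | undefined {
  for (const layer of LAYER_PRECEDENCE) {
    const match = entries.find((entry) => entry.layer === layer);
    if (match) return match;
  }
  return undefined;
}
```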

Privacy rules:

User knowledge is never visible to other users
Team knowledge is isolated between teams
User knowledge is portable (follows user on departure)
Team knowledge contributions stay with the team

ADR-004: Durable Execution Harness with Step-Level State Machine

Date: 2026-03-13 Status: Accepted

Context: Agent interactions involve multiple operations (context assembly, LLM calls, tool executions) that can fail independently. The system needs reliability, auditability, and cost accountability.

Decision: Every interaction runs inside a durable execution harness. Each operation is a "step" with an independent state machine: created, then running, then either success or failed. Steps are persisted before execution, and their results are persisted on completion. Failure reasons are captured as metadata (timeout, error, governance, budget, cancellation, system). Interaction state is derived from its constituent steps.

Reasoning: Step-level durability provides crash recovery (resume from last completed step), no duplicate execution (idempotent resume), granular cost tracking (per-step), complete audit trail, and retry granularity (retry one step, not entire interaction).
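
The step lifecycle described in this decision can be sketched as a small transition table. The names below (`HarnessStep`, `transitionStep`, `deriveInteractionState`) are hypothetical, persistence is omitted, and the rule used to derive interaction state is an assumption, since the decision does not spell it out.

```typescript
// Hypothetical sketch of the step lifecycle; persistence is left out.
type StepState = "created" | "running" | "success" | "failed";
type FailureReason =
  | "timeout" | "error" | "governance" | "budget" | "cancellation" | "system";

interface HarnessStep {
  id: string;
  state: StepState;
  failureReason?: FailureReason;
}

// Allowed transitions: created -> running -> success | failed.
const STEP_TRANSITIONS: Record<StepState, readonly StepState[]> = {
  created: ["running"],
  running: ["success", "failed"],
  success: [],
  failed: [],
};

// Return a new step in the next state, rejecting illegal moves.
function transitionStep(
  step: HarnessStep,
  next: StepState,
  reason?: FailureReason,
): HarnessStep {
  if (!STEP_TRANSITIONS[step.state].includes(next)) {
    throw new Error(`Illegal transition ${step.state} -> ${next}`);
  }
  if (next === "failed" && reason === undefined) {
    throw new Error("A failed step must record a failure reason");
  }
  return { ...step, state: next, failureReason: next === "failed" ? reason : undefined };
}

// Assumed derivation rule: any failed step fails the interaction; it succeeds
// only when at least one step exists and every step has succeeded.
function deriveInteractionState(steps: HarnessStep[]): "failed" | "running" | "success" {
  if (steps.some((s) => s.state === "failed")) return "failed";
  if (steps.length > 0 && steps.every((s) => s.state === "success")) return "success";
  return "running";
}
```

Because success and failed have no outgoing transitions, a resumed interaction can skip completed steps instead of re-running them, which is the idempotent-resume property the reasoning relies on.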

ADR-005: Go for All Services

Date: 2026-03-13 Status: Superseded by ADR-017

Context: The original platform was designed as a distributed Go service architecture. Go was chosen for true concurrency, single static binary deployment, excellent operational tooling, and a stable ecosystem.

Decision: Go for all services from scratch.

Reasoning: Starting with Go everywhere avoids the complexity of maintaining two language ecosystems. For LLM integration, Go is adequate and the interfaces are well-defined HTTP/JSON APIs to external providers.

Superseded: The Convex-first greenfield rebuild (ADR-017) replaced Go services with Convex TypeScript server functions. The Go codebase is retained as archival context.

ADR-006: Event Bus via Redis Streams (Direct, No Separate Service)

Date: 2026-03-13 Status: Superseded by ADR-017

Context: Services needed an event bus for asynchronous communication.

Decision: Services publish to and consume from Redis Streams directly. No dedicated event bus service.

Reasoning: Redis Streams natively provide ordered, persistent, multi-consumer streams with consumer groups.

Superseded: Convex reactivity and triggers replace the Redis Streams event bus entirely. See ADR-017.

ADR-007: Single Redis Instance (Not Cluster) for Initial Deployment

Date: 2026-03-13 Status: Superseded by ADR-017

Context: Redis was the cache, event bus, and active job store.

Decision: Start with a single Redis instance. Move to cluster when scale demands it.

Superseded: Redis is eliminated from the stack entirely. Convex reactive caching and the Rate Limiter component replace all Redis functions. See ADR-017.

ADR-008: Deployment via Coolify on Hetzner

Date: 2026-03-13 Status: Superseded by ADR-017

Context: The Go service architecture deployed on a Hetzner VPS managed by Coolify.

Decision: Deploy on a Hetzner VPS, managed via Coolify.

Superseded: The hot path now runs on managed services (Convex Cloud, Clerk, Cloudflare R2) with no self-managed servers. Hetzner remains an option for enterprise on-premises deployments and the external queue worker (Tier 3). See ADR-017.

ADR-009: Agent Capability Levels as Policy Configuration

Date: 2026-03-13 Status: Accepted

Context: Agents need to support a progression of capabilities, from simple tool use to autonomous workflow learning.

Decision: Four capability levels defined as policy configuration: tools_only, workflow, experimental, learning.

Reasoning: The architecture (harness, governance, cost tracking) is the same at every level. What changes is the agent's configuration -- what it is allowed to do. Capability levels are policy decisions enforced at the harness level, not architectural differences. An agent can be promoted or demoted by changing a setting, without code changes.
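
One way to picture "capability as configuration" is a harness-side check against a single setting on the agent. This is a sketch only: `AgentConfig`, `isPermitted`, and `setCapabilityLevel` are hypothetical names, and it assumes the four levels form a strict, cumulative ordering, which the decision implies but does not state.

```typescript
type CapabilityLevel = "tools_only" | "workflow" | "experimental" | "learning";

// Assumption: levels are cumulative, ordered from least to most capable.
const CAPABILITY_ORDER: readonly CapabilityLevel[] = [
  "tools_only",
  "workflow",
  "experimental",
  "learning",
];

// Hypothetical agent configuration record.
interface AgentConfig {
  agentId: string;
  capabilityLevel: CapabilityLevel;
}

// Harness-side check: may this agent perform an action that requires `required`?
function isPermitted(agent: AgentConfig, required: CapabilityLevel): boolean {
  return (
    CAPABILITY_ORDER.indexOf(agent.capabilityLevel) >=
    CAPABILITY_ORDER.indexOf(required)
  );
}

// Promotion or demotion is a configuration change, not a code change.
function setCapabilityLevel(agent: AgentConfig, level: CapabilityLevel): AgentConfig {
  return { ...agent, capabilityLevel: level };
}
```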

Implications:

Learned workflows are first-class data objects (not just chat history)
Workflows can be reviewed, approved, and promoted across scopes
Higher capability levels require higher trust and stricter cost controls

ADR-010: No Calendar-Bound Timeline

Date: 2026-03-13 Status: Accepted

Context: Given single-developer, part-time development, fixed dates create artificial pressure.

Decision: Phases are sequential with defined success criteria, but not bound to calendar dates.

Reasoning: Each phase is complete when its success criteria are met. This allows sustainable pace, quality focus, and realistic progress tracking.

ADR-011: Platform Built from Scratch, Not Migrated from MyAI

Date: 2026-03-13 Status: Accepted

Context: The MyAI TypeScript prototype validated key concepts but its single-user architecture does not provide a useful starting point for a multi-tenant platform.

Decision: Thinklio is built from scratch as a new platform. MyAI is treated as an early experiment that informed the design, not as a codebase to migrate.

ADR-012: Job System for Deferred and Long-Running Work

Date: 2026-03-14 Status: Accepted

Context: Some work outlives a single interaction: long-running workflows, human handoffs, research tasks that take minutes or hours. The harness handles step-level execution, but cross-interaction coordination needs a separate system.

Decision: A dedicated job system with Job, Subjob, and JobObserver entities, and a state machine (pending, dispatched, in_progress, resolved, failed, cancelled, timed_out).

Reasoning: A dedicated job system cleanly separates concerns: the harness handles step-level execution; the job system handles cross-interaction coordination. The observer model decouples job creation from consumption, allowing multiple agents to watch the same job.

Implications:

Three execution modes for act steps: immediate, deferred, interactive
Deferred steps succeed on dispatch, not on work completion
Follow-up interactions triggered by job state changes
Context bundle carries forward state for follow-up interactions
Now modelled as Convex Workflow steps with waitForEvent
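
The job states and the observer model can be sketched in plain TypeScript, independent of any Convex API. `JobSketch` and `JobObserverCallback` are hypothetical names, and treating the last four states as terminal is an assumption made for this illustration.

```typescript
type JobState =
  | "pending" | "dispatched" | "in_progress"
  | "resolved" | "failed" | "cancelled" | "timed_out";

// Assumption: these four states are terminal.
const TERMINAL_JOB_STATES: readonly JobState[] = [
  "resolved", "failed", "cancelled", "timed_out",
];

function isTerminalJobState(state: JobState): boolean {
  return TERMINAL_JOB_STATES.includes(state);
}

type JobObserverCallback = (jobId: string, state: JobState) => void;

class JobSketch {
  readonly id: string;
  private state: JobState = "pending";
  private observers: JobObserverCallback[] = [];

  constructor(id: string) {
    this.id = id;
  }

  // Any number of agents can observe the same job.
  observe(callback: JobObserverCallback): void {
    this.observers.push(callback);
  }

  // Record a state change and notify every observer; terminal states are final.
  setState(next: JobState): void {
    if (isTerminalJobState(this.state)) {
      throw new Error(`Job ${this.id} is already ${this.state}`);
    }
    this.state = next;
    for (const notify of this.observers) notify(this.id, next);
  }

  getState(): JobState {
    return this.state;
  }
}
```

Each observer callback stands in for a follow-up interaction being triggered by the state change.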

ADR-013: Agent-as-Tool Composition Model

Date: 2026-03-14 Status: Accepted

Context: Users need to build sophisticated agents by composing specialist capabilities. The question is how to model delegation.

Decision: Agents can be registered as tools with type agent. From the invoking agent's perspective, delegating to another agent is the same as calling any other tool: it goes through the same harness act step, policy evaluation, cost tracking, and audit trail.

Reasoning: Reuses the entire tool infrastructure (registry, policy engine, trust levels, rate limits, audit logging) without building a parallel system for delegation. The invocation contract (parameter_schema, return_schema) ensures well-defined interfaces.

Implications:

Tool entity gains type agent alongside internal and external
Delegation creates child interactions linked to the parent
Delegation depth and cycle detection enforced by the policy engine
Knowledge isolation preserved: delegate assembles its own context
Cost rolls up through the delegation chain
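
The depth and cycle checks listed above might look like the following. This is a sketch under stated assumptions: `checkDelegation` is a hypothetical helper, the chain is modelled as a list of agent ids from the root agent to the current one, and `maxDepth` counts delegation hops.

```typescript
interface DelegationCheck {
  allowed: boolean;
  reason?: "cycle" | "max_depth";
}

// `chain` holds agent ids from the root invoking agent to the current agent.
// `maxDepth` is the maximum number of delegation hops permitted (an assumption).
function checkDelegation(
  chain: string[],
  targetAgentId: string,
  maxDepth: number,
): DelegationCheck {
  // Delegating to an agent already in the chain would create a cycle.
  if (chain.includes(targetAgentId)) {
    return { allowed: false, reason: "cycle" };
  }
  // A chain of n agents has made n - 1 hops; this delegation would be hop n.
  if (chain.length > maxDepth) {
    return { allowed: false, reason: "max_depth" };
  }
  return { allowed: true };
}
```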

ADR-014: Three External API Surfaces

Date: 2026-03-14 Status: Accepted

Context: External systems interact with Thinklio in fundamentally different ways. A single API design cannot efficiently serve conversational access, orchestration, and bidirectional capability exchange.

Decision: Three distinct surfaces: Channel API (conversational access), Platform API (orchestration and management), Integration API (bidirectional capability exchange).

Reasoning: These represent different consumers, different interaction models, and different architectural implications. Designing all three as first-class concerns from the start prevents decisions that make one surface awkward to build later.

Implications:

All three surfaces resolve to the same authorisation model and governance framework
Each surface is versioned independently
Every architectural decision is evaluated against all three surfaces

ADR-015: Redis as Operational Store for Active Jobs

Date: 2026-03-14 Status: Superseded by ADR-017

Context: Active jobs need fast, frequent state reads and writes. Terminal jobs need durable storage for audit and reporting.

Decision: Active jobs in Redis, flushed to PostgreSQL on terminal state.

Superseded: With Redis eliminated, job state lives in Convex documents. The Workflow component provides the durable execution model. See ADR-017.

ADR-016: Per-Assignment Tool Restrictions

Date: 2026-03-14 Status: Accepted

Context: The same agent may be assigned to different contexts with different trust requirements. Tool access should be scoped per assignment.

Decision: The AgentAssignment entity includes a tool_restrictions field that can narrow (never widen) the agent's configured tool permissions for that specific assignment context.

Reasoning: Without per-assignment restrictions, tool access is binary: an agent either has a tool or it does not. Per-assignment restrictions allow a single agent to be shared across contexts with appropriately scoped capabilities. The "narrow only" rule is critical for security.
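
The "narrow only" rule falls out naturally if the restriction is applied as a set intersection. The sketch below assumes `tool_restrictions` is an allow-list of tool names, which the decision does not specify; `effectiveTools` is a hypothetical helper.

```typescript
// Assumption: tool_restrictions is an optional allow-list of tool names.
// Intersecting with the agent's configured tools means a restriction can
// remove tools but can never add one the agent was not already granted.
function effectiveTools(
  agentTools: string[],
  toolRestrictions?: string[],
): string[] {
  if (toolRestrictions === undefined) return [...agentTools];
  const allowed = new Set(toolRestrictions);
  return agentTools.filter((tool) => allowed.has(tool));
}
```

For example, an agent configured with `search` and `email`, assigned with a restriction naming `search` and `delete_records`, ends up with `search` only: `delete_records` is ignored because the agent never had it.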

ADR-017: Convex-First Greenfield Rebuild

Date: 2026-04-06 Status: Accepted

Context: The Go service architecture (ADR-005 through ADR-008, ADR-015) was deployable and correct but carried structural weaknesses. The agent hot path crossed seven network hops. A six-tier caching strategy existed to mitigate latency but added its own complexity. Custom infrastructure replicated functionality available in purpose-built platforms. A two-plane migration proposal (Convex for the hot path, Go for governance) was analysed and superseded because maintaining two backend languages was expensive for a small team and the policy checks could be implemented as Convex middleware.

Decision: Greenfield rebuild on three managed services: Convex (Cloud, Ireland) for all application logic and data, Clerk for authentication and organisation management, Cloudflare R2 for file storage. Redis, Go services, Supabase, and the custom event bus are eliminated.

Reasoning: Convex provides reactive queries (eliminating custom caching), native vector and full-text search (eliminating pgvector), durable workflows (replacing the custom HarnessExecutor), real-time subscriptions (replacing Redis Streams), and a component ecosystem that handles most infrastructure concerns. Clerk provides pre-built auth UI and first-class Convex integration. The result is a shorter hot path, fewer moving parts, and a codebase that is straightforward for both humans and LLMs to reason about.

Supersedes: ADR-005, ADR-006, ADR-007, ADR-008, ADR-015.

Implications:

All application state lives in one Convex project (single source of truth)
Governance enforced as Convex custom function middleware (zero network overhead)
Three-tier execution model: interactive fast path, durable Workflow, external queue escape hatch (see ADR-018)
Audit records streamed via Fivetran CDC to external storage for long-term retention
Enterprise on-premises deployment uses the Convex open-source backend on Hetzner

ADR-018: Three-Tier Execution Model

Date: 2026-04-06 Status: Accepted

Context: The Convex Professional plan has a 100-slot ceiling on concurrent Workflow and Workpool executions. Interactive chat must never compete with background processing for execution capacity.

Decision: Three execution tiers. Tier 1 (interactive fast path): direct Convex mutations and actions with the Action Retrier wrapping LLM calls, zero Workflow slots consumed. Tier 2 (durable workflow): Convex Workflow component for work needing step-by-step journalling and crash recovery, slotted and budgeted. Tier 3 (external queue): Google Cloud Tasks or BullMQ on Hetzner for background volume that exceeds the slot ceiling, added only when monitoring shows it is needed.

Reasoning: Interactive work is sacred. A Workflow slot should only be consumed when the work genuinely requires durability. The external queue is an escape hatch, not a default. Each channel or job type starts Convex-native and migrates external only when volume demands it.
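
The routing rule can be sketched as a pure function. Everything named here (`WorkRequest`, `CapacitySnapshot`, `chooseExecutionTier`) is hypothetical, and the exact conditions are an interpretation of the decision rather than a recorded algorithm.

```typescript
type ExecutionTier = "tier1_fast_path" | "tier2_workflow" | "tier3_external";

interface WorkRequest {
  interactive: boolean;     // a user is waiting on the reply
  needsDurability: boolean; // needs step journalling and crash recovery
}

interface CapacitySnapshot {
  workflowSlotsInUse: number;
  workflowSlotCeiling: number;   // 100 on the plan cited in ADR-018
  externalQueueEnabled: boolean; // Tier 3 exists only once monitoring justified it
}

function chooseExecutionTier(
  work: WorkRequest,
  capacity: CapacitySnapshot,
): ExecutionTier {
  // Interactive work never consumes a Workflow slot; neither does work
  // that does not genuinely need durability.
  if (work.interactive || !work.needsDurability) return "tier1_fast_path";

  // Overflow to the external queue only when slots are exhausted and
  // the escape hatch has actually been provisioned.
  const slotsFull = capacity.workflowSlotsInUse >= capacity.workflowSlotCeiling;
  if (slotsFull && capacity.externalQueueEnabled) return "tier3_external";

  return "tier2_workflow";
}
```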

ADR-019: Clerk for Identity and Organisation Management

Date: 2026-04-06 Status: Accepted

Context: The Go architecture used Supabase Auth. The two-plane proposal specified Keycloak. Neither provides pre-built UI or first-class Convex integration.

Decision: Clerk for all authentication and organisation management. Clerk organisations map to Thinklio accounts. Clerk roles and permissions map to the four-tier role model (owner, admin, editor, viewer).

Reasoning: Clerk provides pre-built auth UI (sign-up, sign-in, organisation management), first-class Convex integration via ConvexProviderWithClerk, OAuth and SSO support, organisation management with custom roles, and a smaller operational surface than either Supabase Auth or Keycloak.

ADR-020: Messaging-First Core Abstraction

Date: 2026-04-06 Status: Accepted

Context: The Go architecture treated channels as adapters at the edge. The Convex-first rebuild needed a core abstraction that unifies user-agent interaction, delegation, and team collaboration.

Decision: The core abstraction is messaging. A channel is a chat space with participants. Users and agents are both first-class participants in channels. A direct chat with an agent, a team room where an agent observes and contributes, and an organisation-wide agent service are all the same structural thing: messages in a channel.

Reasoning: This unifies the interaction model and eliminates the distinction between "chatting with an agent" and "using the platform." It naturally supports multi-party channels, agent-to-agent communication, and mixed human-agent collaboration without special-casing.

ADR-021: MCP Tool Permission Intersection in Shared Channels

Date: 2026-04-16 Status: Accepted

Context: In shared channels, an agent serves multiple users who may have different tool permissions. Without explicit rules, an agent could use tools on behalf of one user that another user in the channel has not been granted access to.

Decision: In any channel, an agent's effective tool set is the intersection of all human participants' tool permissions. If any participant lacks access to a tool, that tool becomes invisible to the agent in that channel.

Reasoning: This prevents privilege escalation via shared channels. The most-restrictive-wins rule ensures no participant is exposed to actions they have not been authorised for. Tool access is granted at three levels (organisation, team, user) with most-restrictive-wins resolution. Delegation chains preserve the intersection rule at every level.
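
The intersection rule is compact enough to state as code. A minimal sketch, assuming tool permissions are represented as lists of tool names; `effectiveToolSet` is a hypothetical helper.

```typescript
// The agent keeps a tool only if every human participant in the channel
// has been granted that tool.
function effectiveToolSet(
  agentTools: string[],
  humanParticipantPermissions: string[][],
): string[] {
  return agentTools.filter((tool) =>
    humanParticipantPermissions.every((permissions) => permissions.includes(tool)),
  );
}
```

With an agent holding `calendar`, `crm`, and `billing`, and two participants granted `calendar, crm` and `crm, billing` respectively, only `crm` remains visible. Note that in this sketch a channel with no human participants leaves the agent's tool set unchanged; how that edge case should behave is not covered by the decision.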

ADR-022: Single Pooled Convex Deployment for Multi-Tenancy, with a Per-Tenant Graduation Path

Date: 2026-06-05 Status: Accepted

Context: Thinklio is multi-tenant (account → team → user, with budget and governance acting at each layer). Convex makes it trivial to run either one shared deployment for all tenants or a separate deployment per tenant, but provides no row-level security — tenant isolation in a shared deployment is entirely an application-code property. Expected scale is B2B: a handful of accounts initially, a few thousand small organisations at the ceiling. Enterprise customers may later require physical isolation, data residency, or their own Convex account.

Decision: Run all tenants in a single pooled Convex deployment with isolation enforced by JWT-derived accountId middleware (accountQuery / accountMutation), tenant-leading index discipline, and post-get ownership re-checks. Do not choose single-vs-separate globally; adopt the cell pattern — every tenant starts pooled and graduates per-tenant to a dedicated deployment (T2) or a customer-owned Convex account (T3) only when a contractual, compliance, residency, or scale trigger justifies it. The migration mechanism is per-tenant export + hard-delete, which is required for GDPR regardless.

Reasoning: A few thousand small orgs sit comfortably within one deployment; the binding constraint is document volume and hot-path read amplification, both controlled by index discipline, not the count of accounts. Pooling gives a small team instant onboarding (a Clerk webhook writes a row), one schema/migration/deploy, usage pooling, and easy cross-tenant features (shared agent catalog, platform analytics). Per-deployment-per-tenant would multiply operational overhead and require a control plane the long tail does not justify. The residual risks — an isolation bug with all-tenant blast radius, and noisy neighbours — are mitigated by enforcing the no-unscoped-ctx.db invariant via lint and by the existing per-account budget + Rate Limiter controls. Keeping the codebase deployment-agnostic and building per-tenant export/delete now preserves the option to isolate any tenant on demand, so the single-instance bet is never a trap. Full treatment in 15 Tenancy & Deployment Topology.
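
The isolation pattern named in the decision (an account-scoped wrapper plus a post-get ownership re-check) can be illustrated without any Convex dependency. The sketch below is not Convex's API and not the actual `accountQuery` implementation: `accountScoped`, `getOwned`, `AuthContext`, and `TenantDoc` are hypothetical names, and the verified JWT claims are passed in as a plain object.

```typescript
interface TenantDoc {
  _id: string;
  accountId: string;
}

// Assumed to be derived from a verified JWT, never from client-supplied arguments.
interface AuthContext {
  accountId: string;
}

// Post-get ownership re-check: a fetch by id is tenant-blind, so the owner
// is verified before the document is handed back to the caller.
function getOwned<T extends TenantDoc>(ctx: AuthContext, doc: T | null): T | null {
  if (doc === null || doc.accountId !== ctx.accountId) return null;
  return doc;
}

// Wrapper in the spirit of accountQuery / accountMutation: the handler only
// ever runs with a resolved accountId.
function accountScoped<Args, Result>(
  handler: (ctx: AuthContext, args: Args) => Result,
): (claims: { accountId?: string }, args: Args) => Result {
  return (claims, args) => {
    if (!claims.accountId) {
      throw new Error("Unauthenticated: no accountId claim");
    }
    return handler({ accountId: claims.accountId }, args);
  };
}
```

Returning `null` for a document owned by another account, rather than a distinct "forbidden" error, avoids confirming to the caller that the id exists in some other tenant.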

ADR-023: "Channel" Means Transport; the Chat Container Is "Chat" (World B)

Date: 2026-06-05 Status: Accepted

Context: The term "channel" was used for two incompatible concepts. The convex/schema.ts channels table meant the chat container (members + messages, Slack-like), while the design docs' channel_identity / user_channel tables meant the transport (Telegram, email). Code and docs had committed to opposite definitions, which made every cross-channel discussion ambiguous.

Decision: Reserve "channel" for the transport (web, mobile, Telegram, email, SMS, API) and name the interaction container a "chat". A reply-thread within a chat remains a "thread"; the code that bridges a channel is a "gateway / channel adapter". This requires renaming in code: channels → chats, channel_members → chat_members, messages.channelId → chatId, channels.ts → chats.ts, plus the web /messages surface. The existing channel_identity / user_channel design tables are now correctly named.

Reasoning: "Channel" gravitates to transport in ordinary speech and in the omnichannel market Thinklio targets ("we meet you on every channel"); the word should land on the thing said most often in product and marketing. "Chat" is an unambiguous container name that nests cleanly with the existing threadId (chat › thread › message). Choosing World B also harmonises the schema with the already-written transport-identity design rather than entrenching the contradiction. The rename cost is real but cheapest now (greenfield, solo, pre-scale). Full model in 16 Chats, Channels & Identity.

ADR-024: Chat as the Canonical, Channel-Agnostic Container; Channels Bind as Mirror, Relay, or Injection

Date: 2026-06-05 Status: Accepted

Context: A user may reach the same agent or group over multiple channels (web, mobile, Telegram), and content may also arrive from non-interactive sources (webhooks, inbound email). We needed to decide whether each channel is its own chat or a window onto a shared one, and how dissimilar channels (a live Telegram client vs a fire-and-forget webhook) fit one model.

Decision: The chat is canonical and channel-agnostic — the single source of truth. A channel is a window onto a chat, binding in one of three modes: Mirror (live, bidirectional, authenticated user — sees everything, full Convex reactivity; web, mobile, linked Telegram), Relay (asynchronous, shows only its own slice; email, SMS), or Injection (one-way inbound from a non-human source; webhooks, API events). A user's direct chat with an agent is a single chat across all its channels, so agent memory is per-chat with no separate cross-channel memory layer.

Reasoning: Convex's reactive queries make "same chat, seen everywhere" structural rather than something we synchronise by hand — once a message lands in the chat, every subscribed window updates live. Modelling channels as binding modes lets a live client and a webhook coexist over one canonical chat without special-casing, and collapses the earlier "split agent memory from the message log" problem entirely. Full model in 16 Chats, Channels & Identity.

ADR-025: Identity and Credentials Are Polymorphic over Principals (User, Account, Agent); Identities Are Bound by Verified Proof

Date: 2026-06-05 Status: Accepted

Context: Inbound external identities (a user's Telegram id, an agent's email address) and outbound credentials (a user's personal Cliniko key, an account's Notion token, an agent's own Google account) needed a coherent home. Storing identity inline on user_profiles does not scale, and credentials clearly belong to more than one kind of owner — including agents that act as themselves. External identities also need a secure binding mechanism; matching on a self-claimed handle would be spoofable.

Decision: Identity and credentials attach to a principal — user, agent, or account. Inbound identity lives in a polymorphic channel_identity table keyed by (principalType, principalId) with a relation of owns (user authors from it) or reachable_at (agent is addressed at it). Outbound secrets live in a credential table scoped user | account | agent | platform; the integration declares its required scope (via /meta) and Thinklio resolves the matching tier at call time. Agent identities and credentials attach at the assignment scope (user/team/account), distinguishing a per-user personal assistant from an account-wide agent. External identities are bound only by a verified one-time deep-link token (e.g. Telegram /start <token>), never by handle matching. Secrets are encrypted at rest and never returned to clients.

Reasoning: Making the principal explicit absorbs every case — the user's Telegram, the agent's email, the borrowed user key, the shared account token, and the agent's own Google account — with no special cases, and cleanly separates "borrow the user's identity" from "act as the agent's own identity". The verified-link flow proves both session ownership and control of the external account, closing the spoofing hole an allowlist would leave open. Full model in 16 Chats, Channels & Identity.
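
Resolving a credential by the scope an integration declares could look roughly like this. It is a simplified sketch: `StoredCredential`, `CallContextSketch`, and `resolveCredential` are hypothetical, the record holds an opaque `secretRef` rather than a secret, and the assignment-scope refinement for agent credentials is left out.

```typescript
type CredentialScope = "user" | "account" | "agent" | "platform";

// Hypothetical record; `secretRef` points at an encrypted secret and is
// never the secret itself.
interface StoredCredential {
  integration: string;
  scope: CredentialScope;
  ownerId: string;
  secretRef: string;
}

interface CallContextSketch {
  userId: string;
  accountId: string;
  agentId: string;
}

// Map the declared scope to the principal whose credential should be used.
function ownerIdFor(scope: CredentialScope, ctx: CallContextSketch): string {
  switch (scope) {
    case "user": return ctx.userId;
    case "account": return ctx.accountId;
    case "agent": return ctx.agentId;
    case "platform": return "platform";
  }
}

// Find the credential matching the integration's declared scope, if any.
function resolveCredential(
  store: StoredCredential[],
  integration: string,
  requiredScope: CredentialScope,
  ctx: CallContextSketch,
): StoredCredential | undefined {
  const ownerId = ownerIdFor(requiredScope, ctx);
  return store.find(
    (c) =>
      c.integration === integration &&
      c.scope === requiredScope &&
      c.ownerId === ownerId,
  );
}
```

Because the lookup only matches the declared scope, a shared account token is never silently substituted where an integration asked for the user's personal key.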

ADR-026: Universal Semantic Message; Compose Once, Render per Channel

Date: 2026-06-05 Status: Accepted

Context: An agent produces one logical reply, but channels differ enormously in rendering capability (rich interactive cards and token streaming on native apps; limited markup and inline keyboards on Telegram; an HTML/plain body on email; plain segments on SMS; structured JSON on the API). For the agent to remain channel-unaware, the reply cannot be authored per channel.

Decision: The agent emits a single channel-agnostic semantic message: markdown prose plus an optional array of typed affordance blocks (choices, action, form, file, citation, …), with the invariant that every affordance carries a text fallback, plus a streaming | final lifecycle. Prose is streamed by the LLM; affordances are emitted via structured affordance APIs/tools, not typed into the prose. The message is rendered by binding mode (ADR-024): Mirror (native) renders client-side from a UI component library, falling back to text; Relay/Injection render in the egress adapter at finalisation; the API serialises the canonical structure. A channel adapter contract — capabilityProfile, render(message, profile), normalizeInbound(event), and delivery semantics — is specified now and implemented per channel later. Post-delivery edits: canonical is the source of truth, Mirror reflects live, non-mirror is best-effort and never load-bearing. Delivery defaults to reply-in-kind; presence-aware notification fan-out is deferred.

Reasoning: Compose-once / render-per-channel is the only way to keep the agent channel-unaware while the chat stays canonical. The text-fallback invariant guarantees any renderer — present or future — can degrade without losing meaning (only interactivity). Treating capability as our data (the remote channel does no processing) makes every new channel a pure adapter with no change to the agent or chat model. Full treatment in 16 Chats, Channels & Identity §8.
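
The text-fallback invariant is easiest to see in a degradation sketch. The shapes below are illustrative assumptions: only two affordance types are modelled, `renderForChannel` stands in for the adapter contract's `render(message, profile)`, and `CapabilityProfileSketch` reduces a capability profile to the list of affordance types a channel supports.

```typescript
type MessageLifecycle = "streaming" | "final";

// Every affordance carries a text fallback (the invariant from ADR-026).
interface ChoicesAffordance {
  type: "choices";
  options: string[];
  fallbackText: string;
}
interface ActionAffordance {
  type: "action";
  label: string;
  actionId: string;
  fallbackText: string;
}
type Affordance = ChoicesAffordance | ActionAffordance;
type AffordanceType = Affordance["type"];

interface SemanticMessage {
  prose: string; // markdown
  affordances?: Affordance[];
  lifecycle: MessageLifecycle;
}

// Hypothetical, simplified capability profile.
interface CapabilityProfileSketch {
  supportedAffordances: readonly AffordanceType[];
}

interface RenderedMessage {
  text: string;
  affordances: Affordance[];
}

// Keep affordances the channel supports; degrade the rest to fallback text.
function renderForChannel(
  message: SemanticMessage,
  profile: CapabilityProfileSketch,
): RenderedMessage {
  const kept: Affordance[] = [];
  const paragraphs: string[] = [message.prose];
  for (const affordance of message.affordances ?? []) {
    if (profile.supportedAffordances.includes(affordance.type)) {
      kept.push(affordance);
    } else {
      paragraphs.push(affordance.fallbackText);
    }
  }
  return { text: paragraphs.join("\n\n"), affordances: kept };
}
```

A channel with an empty profile still receives the full meaning as text; only the interactivity is lost, which is the degradation guarantee the reasoning describes.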

ADR-027: Native-Only Conversational Mirror for v1; Non-Native I/O Is Relay/Injection

Date: 2026-06-05 Status: Accepted

Context: Third-party conversational channels (e.g. Telegram acting as a live window onto a chat) carry the entire hard half of delivery — capability profiles, degradation chains, bidirectional affordance mapping, post-delivery edit propagation. A worked example (a clinic's existing Telegram group and 1:1 chats) showed that native apps can replicate real team-chat structures directly using the chat type enum, member roles, and a role-keyed postPolicy.

Decision: For v1, all conversational participation is Mirror on native clients (web / Flutter / desktop). Non-native I/O (API, inbound email, webhooks) is Relay / Injection only — it posts into chats or triggers agents but never reproduces a chat UI. Third-party conversational Mirror channels are deferred; the channel adapter contract (ADR-026) is the hook to add them later without changing the agent or chat model. The posture is summarised as: "to use Thinklio conversationally, use our apps."

Reasoning: This defers the entire hard half of delivery with zero loss of optionality, and the chat model (organisation/team/group/direct + member roles + postPolicy) already replicates the team-chat tools customers use today. The only cost is app adoption versus an already-installed messenger like Telegram — a product/GTM lever, not an architecture constraint, and exactly the lever to pull (add Telegram as a Mirror channel) if app adoption proves to be the friction. Full treatment in 16 Chats, Channels & Identity §8.4.

New decisions are appended below this line.