
Foundational Chat Agent

Thinklio Built-in Agent Specification Version 0.1 | March 2026


1. Purpose

The Foundational Chat Agent is the first agent deployed on the Thinklio platform. Its primary purpose is not product value — it is platform validation. It exercises the minimum set of platform capabilities needed to prove the system works end to end: a user can send a message, the platform assembles context from the knowledge layers, a call is made to Claude Sonnet 4.6, and the response is returned.

It is also genuinely useful. A capable general-purpose assistant with access to the user's knowledge, their team's shared context, and their organisation's information is valuable from day one — even before any specialist agents exist.

Everything in this spec is deliberately minimal. Where there is a choice between doing something properly and doing something simply, this spec chooses simply. The goal is a working, testable agent, not a complete one.


2. Scope

In scope:
  • Single-turn and multi-turn conversation
  • Context assembly from three knowledge layers: user, team, organisation
  • Knowledge extraction from conversations (writing facts back to the appropriate layer)
  • A single channel: web chat (the Thinklio native interface)
  • Claude Sonnet 4.6 as the underlying model

Out of scope for this agent:
  • Tools and external system access
  • Agent-to-agent delegation
  • Job system and deferred execution
  • Policy engine and capability level enforcement
  • Budget tracking and cost controls
  • Vector/semantic search (plain text retrieval is sufficient for initial testing)
  • Multiple simultaneous agents


3. UI Structure

[ Chat ]

Single tab. The chat interface is the entire product at this stage.

The interface shows:
  • Conversation history for the current session
  • A message input field
  • A send button
  • An indicator when the agent is processing

No additional UI elements are required for the foundational agent. Settings, knowledge management, and analytics are admin-only and accessed separately.


4. Knowledge Layers

The foundational agent has read and write access to three knowledge layers. Each layer is scoped to a specific context and has different visibility rules.

4.1 User Knowledge

Scope: Private to the individual user. Not visible to other users, team admins, or org admins.

What it contains: Personal preferences, working style, ongoing projects, frequently referenced facts, and anything the user has told the agent about themselves.

Examples:
  • "Prefers concise responses"
  • "Currently working on the Thinklio pricing model"
  • "Based in Perth, Australia (UTC+8)"
  • "Uses Go as primary backend language"

How it is populated:
  • Extracted automatically from conversations
  • Manually added by the user via a future settings interface (not required for the foundational agent)

4.2 Team Knowledge

Scope: Shared across all members of a team. Visible to all team members. Writable by any team member (extractions are attributed to the contributing user).

What it contains: Shared project context, team conventions, commonly referenced information, and facts that are relevant to the whole team.

Examples:
  • "The team's primary deployment platform is Hetzner Cloud, managed via Coolify"
  • "Go monorepo structure with services per subdomain"
  • "Primary frontend stack is Next.js"

How it is populated:
  • Extracted automatically when the agent determines a fact is team-relevant (not personal, not org-wide)
  • Manually added by team members or admins

4.3 Organisation Knowledge

Scope: Shared across all users in the organisation. Typically set by org admins. Less frequently written than user or team knowledge.

What it contains: Company-wide policies, procedures, reference information, and facts that apply across all teams.

Examples:
  • "The organisation is Novansa Pty Ltd, operating under Australian law"
  • "Primary payment processor is Paddle"
  • "The organisation's products include Thinklio, Clindice, Couple Tools, and Calmerflow"

How it is populated:
  • Extracted automatically when the agent determines a fact is org-wide
  • Primarily managed by org admins


5. Context Assembly

Before every LLM call, the platform assembles a context bundle from the three knowledge layers. This bundle is injected into the system prompt, giving the agent relevant background without the user needing to repeat themselves.

5.1 Assembly Process

1. Retrieve relevant facts from user knowledge layer
   → Match against current conversation topic (text search, v1)
   → Cap at N facts (configurable, default: 20)

2. Retrieve relevant facts from team knowledge layer
   → Same matching process
   → Cap at M facts (configurable, default: 15)

3. Retrieve relevant facts from org knowledge layer
   → Same matching process
   → Cap at K facts (configurable, default: 10)

4. Assemble system prompt:
   → Base system prompt (agent persona and instructions)
   → Org knowledge block (if any facts retrieved)
   → Team knowledge block (if any facts retrieved)
   → User knowledge block (if any facts retrieved)
   → Conversation history (recent turns, within token budget)

5. Call Claude Sonnet 4.6 with assembled prompt + current message

5.2 Token Budget

The assembled context must fit within the model's context window alongside the conversation and leave sufficient room for the response. For the foundational agent:

  • Total context budget: 180,000 tokens (well within Sonnet 4.6's window)
  • Knowledge facts: up to 4,000 tokens combined across all three layers
  • Conversation history: up to 20,000 tokens (approximately 15–20 turns)
  • System prompt base: approximately 500 tokens
  • Response budget: 4,096 tokens

For v1, if retrieved facts exceed the budget, truncate by layer priority: user knowledge is preserved first, then team, then org. Within each layer, more recently used facts are preferred.
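A minimal sketch of the v1 truncation rule. It assumes each fact carries an estimated token cost and a last-used timestamp; both fields are illustrative, as the spec does not define the fact storage schema.

```go
package main

import (
	"fmt"
	"sort"
)

// Fact is an illustrative fact with an estimated token cost and recency.
type Fact struct {
	Scope    string // "user", "team", or "org"
	Content  string
	Tokens   int
	LastUsed int64 // unix timestamp of last retrieval
}

// truncateByPriority keeps facts within the combined budget, preserving
// user facts first, then team, then org; within a layer, more recently
// used facts are preferred.
func truncateByPriority(facts []Fact, budget int) []Fact {
	rank := map[string]int{"user": 0, "team": 1, "org": 2}
	sorted := append([]Fact(nil), facts...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if rank[sorted[i].Scope] != rank[sorted[j].Scope] {
			return rank[sorted[i].Scope] < rank[sorted[j].Scope]
		}
		return sorted[i].LastUsed > sorted[j].LastUsed // newer first
	})
	var kept []Fact
	used := 0
	for _, f := range sorted {
		if used+f.Tokens > budget {
			continue // this fact no longer fits; try smaller ones
		}
		kept = append(kept, f)
		used += f.Tokens
	}
	return kept
}

func main() {
	facts := []Fact{
		{"org", "Org fact", 2000, 1},
		{"user", "User fact", 2500, 2},
		{"team", "Team fact", 1500, 3},
	}
	// With a 4,000-token budget, the user and team facts fit; the org fact is dropped.
	for _, f := range truncateByPriority(facts, 4000) {
		fmt.Println(f.Scope, f.Content)
	}
}
```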

5.3 System Prompt Structure

You are a capable, knowledgeable assistant deployed on the Thinklio platform.
You have access to context about the user, their team, and their organisation.
Use this context to give relevant, informed responses. Do not repeat facts back
to the user unnecessarily — use them to inform your responses naturally.

[ORG CONTEXT]
The following facts are known about the user's organisation:
{org_facts}

[TEAM CONTEXT]
The following facts are known about the user's team:
{team_facts}

[USER CONTEXT]
The following facts are known about this user:
{user_facts}

A context block is omitted entirely if no relevant facts were retrieved for that layer.


6. Knowledge Extraction

After each interaction, the platform analyses the conversation to extract facts worth retaining. This is what makes the agent progressively more useful over time.

6.1 Extraction Process

1. Trigger extraction at interaction end (or periodically during long conversations)

2. Submit the conversation to a lightweight extraction prompt:
   "Review this conversation and extract discrete facts worth retaining.
    For each fact, classify it as:
    - user: personal to this individual
    - team: relevant to the whole team
    - org: relevant to the whole organisation
    - discard: transient, not worth retaining
    Return a structured list."

3. For each extracted fact:
   → Deduplicate against existing knowledge (does this fact already exist?)
   → If new: write to the appropriate knowledge layer
   → If conflicting: update the existing fact, retain the old version with a timestamp

4. Log extraction results against the interaction record

6.2 Extraction Quality

For the foundational agent, extraction quality does not need to be perfect. False positives (storing something that turns out not to be useful) are preferable to false negatives (missing something important). The user and admin can review and clean up the knowledge layers via a future management interface.

What must be avoided: extracting and storing sensitive information (passwords, financial credentials, personal health information). The extraction prompt should include explicit instructions to skip these.

6.3 Extraction Scope Classification

The extraction model classifies each fact into user/team/org/discard using the following heuristics:

Classification   Heuristic
user             Personal preferences, individual context, private information, things prefixed with "I" or "my"
team             Technical decisions, shared tooling, project context that multiple people would benefit from
org              Company-wide policies, business information, facts about the organisation itself
discard          Transient facts, questions asked (not answered), opinions without factual content
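Since the classification is produced by the extraction model, the platform's job is to parse and validate its structured list. A sketch, assuming the model is asked to return JSON; the field names "scope" and "content" are an assumption, as the spec only requires "a structured list".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExtractedFact mirrors one entry of the model's structured list.
// The JSON field names are an assumption, not a defined contract.
type ExtractedFact struct {
	Scope   string `json:"scope"` // user | team | org | discard
	Content string `json:"content"`
}

var validScopes = map[string]bool{"user": true, "team": true, "org": true, "discard": true}

// parseExtraction decodes the model output and drops entries with an
// unrecognised scope or empty content rather than failing the whole batch.
func parseExtraction(raw []byte) ([]ExtractedFact, error) {
	var all []ExtractedFact
	if err := json.Unmarshal(raw, &all); err != nil {
		return nil, err
	}
	kept := all[:0]
	for _, f := range all {
		if validScopes[f.Scope] && f.Content != "" {
			kept = append(kept, f)
		}
	}
	return kept, nil
}

func main() {
	raw := []byte(`[
		{"scope":"user","content":"Uses Go as primary backend language"},
		{"scope":"banana","content":"invalid scope, dropped"}
	]`)
	facts, _ := parseExtraction(raw)
	fmt.Println(len(facts), facts[0].Scope)
}
```

Tolerating malformed entries (rather than rejecting the batch) fits the stated quality bar: false positives are acceptable, silent loss of a whole extraction run is worse.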

7. Interaction Model

7.1 Interaction Lifecycle

Each user message initiates an interaction. For the foundational agent, the interaction lifecycle is simple:

User sends message
Create Interaction record (status: in_progress)
Assemble context (retrieve knowledge facts)
Create Step record (type: think, status: running)
Call Claude Sonnet 4.6
Update Step record (status: complete)
Create Step record (type: respond, status: running)
Stream response to user
Update Step record (status: complete)
Update Interaction record (status: complete)
Run knowledge extraction (async, does not block response)
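The lifecycle above can be sketched as orchestration code. Everything here is a stand-in: record persistence is reduced to a list of transition strings, and callModel represents the Claude Sonnet 4.6 call.

```go
package main

import "fmt"

// handleMessage walks the v1 lifecycle for one user message and returns
// the ordered record transitions (an illustrative stand-in for creating
// and updating Interaction and Step records).
func handleMessage(msg string, callModel func(context string) (string, error)) ([]string, error) {
	events := []string{"interaction:in_progress"}

	ctx := "assembled context for: " + msg // knowledge retrieval happens here

	events = append(events, "step:think:running")
	reply, err := callModel(ctx) // call Claude Sonnet 4.6
	if err != nil {
		return append(events, "step:think:failed", "interaction:failed"), err
	}
	events = append(events, "step:think:complete")

	events = append(events, "step:respond:running")
	_ = reply // the real platform streams the response to the user here
	events = append(events, "step:respond:complete", "interaction:complete")

	// Knowledge extraction would be kicked off asynchronously at this
	// point, after the interaction completes; it never blocks the response.
	return events, nil
}

func main() {
	events, _ := handleMessage("hello", func(string) (string, error) { return "hi", nil })
	fmt.Println(events)
}
```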

7.2 Conversation History

The conversation history is maintained per session. A session begins when the user opens a new conversation and ends when they close it or after a configurable inactivity timeout (default: 60 minutes).

Within a session, all prior turns are included in the context up to the token budget. Across sessions, the agent relies on the knowledge layers for continuity — it does not retain conversation history between sessions in v1.
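Trimming history to the token budget could look like the sketch below, assuming each turn carries an estimated token count (an illustrative field): drop the oldest turns first so the most recent context survives.

```go
package main

import "fmt"

// Turn is one conversation turn with an estimated token count (illustrative).
type Turn struct {
	Role   string
	Text   string
	Tokens int
}

// recentTurns returns the longest suffix of turns that fits the budget,
// dropping the oldest turns first so recent context is preserved.
func recentTurns(turns []Turn, budget int) []Turn {
	total := 0
	start := len(turns)
	for i := len(turns) - 1; i >= 0; i-- {
		if total+turns[i].Tokens > budget {
			break
		}
		total += turns[i].Tokens
		start = i
	}
	return turns[start:]
}

func main() {
	turns := []Turn{
		{"user", "old question", 900},
		{"assistant", "old answer", 900},
		{"user", "new question", 300},
	}
	// A 1,200-token budget fits only the last two turns.
	fmt.Println(len(recentTurns(turns, 1200)))
}
```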

7.3 Streaming

Responses are streamed to the client as they are generated. The UI should render tokens as they arrive rather than waiting for the complete response. This is important for perceived responsiveness, particularly for longer responses.


8. Configuration

8.1 Platform Configuration (set at deployment)

Setting               Default             Description
Model                 claude-sonnet-4-6   Underlying LLM — not user-configurable
Max response tokens   4,096               Maximum response length
Knowledge fact caps   20 / 15 / 10        Max facts retrieved per layer (user / team / org)
Session timeout       60 minutes          Inactivity period before session ends
Extraction enabled    true                Whether knowledge extraction runs after interactions
Extraction async      true                Whether extraction runs asynchronously (recommended)

8.2 User Configuration

For the foundational agent, user configuration is minimal and accessed via a simple settings panel:

Setting          Default    Description
Response style   Standard   Concise / standard / detailed — appended to system prompt
Language         English    Preferred response language

8.3 Admin Configuration

Setting                     Default        Description
Org knowledge visibility    All members    Whether org knowledge is visible in responses
Team knowledge visibility   Team members   Whether team knowledge is visible to the team
Extraction classification   Auto           Whether fact classification is automatic or requires admin review
Knowledge review required   false          Whether extracted facts require admin approval before use

9. Data Objects

The foundational agent introduces or uses the following core data objects. These are deliberately minimal — more fields can be added as the platform matures.

User

User
├── id              UUID
├── display_name    string
├── email           string
└── created_at      timestamp

Organisation

Organisation
├── id              UUID
├── name            string
└── created_at      timestamp

Team

Team
├── id              UUID
├── org_id          UUID
├── name            string
└── created_at      timestamp

TeamMember

TeamMember
├── team_id         UUID
├── user_id         UUID
└── role            enum (admin | member)

KnowledgeFact

KnowledgeFact
├── id              UUID
├── scope           enum (user | team | org)
├── scope_id        UUID (user_id, team_id, or org_id)
├── content         string
├── source          enum (extracted | manual)
├── interaction_id  UUID | null
├── created_at      timestamp
└── updated_at      timestamp

Interaction

Interaction
├── id              UUID
├── user_id         UUID
├── team_id         UUID | null
├── org_id          UUID | null
├── session_id      UUID
├── status          enum (in_progress | complete | failed)
├── created_at      timestamp
└── completed_at    timestamp | null

Step

Step
├── id              UUID
├── interaction_id  UUID
├── type            enum (think | respond | extract)
├── status          enum (running | complete | failed)
├── created_at      timestamp
└── completed_at    timestamp | null
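The object trees above translate naturally into Go types. This is a sketch, not a schema: UUIDs are represented as strings, nullable fields as pointers, and enums as string types, all of which are representation choices rather than decisions the spec makes.

```go
package main

import (
	"fmt"
	"time"
)

// Enums are modelled as string types for readability; a real schema
// might prefer integer constants or database enums.
type Scope string      // user | team | org
type FactSource string // extracted | manual
type StepType string   // think | respond | extract
type Status string     // in_progress | running | complete | failed

type KnowledgeFact struct {
	ID            string
	Scope         Scope
	ScopeID       string // user_id, team_id, or org_id depending on Scope
	Content       string
	Source        FactSource
	InteractionID *string // nil for manually added facts
	CreatedAt     time.Time
	UpdatedAt     time.Time
}

type Interaction struct {
	ID          string
	UserID      string
	TeamID      *string
	OrgID       *string
	SessionID   string
	Status      Status
	CreatedAt   time.Time
	CompletedAt *time.Time // nil until the interaction finishes
}

type Step struct {
	ID            string
	InteractionID string
	Type          StepType
	Status        Status
	CreatedAt     time.Time
	CompletedAt   *time.Time
}

func main() {
	f := KnowledgeFact{ID: "f1", Scope: "user", ScopeID: "u1",
		Content: "Prefers concise responses", Source: "extracted"}
	fmt.Println(f.Scope, f.Content)
}
```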

10. What This Tests

Deploying and using the foundational agent validates the following platform capabilities:

Capability                         How it is tested
User authentication and identity   User must be authenticated to access the agent
Multi-tenant data isolation        User knowledge must not be visible to other users
Team and org membership            Agent correctly scopes knowledge retrieval to the user's team and org
Context assembly                   Relevant facts appear in responses without the user re-stating them
LLM integration                    Claude Sonnet 4.6 is called correctly and responses are returned
Response streaming                 Tokens stream to the client as generated
Interaction recording              Interaction and Step records are created for each conversation turn
Knowledge extraction               Facts are extracted from conversations and written to the correct layer
Knowledge persistence              Facts extracted in one session are available in the next
Knowledge layering                 User, team, and org facts are correctly prioritised and combined

11. Known Limitations (v1)

These are accepted limitations for the foundational agent, to be addressed in subsequent iterations:

  • Text search only — knowledge retrieval uses keyword matching, not semantic/vector search. Retrieval quality will be lower for conceptually related queries that don't share keywords. This is acceptable for testing purposes; vector search is a planned improvement.
  • No cross-session history — the agent does not remember conversation history across sessions. Continuity relies entirely on knowledge extraction. This means context from a single session may be lost if extraction misses something.
  • No tools — the agent cannot take actions, search the web, read files, or interact with external systems. It can only converse and learn.
  • No governance enforcement — the policy engine, capability levels, and budget controls are not active. This is appropriate for internal testing but must be addressed before any broader deployment.
  • Manual knowledge correction — there is no UI for users to review, edit, or delete extracted knowledge facts in v1. This will be needed before the agent is used by anyone other than the development team.
  • Single agent — there is one agent instance. Multi-agent composition, delegation, and the Agent Studio are all deferred.

12. Success Criteria

The foundational agent is considered successfully deployed when:

  1. A user can send a message and receive a coherent response
  2. A fact stated in one conversation is recalled accurately in a subsequent conversation
  3. User, team, and org knowledge facts are correctly scoped — a user's personal context is not shown to their colleagues
  4. Interaction and Step records are created for every conversation turn
  5. Knowledge extraction correctly classifies at least 80% of extracted facts into the right layer (validated by manual review of a test conversation set)
  6. The agent handles a conversation of 20+ turns without context degradation

This agent is the starting point. Once it is running and validated, the next agent to build is the one that will be most useful to the development team during the build — likely the Taskmaster or a lightly configured version of the Personal Assistant.