Report Writer Agent
Thinklio Built-in Agent Specification Version 0.1 | March 2026
1. Purpose and Problem Statement
The Report Writer Agent is a renderer. It takes a verified draft — a structured piece of content that has passed through the Writer and Fact Checker — and produces a finished output artefact: a formatted markdown file, a PDF, or both.
It is not a writing agent. It does not change content. Its responsibilities are:
- Applying a layout template to produce a visually structured document
- Resolving image placeholders (either with actual assets or as annotated placeholders)
- Producing a PDF via DocRaptor where required
- Storing the output artefact in the media system with metadata
- Generating a summary of the artefact suitable for display
- Tokenising the artefact content for semantic search
The Report Writer is the last step in the standard pipeline before content is considered published or archived.
2. Position in the Pipeline
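In the standard pipeline, the Report Writer runs last, after the Writer and the Fact Checker have produced a verified draft. It may also be invoked directly by a user who has a completed draft and simply needs it rendered and stored.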
3. Media System
The media system is Thinklio's artefact store. It holds rendered output files — not drafts or working documents, which live in the Notes layer of the data model.
Each artefact in the media system has:
Artefact
├── artefact_id UUID
├── title string
├── description string (short summary, auto-generated)
├── file_type enum (markdown | pdf | both)
├── storage_url string (internal URL to file)
├── thumbnail_url string | null
├── created_at timestamp
├── created_by UUID (user or agent)
├── source_draft_id UUID
├── source_report_id UUID
├── tags string[]
├── embedding float[] (semantic search vector)
├── embedding_model string
├── linked_records[]
│ ├── record_type enum (Task | Item | Organisation | Person | Note)
│ └── record_id UUID
└── published boolean
The artefact store is separate from the Notes layer because artefacts are finished outputs with version identity, not editable working documents.
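As a rough illustration, the artefact record above could be modelled as a Python dataclass. This is a minimal sketch, not the actual Thinklio schema code: the field names come from the structure above, while the class names, the defaults (for example `published=False`) and the choice of which fields are required are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class FileType(Enum):
    MARKDOWN = "markdown"
    PDF = "pdf"
    BOTH = "both"


class RecordType(Enum):
    TASK = "Task"
    ITEM = "Item"
    ORGANISATION = "Organisation"
    PERSON = "Person"
    NOTE = "Note"


@dataclass(frozen=True)
class LinkedRecord:
    record_type: RecordType
    record_id: UUID


@dataclass
class Artefact:
    # Required fields (which fields are required is an assumption of this sketch).
    title: str
    description: str
    file_type: FileType
    storage_url: str
    created_by: UUID          # user or agent
    source_draft_id: UUID
    source_report_id: UUID
    embedding_model: str
    # Fields with assumed defaults.
    artefact_id: UUID = field(default_factory=uuid4)
    thumbnail_url: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    tags: list[str] = field(default_factory=list)
    embedding: list[float] = field(default_factory=list)
    linked_records: list[LinkedRecord] = field(default_factory=list)
    published: bool = False
```

Keeping `embedding_model` next to `embedding` means vectors produced by different models can be told apart if the workspace setting changes later.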
4. Configuration
4.1 Admin Configuration
| Setting | Description |
|---|---|
| DocRaptor API key | Required for PDF generation |
| Default PDF template | Workspace-level CSS/HTML template used by DocRaptor |
| Storage backend | Where artefact files are stored (e.g. Cloudflare R2, S3) |
| Embedding model | Which model to use for semantic tokenisation |
| Auto-publish | Whether artefacts are published immediately or held for review |
| Thumbnail generation | Whether PDF thumbnails are auto-generated on render |
4.2 Run-time Parameters
| Parameter | Type | Description |
|---|---|---|
| `draft` | Draft | The verified draft to render |
| `output_formats` | enum[] | `markdown`, `pdf`, or both |
| `layout_template` | reference | Named layout template. Determines CSS/typography for PDF, and heading/section conventions for markdown. |
| `image_assets` | ImageAsset[] | Optional actual image assets to substitute for placeholder tags. Unresolved placeholders remain as annotated blocks. |
| `summary_length` | enum | `short` (1–2 sentences) or `long` (short paragraph). Used for artefact description. |
| `tags` | string[] | Tags to apply to the artefact in the media system. |
| `link_to` | reference[] | Records to link the artefact to in the data model. |
| `publish` | boolean | Whether to mark the artefact as published immediately. Overrides admin auto-publish setting. |
| `language` | string | BCP 47 tag. Used for PDF language metadata. |
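Two of the parameters above interact with other settings, and a small sketch may make the intended behaviour concrete. The function names are hypothetical. Treating an omitted `publish` as "fall back to the admin auto-publish setting", expanding `both` into the two formats, and rejecting an empty list are assumptions of this sketch rather than confirmed behaviour.

```python
from typing import Optional

VALID_FORMATS = {"markdown", "pdf"}


def resolve_publish(run_time_publish: Optional[bool], admin_auto_publish: bool) -> bool:
    """Run-time `publish` overrides the admin auto-publish setting when it is given."""
    if run_time_publish is not None:
        return run_time_publish
    return admin_auto_publish


def normalise_output_formats(requested: list[str]) -> set[str]:
    """Expand 'both' and reject unknown or empty format lists."""
    formats: set[str] = set()
    for name in requested:
        if name == "both":
            formats |= VALID_FORMATS
        elif name in VALID_FORMATS:
            formats.add(name)
        else:
            raise ValueError(f"unknown output format: {name!r}")
    if not formats:
        raise ValueError("at least one output format is required")
    return formats
```

For example, `resolve_publish(False, admin_auto_publish=True)` holds the artefact for review even though the workspace publishes automatically by default.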
5. Layout Templates
A layout template is a named configuration that controls how the rendered output looks. For PDF output, it includes a DocRaptor-compatible CSS stylesheet and HTML wrapper. For markdown, it defines heading hierarchy conventions, section separator styles, and metadata block placement.
Template properties:
LayoutTemplate
├── template_id UUID
├── name string
├── description string
├── markdown_config
│ ├── frontmatter boolean (include YAML frontmatter)
│ ├── toc boolean (include table of contents)
│ └── section_rules object
└── pdf_config
├── css string (DocRaptor stylesheet)
├── page_size enum (A4 | letter)
├── margins object
├── header_html string | null
└── footer_html string | null
Templates are created by workspace admins. A default template is provided at workspace setup.
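To show how the `markdown_config` flags might be applied, here is a minimal sketch that prepends YAML frontmatter and a table of contents to a markdown body. The function and class names are hypothetical, the frontmatter carries only a title, and the TOC covers only level 2 and 3 headings. All of these are assumptions. `section_rules` is left out because its shape is not specified above.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class MarkdownConfig:
    frontmatter: bool = True
    toc: bool = False


def slugify(heading: str) -> str:
    """Lower-case the heading, drop punctuation, and join words with hyphens."""
    slug = re.sub(r"[^\w\s-]", "", heading.lower()).strip()
    return re.sub(r"\s+", "-", slug)


def apply_markdown_config(body: str, title: str, config: MarkdownConfig) -> str:
    parts = []
    if config.frontmatter:
        # json.dumps yields a double-quoted, escaped string, which is also valid YAML.
        parts.append(f"---\ntitle: {json.dumps(title)}\n---")
    if config.toc:
        entries = []
        for line in body.splitlines():
            match = re.match(r"^(#{2,3})\s+(.*)$", line)
            if match:
                indent = "  " * (len(match.group(1)) - 2)
                text = match.group(2).strip()
                entries.append(f"{indent}- [{text}](#{slugify(text)})")
        if entries:
            parts.append("\n".join(entries))
    parts.append(body)
    return "\n\n".join(parts)
```

The heading scan is line-based, so a `##` line inside a fenced code block would be picked up as well. A real renderer would more likely walk a parsed markdown tree.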
6. Image Placeholder Handling
The Writer Agent inserts image placeholders as structured tags. Each tag carries a suggestion describing the intended image, a placement, and alt text.
The Report Writer handles these in order of preference:
- Asset provided: if the `image_assets` parameter includes an asset matching the placeholder's placement or alt text, it is substituted inline
- Asset not provided: the placeholder is rendered as a visible block in the output:
  - In markdown: a blockquote with the suggestion text and an `[IMAGE PLACEHOLDER]` label
  - In PDF: a styled box with the suggestion text and a visual indicator
Unresolved placeholders are recorded in the artefact metadata so a user or subsequent process can fill them later without regenerating the whole document.
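The matching and fallback rules for markdown output can be sketched as follows. The `Placeholder` and `ImageAsset` shapes are assumptions built from the fields mentioned in this document (tag, suggestion, placement, alt text). Trying placement before alt text, and the exact blockquote wording, are also assumptions. The sketch works on already-parsed placeholders and does not parse the tag syntax itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Placeholder:
    tag: str
    suggestion: str
    placement: str
    alt: str = ""


@dataclass
class ImageAsset:
    url: str
    placement: Optional[str] = None
    alt: Optional[str] = None


def find_asset(ph: Placeholder, assets: list[ImageAsset]) -> Optional[ImageAsset]:
    """Match on placement first, then fall back to alt text (assumed priority)."""
    for asset in assets:
        if asset.placement and asset.placement == ph.placement:
            return asset
    for asset in assets:
        if asset.alt and ph.alt and asset.alt == ph.alt:
            return asset
    return None


def render_markdown_placeholder(ph: Placeholder, assets: list[ImageAsset]) -> tuple[str, bool]:
    """Return the markdown for one placeholder and whether it was resolved."""
    asset = find_asset(ph, assets)
    if asset is not None:
        return f"![{ph.alt}]({asset.url})", True
    # Unresolved: a visible blockquote carrying the suggestion text.
    return f"> **[IMAGE PLACEHOLDER]** {ph.suggestion}", False
```

Placeholders for which the second return value is `False` are the ones a caller would record as unresolved in the artefact metadata.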
7. PDF Generation via DocRaptor
PDF rendering uses DocRaptor's HTML-to-PDF service. The process:
- The draft's markdown content is converted to HTML using the layout template's wrapper and CSS
- Image placeholders are rendered as styled HTML blocks if unresolved
- The HTML is submitted to DocRaptor's API
- The returned PDF binary is stored in the configured storage backend
- A thumbnail is generated from the first page if thumbnail generation is enabled
DocRaptor renders with the Prince engine, which produces high-quality typographic output from CSS. The CSS stylesheet in the layout template should be authored with Prince's CSS support in mind.
Failure modes:
- DocRaptor API error: retry once, then return a partial artefact (markdown only) with a failed PDF flag
- Storage write failure: retry with exponential backoff; the artefact is not marked as complete until both file and metadata are confirmed written
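The two failure modes can be sketched as small retry helpers. Both take a hypothetical callable that performs the real work (`submit` for the DocRaptor request, `write` for the storage write), so no DocRaptor or storage API is shown or implied here. The attempt count and delay values for the backoff are illustrative assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PdfResult:
    pdf_bytes: Optional[bytes]
    pdf_failed: bool  # the "failed PDF flag" on a partial (markdown-only) artefact


def render_pdf_with_retry(html: str, submit: Callable[[str], bytes], retries: int = 1) -> PdfResult:
    """Call `submit` once, retry `retries` more times on error, then give up."""
    for _attempt in range(retries + 1):
        try:
            return PdfResult(pdf_bytes=submit(html), pdf_failed=False)
        except Exception:
            # Broad catch for the sketch; real code would catch the client's error types.
            continue
    return PdfResult(pdf_bytes=None, pdf_failed=True)


def store_with_backoff(
    write: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `write` with exponentially growing delays; True only once it succeeds."""
    for attempt in range(max_attempts):
        try:
            write()
            return True
        except OSError:
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    return False
```

A caller would mark the artefact complete only after `store_with_backoff` has returned `True` for both the file and its metadata, which matches the rule stated above.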
8. Semantic Tokenisation
When an artefact is stored, its content is passed to the configured embedding model to produce a vector embedding. This embedding is stored with the artefact and indexed for semantic search.
This enables:
- Searching the media library by meaning rather than just tags or title
- Surfacing relevant past artefacts when a new research or writing task begins
- Detecting near-duplicate content before publishing
The embedding is generated from the full text of the artefact, not just the summary. Section boundaries are preserved in the embedding process where the model supports chunked input.
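A minimal sketch of two pieces of this process: splitting markdown at headings so section boundaries survive chunking, and comparing embeddings by cosine similarity to flag near-duplicates. The embedding model call itself is not shown. The function names and the 0.95 threshold are assumptions, not values from this specification.

```python
import math
import re

_HEADING_START = re.compile(r"^(?=#{1,6}\s)", re.MULTILINE)


def split_sections(markdown: str) -> list[str]:
    """Split at the start of each heading line so every chunk is one section."""
    return [chunk.strip() for chunk in _HEADING_START.split(markdown) if chunk.strip()]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_near_duplicates(
    candidate: list[float],
    library: dict[str, list[float]],
    threshold: float = 0.95,
) -> list[tuple[str, float]]:
    """Return (artefact_id, similarity) pairs at or above the threshold, best first."""
    scored = [(artefact_id, cosine_similarity(candidate, vec)) for artefact_id, vec in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Similarity scores are only comparable between vectors produced by the same model, which is one reason the artefact record stores `embedding_model` alongside the embedding.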
9. Output Structure
RenderedArtefact
├── artefact_id UUID
├── markdown_url string | null
├── pdf_url string | null
├── thumbnail_url string | null
├── summary string
├── unresolved_placeholders[]
│ ├── tag string
│ ├── suggestion string
│ └── placement string
├── embedding_status enum (complete | pending | failed)
└── publish_status enum (published | held | failed)
10. User Interface
10.1 Configuration Screen
- Draft input: select a verified draft from saved drafts, or from a pipeline run
- Output formats: checkboxes (Markdown, PDF)
- Layout template: dropdown with preview thumbnail
- Image assets: optional upload or select from media library; matched to placeholders automatically by placement/alt
- Summary length: toggle
- Tags: tag input
- Link to records: record picker (multi-select)
- Publish immediately: toggle
10.2 Progress View
- Status messages ("Rendering markdown…", "Submitting to DocRaptor…", "Generating embedding…")
- Estimated time for PDF generation (DocRaptor adds latency)
- Cancel button (cancels before submission to DocRaptor; cannot cancel mid-render)
10.3 Artefact View
Shown on completion:
- Rendered markdown preview
- PDF preview (embedded viewer or download link)
- Unresolved placeholder list with "Add image" action per placeholder
- Summary shown as it will appear in the media library
- Tags and linked records
- Publish / hold toggle
- Edit metadata (title, tags, description)
- Link to artefact in media library
10.4 Media Library View
A separate interface for browsing and managing artefacts:
- Grid or list view of artefacts with thumbnails
- Filter by tags, type, date, linked record, publish status
- Semantic search bar (queries against embeddings)
- Artefact detail view with full metadata, linked records, version history
- Download (markdown / PDF)
- Unpublish / archive / delete
11. Data Model Integration
| Data Object | Interaction |
|---|---|
| Task | Artefact linked to the Task that initiated the pipeline |
| Item | Rendered response to an Item saved as an artefact |
| Organisation | Published content linked to the relevant org record |
| Person | Author attribution stored against Person record |
| Note | Draft that was input to the Report Writer remains as a Note; the artefact is a separate output |
12. Use Cases
UC-1: Newsletter PDF production
A coordinator passes a verified newsletter draft to the Report Writer. It renders the markdown, generates a PDF via DocRaptor using the newsletter layout template, stores both in the media system, generates an embedding, and links the artefact to the relevant Issue Task. Three image placeholders remain unresolved and are flagged for manual action.
UC-2: Standalone report render
A user has a finished draft Note and wants to produce a PDF for distribution. They open the Report Writer, select the draft, choose the report layout template, upload two images to resolve placeholders, and submit. The PDF is generated and stored. They copy the artefact URL for distribution.
UC-3: Semantic search retrieval
A researcher starting a new content task opens the media library, types a semantic query, and finds three previously produced artefacts closely related to their topic. They attach these to the new Task as background references, avoiding redundant work.
13. Open Questions
- Should the Report Writer support versioning of artefacts — i.e. can an existing artefact be re-rendered from an updated draft, producing a new version rather than a new record?
- Thumbnail generation from PDF pages requires a server-side utility (e.g. Ghostscript). Is this feasible in the current infrastructure, or should thumbnails be skipped for the initial release?
- Embedding model selection affects search quality and cost. Should this be fixed per workspace or selectable per artefact type?
- Should unresolved image placeholders block publication, or is publication with placeholders permitted? This may need to be a configurable policy.