Report Writer Agent

Thinklio Built-in Agent Specification Version 0.1 | March 2026


1. Purpose and Problem Statement

The Report Writer Agent is a renderer. It takes a verified draft — a structured piece of content that has passed through the Writer and Fact Checker — and produces a finished output artefact: a formatted markdown file, a PDF, or both.

It is not a writing agent. It does not change content. Its responsibilities are:

  • Applying a layout template to produce a visually structured document
  • Resolving image placeholders (either with actual assets or as annotated placeholders)
  • Producing a PDF via DocRaptor where required
  • Storing the output artefact in the media system with metadata
  • Generating a summary of the artefact suitable for display
  • Generating a vector embedding of the artefact content for semantic search

The Report Writer is the last step in the standard pipeline before content is considered published or archived.


2. Position in the Pipeline

Fact Checker  →  [Report Writer]  →  Media System

It may also be invoked directly by a user who has a completed draft and simply needs it rendered and stored.


3. Media System

The media system is Thinklio's artefact store. It holds rendered output files — not drafts or working documents, which live in the Notes layer of the data model.

Each artefact in the media system has:

Artefact
├── artefact_id         UUID
├── title               string
├── description         string (short summary, auto-generated)
├── file_type           enum (markdown | pdf | both)
├── storage_url         string (internal URL to file)
├── thumbnail_url       string | null
├── created_at          timestamp
├── created_by          UUID (user or agent)
├── source_draft_id     UUID
├── source_report_id    UUID
├── tags                string[]
├── embedding           float[] (semantic search vector)
├── embedding_model     string
├── linked_records[]
│   ├── record_type     enum (Task | Item | Organisation | Person | Note)
│   └── record_id       UUID
└── published           boolean

The artefact store is separate from the Notes layer because artefacts are finished outputs with version identity, not editable working documents.
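As a minimal sketch, the artefact record above could be modelled as a Python dataclass. The class names, the choice of defaults, and the decision to make `published` default to `False` are illustrative assumptions; only the field names and types come from the structure above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional
from uuid import UUID, uuid4


class FileType(Enum):
    MARKDOWN = "markdown"
    PDF = "pdf"
    BOTH = "both"


class RecordType(Enum):
    TASK = "Task"
    ITEM = "Item"
    ORGANISATION = "Organisation"
    PERSON = "Person"
    NOTE = "Note"


@dataclass(frozen=True)
class LinkedRecord:
    record_type: RecordType
    record_id: UUID


@dataclass
class Artefact:
    # Required fields, named as in the structure above.
    title: str
    description: str
    file_type: FileType
    storage_url: str
    created_by: UUID
    source_draft_id: UUID
    source_report_id: UUID
    embedding_model: str
    # Defaults below are assumptions made for this sketch.
    artefact_id: UUID = field(default_factory=uuid4)
    thumbnail_url: Optional[str] = None
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    tags: List[str] = field(default_factory=list)
    embedding: List[float] = field(default_factory=list)
    linked_records: List[LinkedRecord] = field(default_factory=list)
    published: bool = False
```

Keeping `LinkedRecord` as a small frozen type makes the allowed record types explicit, so an artefact cannot be linked to an object outside the data model.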


4. Configuration

4.1 Admin Configuration

| Setting | Description |
| --- | --- |
| DocRaptor API key | Required for PDF generation |
| Default PDF template | Workspace-level CSS/HTML template used by DocRaptor |
| Storage backend | Where artefact files are stored (e.g. Cloudflare R2, S3) |
| Embedding model | Which model to use for semantic tokenisation |
| Auto-publish | Whether artefacts are published immediately or held for review |
| Thumbnail generation | Whether PDF thumbnails are auto-generated on render |

4.2 Run-time Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| draft | Draft | The verified draft to render |
| output_formats | enum[] | markdown, pdf, or both |
| layout_template | reference | Named layout template. Determines CSS/typography for PDF, and heading/section conventions for markdown. |
| image_assets | ImageAsset[] | Optional actual image assets to substitute for placeholder tags. Unresolved placeholders remain as annotated blocks. |
| summary_length | enum | short (1–2 sentences) or long (short paragraph). Used for artefact description. |
| tags | string[] | Tags to apply to the artefact in the media system. |
| link_to | reference[] | Records to link the artefact to in the data model. |
| publish | boolean | Whether to mark the artefact as published immediately. Overrides admin auto-publish setting. |
| language | string | BCP 47 tag. Used for PDF language metadata. |
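Two of the run-time parameters interact with other settings: publish overrides the admin auto-publish setting, and output_formats accepts "both" as shorthand. The sketch below shows one way these could be resolved. The function names are hypothetical, and treating an omitted publish parameter as `None` (falling back to the admin setting) is an assumption not stated in this specification.

```python
from typing import Iterable, Optional, Set

VALID_FORMATS = {"markdown", "pdf", "both"}


def resolve_publish(run_time_publish: Optional[bool],
                    admin_auto_publish: bool) -> bool:
    """Return the effective publish decision for a run.

    Assumption: None means the caller did not supply the parameter,
    so the workspace-level auto-publish setting applies.
    """
    if run_time_publish is None:
        return admin_auto_publish
    return run_time_publish


def normalise_output_formats(output_formats: Iterable[str]) -> Set[str]:
    """Expand 'both' and reject unknown or empty format lists."""
    requested = set(output_formats)
    unknown = requested - VALID_FORMATS
    if unknown:
        raise ValueError(f"unknown output formats: {sorted(unknown)}")
    if not requested:
        raise ValueError("at least one output format is required")
    if "both" in requested:
        return {"markdown", "pdf"}
    return requested
```

For example, `resolve_publish(False, True)` holds the artefact for review even though the workspace auto-publishes by default.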

5. Layout Templates

A layout template is a named configuration that controls how the rendered output looks. For PDF output, it includes a DocRaptor-compatible CSS stylesheet and HTML wrapper. For markdown, it defines heading hierarchy conventions, section separator styles, and metadata block placement.

Template properties:

LayoutTemplate
├── template_id         UUID
├── name                string
├── description         string
├── markdown_config
│   ├── frontmatter     boolean (include YAML frontmatter)
│   ├── toc             boolean (include table of contents)
│   └── section_rules   object
└── pdf_config
    ├── css             string (DocRaptor stylesheet)
    ├── page_size       enum (A4 | letter)
    ├── margins         object
    ├── header_html     string | null
    └── footer_html     string | null

Templates are created by workspace admins. A default template is provided at workspace setup.
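To illustrate how the frontmatter and toc flags in markdown_config might be applied, here is a minimal sketch. The function names are hypothetical, the frontmatter contains only a title (the actual frontmatter fields are not defined in this specification), and section_rules is left out because its shape is unspecified.

```python
import json
import re

# Matches ATX-style headings such as "## Details".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the sketch readable


def build_toc(markdown: str) -> str:
    """Build a nested bullet list from the headings in a markdown body."""
    entries = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' lines inside code fences
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            depth = len(match.group(1)) - 1
            entries.append("  " * depth + "- " + match.group(2))
    return "\n".join(entries)


def apply_markdown_config(body: str, title: str,
                          frontmatter: bool, toc: bool) -> str:
    """Prepend optional YAML frontmatter and a table of contents."""
    parts = []
    if frontmatter:
        # json.dumps yields a double-quoted string that is also valid YAML.
        parts.append("---\ntitle: " + json.dumps(title) + "\n---")
    if toc:
        contents = build_toc(body)
        if contents:
            parts.append(contents)
    parts.append(body)
    return "\n\n".join(parts)
```

With both flags off, the body passes through unchanged, which matches the requirement that the Report Writer does not alter content.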


6. Image Placeholder Handling

The Writer Agent inserts image placeholders as structured tags:

{{IMAGE: alt="..." suggestion="..." placement="..."}}

The Report Writer handles these in order of preference:

  1. Asset provided: if the image_assets parameter includes an asset matching the placeholder's placement or alt text, it is substituted inline
  2. Asset not provided: the placeholder is rendered as a visible block in the output:
       • In markdown: a blockquote with the suggestion text and an [IMAGE PLACEHOLDER] label
       • In PDF: a styled box with the suggestion text and a visual indicator

Unresolved placeholders are recorded in the artefact metadata so a user or subsequent process can fill them later without regenerating the whole document.
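The markdown branch of this handling can be sketched as follows. The `ImageAsset` shape (url, placement, alt) is hypothetical, as is matching by exact string equality; the tag syntax and the [IMAGE PLACEHOLDER] label are taken from the text above.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

PLACEHOLDER = re.compile(r"\{\{IMAGE:\s*(.*?)\}\}", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


@dataclass(frozen=True)
class ImageAsset:
    # Hypothetical shape: the specification does not define ImageAsset.
    url: str
    placement: str = ""
    alt: str = ""


def resolve_placeholders(
    markdown: str, assets: List[ImageAsset]
) -> Tuple[str, List[Dict[str, str]]]:
    """Substitute matching assets; render the rest as annotated blockquotes.

    Returns the rewritten markdown and the list of unresolved placeholders.
    """
    unresolved: List[Dict[str, str]] = []

    def substitute(match: "re.Match[str]") -> str:
        attrs = dict(ATTRIBUTE.findall(match.group(1)))
        alt = attrs.get("alt", "")
        placement = attrs.get("placement", "")
        suggestion = attrs.get("suggestion", "")
        for asset in assets:
            # Match on placement or alt text, as described above.
            if (placement and asset.placement == placement) or (
                alt and asset.alt == alt
            ):
                return f"![{alt}]({asset.url})"
        unresolved.append(
            {"tag": match.group(0), "suggestion": suggestion,
             "placement": placement}
        )
        return f"> [IMAGE PLACEHOLDER] {suggestion}"

    return PLACEHOLDER.sub(substitute, markdown), unresolved
```

The unresolved list carries the original tag text, so a later process can locate and fill each placeholder without re-rendering the whole document.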


7. PDF Generation via DocRaptor

PDF rendering uses DocRaptor's HTML-to-PDF service. The process:

  1. The draft's markdown content is converted to HTML using the layout template's wrapper and CSS
  2. Image placeholders are rendered as styled HTML blocks if unresolved
  3. The HTML is submitted to DocRaptor's API
  4. The returned PDF binary is stored in the configured storage backend
  5. A thumbnail is generated from the first page if thumbnail generation is enabled

DocRaptor renders with the Prince engine, which gives high-quality typographic output from CSS, including paged-media features. The CSS stylesheet in the layout template should be authored for Prince accordingly.
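Step 1 of the process, wrapping converted HTML with the template's CSS, header and footer, might look like the sketch below. The function name and the exact wrapper markup are assumptions; the markdown-to-HTML conversion itself is out of scope here, so `body_html` is taken as already converted. The language argument corresponds to the BCP 47 run-time parameter.

```python
from html import escape
from typing import Optional


def wrap_html(body_html: str, css: str, language: str = "en",
              header_html: Optional[str] = None,
              footer_html: Optional[str] = None) -> str:
    """Assemble a complete HTML document ready for submission to a renderer."""
    parts = [
        "<!DOCTYPE html>",
        # The lang attribute carries the language metadata into the output.
        f'<html lang="{escape(language, quote=True)}">',
        "<head>",
        '<meta charset="utf-8">',
        f"<style>{css}</style>",
        "</head>",
        "<body>",
    ]
    if header_html:
        parts.append(f"<header>{header_html}</header>")
    parts.append(f"<main>{body_html}</main>")
    if footer_html:
        parts.append(f"<footer>{footer_html}</footer>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)
```

Header and footer markup are inserted only when the template defines them, mirroring the nullable header_html and footer_html fields of pdf_config.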

Failure modes:

  • DocRaptor API error: retry once, then return a partial artefact (markdown only) with a failed PDF flag
  • Storage write failure: retry with exponential backoff; the artefact is not marked as complete until both file and metadata are confirmed written
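The two failure policies can be sketched with injected callables, so that no real DocRaptor or storage call is assumed. The exception classes, function names, attempt limit and base delay are all hypothetical; only the policies themselves (one retry for PDF rendering, exponential backoff for storage writes) come from the text above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class PdfRenderError(Exception):
    """Hypothetical error raised when the PDF service call fails."""


class StorageWriteError(Exception):
    """Hypothetical error raised when a storage write fails."""


def render_pdf_with_retry(render: Callable[[], bytes]) -> Optional[bytes]:
    """Attempt the render, retry once, then give up and return None.

    A None result signals the caller to return a markdown-only artefact.
    """
    for _ in range(2):  # first attempt plus one retry
        try:
            return render()
        except PdfRenderError:
            continue
    return None


def write_with_backoff(write: Callable[[], T], max_attempts: int = 5,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry a storage write, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return write()
        except StorageWriteError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. The artefact would be marked complete only after `write_with_backoff` has returned for both the file and its metadata.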


8. Semantic Tokenisation

When an artefact is stored, its content is passed to the configured embedding model to produce a vector embedding. This embedding is stored with the artefact and indexed for semantic search.

This enables:

  • Searching the media library by meaning rather than just tags or title
  • Surfacing relevant past artefacts when a new research or writing task begins
  • Detecting near-duplicate content before publishing

The embedding is generated from the full text of the artefact, not just the summary. Section boundaries are preserved in the embedding process where the model supports chunked input.
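As an illustration of section-aware chunking and near-duplicate detection, the sketch below splits markdown at heading boundaries and compares embeddings by cosine similarity. The function names and the 0.95 threshold are assumptions, and the embedding vectors are supplied by the caller because the embedding model's interface is not defined in this specification.

```python
import math
import re
from typing import Dict, List, Sequence


def split_sections(markdown: str) -> List[str]:
    """Split a document into chunks, one per heading-led section."""
    # Zero-width split immediately before each heading line.
    parts = re.split(r"(?m)^(?=#{1,6}\s)", markdown)
    return [part.strip() for part in parts if part.strip()]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_near_duplicates(candidate: Sequence[float],
                         stored: Dict[str, Sequence[float]],
                         threshold: float = 0.95) -> List[str]:
    """Return ids of stored artefacts whose embedding is close to the candidate."""
    return [
        artefact_id
        for artefact_id, vector in stored.items()
        if cosine_similarity(candidate, vector) >= threshold
    ]
```

A production system would normally delegate the similarity search to a vector index rather than scanning every stored embedding; the linear scan here only shows the comparison being made.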


9. Output Structure

RenderedArtefact
├── artefact_id         UUID
├── markdown_url        string | null
├── pdf_url             string | null
├── thumbnail_url       string | null
├── summary             string
├── unresolved_placeholders[]
│   ├── tag             string
│   ├── suggestion      string
│   └── placement       string
├── embedding_status    enum (complete | pending | failed)
└── publish_status      enum (published | held | failed)
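A minimal sketch of this output structure as a Python dataclass follows. The class names, the default status values, and the `file_type` helper (which derives the media system's file_type from whichever URLs are present) are assumptions for illustration. A partial artefact after a failed PDF render would simply have `pdf_url` left as `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

EMBEDDING_STATUSES = {"complete", "pending", "failed"}
PUBLISH_STATUSES = {"published", "held", "failed"}


@dataclass(frozen=True)
class UnresolvedPlaceholder:
    tag: str
    suggestion: str
    placement: str


@dataclass
class RenderedArtefact:
    artefact_id: str
    summary: str
    markdown_url: Optional[str] = None
    pdf_url: Optional[str] = None
    thumbnail_url: Optional[str] = None
    unresolved_placeholders: List[UnresolvedPlaceholder] = field(
        default_factory=list
    )
    embedding_status: str = "pending"  # assumed default
    publish_status: str = "held"       # assumed default

    def __post_init__(self) -> None:
        # Reject values outside the enums listed in the structure above.
        if self.embedding_status not in EMBEDDING_STATUSES:
            raise ValueError(f"invalid embedding_status: {self.embedding_status}")
        if self.publish_status not in PUBLISH_STATUSES:
            raise ValueError(f"invalid publish_status: {self.publish_status}")

    def file_type(self) -> str:
        """Derive markdown | pdf | both from the URLs that are set."""
        if self.markdown_url and self.pdf_url:
            return "both"
        if self.pdf_url:
            return "pdf"
        if self.markdown_url:
            return "markdown"
        raise ValueError("artefact has no rendered files")
```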

10. User Interface

10.1 Configuration Screen

  • Draft input: select a verified draft from saved drafts, or from a pipeline run
  • Output formats: checkboxes (Markdown, PDF)
  • Layout template: dropdown with preview thumbnail
  • Image assets: optional upload or select from media library; matched to placeholders automatically by placement/alt
  • Summary length: toggle
  • Tags: tag input
  • Link to records: record picker (multi-select)
  • Publish immediately: toggle

10.2 Progress View

  • Status messages ("Rendering markdown…", "Submitting to DocRaptor…", "Generating embedding…")
  • Estimated time for PDF generation (DocRaptor adds latency)
  • Cancel button (cancels before submission to DocRaptor; cannot cancel mid-render)

10.3 Artefact View

Shown on completion:

  • Rendered markdown preview
  • PDF preview (embedded viewer or download link)
  • Unresolved placeholder list with "Add image" action per placeholder
  • Summary shown as it will appear in the media library
  • Tags and linked records
  • Publish / hold toggle
  • Edit metadata (title, tags, description)
  • Link to artefact in media library

10.4 Media Library View

A separate interface for browsing and managing artefacts:

  • Grid or list view of artefacts with thumbnails
  • Filter by tags, type, date, linked record, publish status
  • Semantic search bar (queries against embeddings)
  • Artefact detail view with full metadata, linked records, version history
  • Download (markdown / PDF)
  • Unpublish / archive / delete

11. Data Model Integration

| Data Object | Interaction |
| --- | --- |
| Task | Artefact linked to the Task that initiated the pipeline |
| Item | Rendered response to an Item saved as an artefact |
| Organisation | Published content linked to the relevant org record |
| Person | Author attribution stored against Person record |
| Note | Draft that was input to the Report Writer remains as a Note; the artefact is a separate output |

12. Use Cases

UC-1: Newsletter PDF production

A coordinator passes a verified newsletter draft to the Report Writer. It renders the markdown, generates a PDF via DocRaptor using the newsletter layout template, stores both in the media system, generates an embedding, and links the artefact to the relevant Issue Task. Three image placeholders remain unresolved and are flagged for manual action.

UC-2: Standalone report render

A user has a finished draft Note and wants to produce a PDF for distribution. They open the Report Writer, select the draft, choose the report layout template, upload two images to resolve placeholders, and submit. The PDF is generated and stored. They copy the artefact URL for distribution.

UC-3: Semantic search retrieval

A researcher starting a new content task opens the media library, types a semantic query, and finds three previously produced artefacts closely related to their topic. They attach these to the new Task as background references, avoiding redundant work.


13. Open Questions

  • Should the Report Writer support versioning of artefacts — i.e. can an existing artefact be re-rendered from an updated draft, producing a new version rather than a new record?
  • Thumbnail generation from PDF pages requires a server-side utility (e.g. Ghostscript). Is this feasible in the current infrastructure, or should thumbnails be skipped for the initial release?
  • Embedding model selection affects search quality and cost. Should this be fixed per workspace or selectable per artefact type?
  • Should unresolved image placeholders block publication, or is publication with placeholders permitted? This may need to be a configurable policy.
