Report Writer Agent

Thinklio Built-in Agent Specification Version 0.1 | March 2026

1. Purpose and Problem Statement

The Report Writer Agent is a renderer. It takes a verified draft — a structured piece of content that has passed through the Writer and Fact Checker — and produces a finished output artefact: a formatted markdown file, a PDF, or both.

It is not a writing agent. It does not change content. Its responsibilities are:

Applying a layout template to produce a visually structured document
Resolving image placeholders (either with actual assets or as annotated placeholders)
Producing a PDF via DocRaptor where required
Storing the output artefact in the media system with metadata
Generating a summary of the artefact suitable for display
Tokenising the artefact content for semantic search

The Report Writer is the last step in the standard pipeline before content is considered published or archived.

2. Position in the Pipeline

Fact Checker  →  [Report Writer]  →  Media System

It may also be invoked directly by a user who has a completed draft and simply needs it rendered and stored.

3. Media System

The media system is Thinklio's artefact store. It holds rendered output files — not drafts or working documents, which live in the Notes layer of the data model.

Each artefact in the media system has:

Artefact
├── artefact_id         UUID
├── title               string
├── description         string (short summary, auto-generated)
├── file_type           enum (markdown | pdf | both)
├── storage_url         string (internal URL to file)
├── thumbnail_url       string | null
├── created_at          timestamp
├── created_by          UUID (user or agent)
├── source_draft_id     UUID
├── source_report_id    UUID
├── tags                string[]
├── embedding           float[] (semantic search vector)
├── embedding_model     string
├── linked_records[]
│   ├── record_type     enum (Task | Item | Organisation | Person | Note)
│   └── record_id       UUID
└── published           boolean
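
As an illustration only, the record above could be modelled in application code roughly as follows. This is a minimal sketch: the class names, the default values, and the example URL are assumptions made for the example, not part of the specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class FileType(Enum):
    MARKDOWN = "markdown"
    PDF = "pdf"
    BOTH = "both"


class RecordType(Enum):
    TASK = "Task"
    ITEM = "Item"
    ORGANISATION = "Organisation"
    PERSON = "Person"
    NOTE = "Note"


@dataclass
class LinkedRecord:
    record_type: RecordType
    record_id: UUID


@dataclass
class Artefact:
    # Required fields, mirroring the tree above
    title: str
    description: str
    file_type: FileType
    storage_url: str
    created_by: UUID
    source_draft_id: UUID
    source_report_id: UUID
    embedding_model: str
    # Fields with defaults (the defaults themselves are assumptions)
    artefact_id: UUID = field(default_factory=uuid4)
    thumbnail_url: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    tags: list[str] = field(default_factory=list)
    embedding: list[float] = field(default_factory=list)
    linked_records: list[LinkedRecord] = field(default_factory=list)
    published: bool = False


# Hypothetical example record; the URL and model name are placeholders
example = Artefact(
    title="Quarterly newsletter",
    description="Short auto-generated summary.",
    file_type=FileType.BOTH,
    storage_url="https://media.example.invalid/artefacts/newsletter",
    created_by=uuid4(),
    source_draft_id=uuid4(),
    source_report_id=uuid4(),
    embedding_model="example-embedding-model",
    linked_records=[LinkedRecord(RecordType.TASK, uuid4())],
)
```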

The artefact store is separate from the Notes layer because artefacts are finished outputs with version identity, not editable working documents.

4. Configuration

4.1 Admin Configuration

| Setting | Description |
| --- | --- |
| DocRaptor API key | Required for PDF generation |
| Default PDF template | Workspace-level CSS/HTML template used by DocRaptor |
| Storage backend | Where artefact files are stored (e.g. Cloudflare R2, S3) |
| Embedding model | Which model to use for semantic tokenisation |
| Auto-publish | Whether artefacts are published immediately or held for review |
| Thumbnail generation | Whether PDF thumbnails are auto-generated on render |

4.2 Run-time Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `draft` | Draft | The verified draft to render |
| `output_formats` | enum[] | `markdown`, `pdf`, or both |
| `layout_template` | reference | Named layout template. Determines CSS/typography for PDF, and heading/section conventions for markdown. |
| `image_assets` | ImageAsset[] | Optional actual image assets to substitute for placeholder tags. Unresolved placeholders remain as annotated blocks. |
| `summary_length` | enum | `short` (1–2 sentences) or `long` (short paragraph). Used for artefact description. |
| `tags` | string[] | Tags to apply to the artefact in the media system. |
| `link_to` | reference[] | Records to link the artefact to in the data model. |
| `publish` | boolean | Whether to mark the artefact as published immediately. Overrides admin auto-publish setting. |
| `language` | string | BCP 47 tag. Used for PDF language metadata. |
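
The interaction between the run-time `publish` parameter and the admin auto-publish setting, together with basic validation of `output_formats`, might look like the sketch below. It assumes that an omitted `publish` value is represented as `None` and that at least one output format is required; neither point is stated in the specification, and the function names are hypothetical.

```python
from typing import Optional

ALLOWED_FORMATS = {"markdown", "pdf"}


def validate_output_formats(output_formats: list[str]) -> list[str]:
    """Return the requested formats without duplicates, keeping input order."""
    if not output_formats:
        raise ValueError("output_formats must name at least one format")
    unknown = [f for f in output_formats if f not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported output formats: {unknown}")
    # dict.fromkeys drops duplicates while preserving first-seen order
    return list(dict.fromkeys(output_formats))


def resolve_publish(publish: Optional[bool], admin_auto_publish: bool) -> bool:
    """Use the run-time value when given; otherwise fall back to the admin setting."""
    return admin_auto_publish if publish is None else publish
```

With these definitions, `resolve_publish(False, admin_auto_publish=True)` holds the artefact for review even in a workspace that auto-publishes by default.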

5. Layout Templates

A layout template is a named configuration that controls how the rendered output looks. For PDF output, it includes a DocRaptor-compatible CSS stylesheet and HTML wrapper. For markdown, it defines heading hierarchy conventions, section separator styles, and metadata block placement.

Template properties:

LayoutTemplate
├── template_id         UUID
├── name                string
├── description         string
├── markdown_config
│   ├── frontmatter     boolean (include YAML frontmatter)
│   ├── toc             boolean (include table of contents)
│   └── section_rules   object
└── pdf_config
    ├── css             string (DocRaptor stylesheet)
    ├── page_size       enum (A4 | letter)
    ├── margins         object
    ├── header_html     string | null
    └── footer_html     string | null

Templates are created by workspace admins. A default template is provided at workspace setup.
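
To make the `markdown_config` flags concrete, the following sketch applies `frontmatter` and `toc` to a markdown body. It rests on assumptions: the frontmatter carries only a title, the table of contents lists level-2 headings, and the anchor format is a simple slug. `section_rules` is left out because its shape is not defined here, and headings inside fenced code blocks are not treated specially.

```python
import json
import re


def slugify(heading: str) -> str:
    """Lowercase, drop punctuation, and hyphenate whitespace to build an anchor."""
    cleaned = re.sub(r"[^\w\s-]", "", heading.lower()).strip()
    return re.sub(r"\s+", "-", cleaned)


def build_toc(markdown: str) -> str:
    """List every level-2 heading as a linked bullet."""
    headings = re.findall(r"^## +(.+)$", markdown, flags=re.MULTILINE)
    return "\n".join(f"- [{h.strip()}](#{slugify(h)})" for h in headings)


def apply_markdown_config(markdown: str, title: str, frontmatter: bool, toc: bool) -> str:
    """Prepend optional YAML frontmatter and a table of contents to the body."""
    parts = []
    if frontmatter:
        # A JSON string literal is also a valid YAML double-quoted scalar
        parts.append(f"---\ntitle: {json.dumps(title)}\n---")
    if toc:
        entries = build_toc(markdown)
        if entries:
            parts.append(entries)
    parts.append(markdown)
    return "\n\n".join(parts)
```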

6. Image Placeholder Handling

The Writer Agent inserts image placeholders as structured tags:

{{IMAGE: alt="..." suggestion="..." placement="..."}}

The Report Writer handles these in order of preference:

1. Asset provided: if the `image_assets` parameter includes an asset matching the placeholder's placement or alt text, it is substituted inline.
2. Asset not provided: the placeholder is rendered as a visible block in the output:
   - In markdown: a blockquote with the suggestion text and an [IMAGE PLACEHOLDER] label
   - In PDF: a styled box with the suggestion text and a visual indicator

Unresolved placeholders are recorded in the artefact metadata so a user or subsequent process can fill them later without regenerating the whole document.
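
The markdown side of this handling can be sketched as below. The matching rule is an assumption (exact match on placement first, then on alt text), as are the `ImageAsset` fields and the exact blockquote wording. The sketch also assumes each placeholder sits on its own line.

```python
import re
from dataclasses import dataclass

PLACEHOLDER_RE = re.compile(r"\{\{IMAGE:\s*(.*?)\}\}")
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


@dataclass
class ImageAsset:
    # Hypothetical shape; the specification does not define ImageAsset fields
    url: str
    placement: str = ""
    alt: str = ""


def resolve_placeholders(markdown: str, assets: list[ImageAsset]):
    """Return (rendered_markdown, unresolved_placeholders)."""
    unresolved = []

    def find_asset(attrs):
        # Prefer a placement match, then fall back to alt text
        for key in ("placement", "alt"):
            wanted = attrs.get(key)
            if not wanted:
                continue
            for asset in assets:
                if getattr(asset, key) == wanted:
                    return asset
        return None

    def substitute(match):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        asset = find_asset(attrs)
        if asset is not None:
            return f"![{attrs.get('alt', '')}]({asset.url})"
        unresolved.append({
            "tag": match.group(0),
            "suggestion": attrs.get("suggestion", ""),
            "placement": attrs.get("placement", ""),
        })
        return f"> [IMAGE PLACEHOLDER] {attrs.get('suggestion', '')}"

    rendered = PLACEHOLDER_RE.sub(substitute, markdown)
    return rendered, unresolved
```

The `unresolved` list uses the same three keys as the `unresolved_placeholders[]` entries in the output structure, so it could be stored with the artefact metadata directly.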

7. PDF Generation via DocRaptor

PDF rendering uses DocRaptor's HTML-to-PDF service. The process:

1. The draft's markdown content is converted to HTML using the layout template's wrapper and CSS
2. Image placeholders are rendered as styled HTML blocks if unresolved
3. The HTML is submitted to DocRaptor's API
4. The returned PDF binary is stored in the configured storage backend
5. A thumbnail is generated from the first page if thumbnail generation is enabled

DocRaptor renders PDFs with the Prince engine, which gives high-quality typographic output from CSS. The CSS stylesheet in the layout template should be authored with Prince's print-oriented CSS support in mind.
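
A rough sketch of preparing the submission is shown below. It only builds the HTML document and the JSON request body, and does not send anything. The field names (`document_type`, `document_content`, `name`, `test`) are taken to match DocRaptor's documented JSON request and should be confirmed against the current API reference. The helper names are hypothetical, and markdown-to-HTML conversion is assumed to have happened already.

```python
import html
import json


def wrap_html(body_html: str, css: str, title: str, language: str = "en") -> str:
    """Wrap already-converted body HTML in a minimal document with the template CSS."""
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{html.escape(language, quote=True)}">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        f"<style>{css}</style>\n"
        "</head>\n"
        f"<body>{body_html}</body>\n"
        "</html>"
    )


def build_docraptor_request(document_html: str, name: str, test: bool = True) -> bytes:
    """Build the JSON body for a PDF request (field names assumed, see lead-in)."""
    payload = {
        "document_type": "pdf",
        "document_content": document_html,
        "name": name,
        # Defaulting to test mode here is a choice made for this sketch
        "test": test,
    }
    return json.dumps(payload).encode("utf-8")
```

The `language` argument corresponds to the BCP 47 run-time parameter. Setting it on the `<html>` element is one plausible way to carry language metadata into the rendered PDF.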

Failure modes:

- DocRaptor API error: retry once, then return a partial artefact (markdown only) with a failed PDF flag
- Storage write failure: retry with exponential backoff; the artefact is not marked as complete until both file and metadata are confirmed written
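
The two failure policies could be sketched as follows. The callables, the attempt limit, and the delay values are illustrative assumptions. A real implementation would probably catch a narrower exception type than `Exception` for the render call.

```python
import time
from typing import Callable, Optional


def render_pdf_with_retry(
    render: Callable[[], bytes], retries: int = 1
) -> tuple[Optional[bytes], bool]:
    """Return (pdf_bytes, pdf_failed); one retry by default, then give up."""
    for attempt in range(retries + 1):
        try:
            return render(), False
        except Exception:
            if attempt == retries:
                break
    # Caller continues with a markdown-only artefact and the failed PDF flag set
    return None, True


def store_with_backoff(
    write: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry a storage write, doubling the delay after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            write()
            return True
        except OSError:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # The artefact must not be marked complete when this returns False
    return False
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.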

8. Semantic Tokenisation

When an artefact is stored, its content is passed to the configured embedding model to produce a vector embedding. This embedding is stored with the artefact and indexed for semantic search.

This enables:

- Searching the media library by meaning rather than just tags or title
- Surfacing relevant past artefacts when a new research or writing task begins
- Detecting near-duplicate content before publishing

The embedding is generated from the full text of the artefact, not just the summary. Section boundaries are preserved in the embedding process where the model supports chunked input.
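
Near-duplicate detection over stored embeddings is commonly done with cosine similarity. The sketch below illustrates the idea with plain lists. The 0.95 threshold and the in-memory dictionary of vectors are assumptions for the example; a production system would more likely query a vector index.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicates(
    candidate: list[float],
    stored: dict[str, list[float]],
    threshold: float = 0.95,
) -> list[tuple[str, float]]:
    """Return (artefact_id, score) pairs at or above the threshold, best first."""
    scored = [(aid, cosine_similarity(candidate, vec)) for aid, vec in stored.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Embeddings are only comparable when they come from the same model, which is one reason the artefact record stores `embedding_model` alongside the vector.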

9. Output Structure

RenderedArtefact
├── artefact_id         UUID
├── markdown_url        string | null
├── pdf_url             string | null
├── thumbnail_url       string | null
├── summary             string
├── unresolved_placeholders[]
│   ├── tag             string
│   ├── suggestion      string
│   └── placement       string
├── embedding_status    enum (complete | pending | failed)
└── publish_status      enum (published | held | failed)

10. User Interface

10.1 Configuration Screen

Draft input: select a verified draft from saved drafts, or from a pipeline run
Output formats: checkboxes (Markdown, PDF)
Layout template: dropdown with preview thumbnail
Image assets: optional upload or select from media library; matched to placeholders automatically by placement/alt
Summary length: toggle
Tags: tag input
Link to records: record picker (multi-select)
Publish immediately: toggle

10.2 Progress View

Status messages ("Rendering markdown…", "Submitting to DocRaptor…", "Generating embedding…")
Estimated time for PDF generation (DocRaptor adds latency)
Cancel button (cancels before submission to DocRaptor; cannot cancel mid-render)

10.3 Artefact View

Shown on completion:

Rendered markdown preview
PDF preview (embedded viewer or download link)
Unresolved placeholder list with "Add image" action per placeholder
Summary shown as it will appear in the media library
Tags and linked records
Publish / hold toggle
Edit metadata (title, tags, description)
Link to artefact in media library

10.4 Media Library View

A separate interface for browsing and managing artefacts:

Grid or list view of artefacts with thumbnails
Filter by tags, type, date, linked record, publish status
Semantic search bar (queries against embeddings)
Artefact detail view with full metadata, linked records, version history
Download (markdown / PDF)
Unpublish / archive / delete

11. Data Model Integration

| Data Object | Interaction |
| --- | --- |
| Task | Artefact linked to the Task that initiated the pipeline |
| Item | Rendered response to an Item saved as an artefact |
| Organisation | Published content linked to the relevant org record |
| Person | Author attribution stored against Person record |
| Note | Draft that was input to the Report Writer remains as a Note; the artefact is a separate output |

12. Use Cases

UC-1: Newsletter render in a pipeline run

A coordinator passes a verified newsletter draft to the Report Writer. It renders the markdown, generates a PDF via DocRaptor using the newsletter layout template, stores both in the media system, generates an embedding, and links the artefact to the relevant Issue Task. Three image placeholders remain unresolved and are flagged for manual action.

UC-2: Standalone report render

A user has a finished draft Note and wants to produce a PDF for distribution. They open the Report Writer, select the draft, choose the report layout template, upload two images to resolve placeholders, and submit. The PDF is generated and stored. They copy the artefact URL for distribution.

UC-3: Semantic search retrieval

A researcher starting a new content task opens the media library, types a semantic query, and finds three previously produced artefacts closely related to their topic. They attach these to the new Task as background references, avoiding redundant work.

13. Open Questions

Should the Report Writer support versioning of artefacts — i.e. can an existing artefact be re-rendered from an updated draft, producing a new version rather than a new record?
Thumbnail generation from PDF pages requires a server-side utility (e.g. Ghostscript). Is this feasible in the current infrastructure, or should thumbnails be skipped for the initial release?
Embedding model selection affects search quality and cost. Should this be fixed per workspace or selectable per artefact type?
Should unresolved image placeholders block publication, or is publication with placeholders permitted? This may need to be a configurable policy.
