Long AI responses

AI "reasoning" models and long-context LLMs can produce responses that run to tens or hundreds of kilobytes of Markdown. Traditional renderers struggle with this volume; Markstream's virtualization keeps the UI responsive.

The problem with long AI responses

When an AI produces a 100 KB Markdown response with code blocks, tables, and diagrams:

  1. DOM nodes explode: every paragraph, list item, and inline element becomes a DOM node
  2. Memory grows with document size: 100 KB of Markdown can produce thousands of DOM nodes
  3. Heavy blocks multiply: multiple Mermaid diagrams, code blocks, and math expressions all render simultaneously
  4. Scrolling degrades: the browser layout engine struggles with thousands of nodes

Markstream's virtualization

Markstream uses viewport-aware rendering to bound the number of live DOM nodes:

vue
<MarkdownRender
  :content="longResponse"
  :final="isDone"
  node-virtual="auto"
  :max-live-nodes="200"
/>

  • node-virtual: enables node-level virtualization inside this document
  • max-live-nodes: maximum number of simultaneously rendered nodes (default depends on mode)
  • Nodes outside the viewport are replaced with lightweight placeholders
  • As the user scrolls, nodes enter and leave the viewport
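
The windowing idea behind these bullets can be sketched in a few lines. This is a minimal illustration, not Markstream's implementation: the function name computeLiveRange, the LiveRange type, and the assumption that every node's height is already known are all hypothetical. It finds the nodes that intersect the viewport, then spends any remaining budget on neighbours above and below so the live count never exceeds the cap.

```typescript
// Hypothetical sketch of viewport windowing; not Markstream's internals.
// Assumes node heights (in px) are already measured or estimated.

interface LiveRange {
  start: number; // index of the first live node
  end: number;   // index one past the last live node (exclusive)
}

function computeLiveRange(
  heights: number[],
  scrollTop: number,
  viewportHeight: number,
  maxLiveNodes: number,
): LiveRange {
  const viewportBottom = scrollTop + viewportHeight;
  let offset = 0;
  let first = -1;
  let last = -1;

  // Walk the nodes top to bottom and record which ones intersect the viewport.
  heights.forEach((height, index) => {
    const top = offset;
    const bottom = offset + height;
    if (bottom > scrollTop && top < viewportBottom) {
      if (first === -1) first = index;
      last = index;
    }
    offset = bottom;
  });

  // Nothing intersects the viewport (for example, an empty document).
  if (first === -1) return { start: 0, end: 0 };

  const visibleCount = last - first + 1;

  // If the viewport alone exceeds the budget, keep only the first maxLiveNodes.
  if (visibleCount >= maxLiveNodes) {
    return { start: first, end: first + maxLiveNodes };
  }

  // Spend the spare budget on neighbours: half above, the rest below.
  const spare = maxLiveNodes - visibleCount;
  const before = Math.min(first, Math.floor(spare / 2));
  const start = first - before;
  const end = Math.min(heights.length, last + 1 + (spare - before));
  return { start, end };
}
```

With 1000 nodes of 50 px each, a scroll offset of 5000 px, a 500 px viewport, and a budget of 20, this sketch returns start 95 and end 115: the 10 visible nodes plus 5 neighbours on each side. Everything outside that range would be a placeholder.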

virtual-scroll is an advanced protocol for outer timeline virtualizers (e.g. chat message lists). Most users should use node-virtual and max-live-nodes instead of enabling virtual-scroll directly.

Viewport-aware heavy blocks

Mermaid diagrams, Monaco code blocks, and other heavy blocks stay idle while offscreen:

vue
<MarkdownRender
  :content="longDoc"
  viewport-priority
  :defer-nodes-until-visible="true"
/>

  • viewport-priority: heavy blocks render only when approaching the viewport
  • defer-nodes-until-visible: nodes outside the viewport use minimal resources
  • As the user scrolls toward a Mermaid diagram, it activates and renders
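
To make "approaching the viewport" concrete, here is a small sketch of the geometry involved. It is an assumption about how such a check could work, not a description of Markstream's code: the BlockBox type, the isNearViewport function, and the margin parameter are hypothetical names. In a browser, IntersectionObserver with a rootMargin option provides a comparable "activate slightly early" behaviour without manual arithmetic.

```typescript
// Hypothetical sketch: decide whether a heavy block (Mermaid, Monaco, ...)
// is close enough to the viewport to be worth rendering.

interface BlockBox {
  top: number;    // distance from the top of the document, in px
  height: number; // block height, in px
}

function isNearViewport(
  block: BlockBox,
  scrollTop: number,
  viewportHeight: number,
  margin: number, // how far outside the viewport still counts as "near"
): boolean {
  const blockBottom = block.top + block.height;
  const zoneTop = scrollTop - margin;
  const zoneBottom = scrollTop + viewportHeight + margin;
  // The block is near when it overlaps the viewport expanded by the margin.
  return blockBottom >= zoneTop && block.top <= zoneBottom;
}

// Return the indexes of blocks that should activate at this scroll position.
function blocksToActivate(
  blocks: BlockBox[],
  scrollTop: number,
  viewportHeight: number,
  margin: number,
): number[] {
  const result: number[] = [];
  blocks.forEach((block, index) => {
    if (isNearViewport(block, scrollTop, viewportHeight, margin)) {
      result.push(index);
    }
  });
  return result;
}
```

A larger margin starts rendering earlier and hides the activation delay at the cost of more work per scroll; a margin of zero defers work until the block is actually on screen.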

Modes and virtualization

Different renderer modes have different virtualization defaults:

| Mode | Virtualization | Best for |
| --- | --- | --- |
| chat | Lightweight, no virtualization by default | Short to medium chat messages |
| docs | Virtualization enabled by default | Long technical documents |
| minimal | Same as chat, neutral name | Non-chat surfaces, lightweight |

For AI chat with potentially long responses, you can combine chat-mode pacing with docs-mode virtualization:

vue
<MarkdownRender
  mode="chat"
  :content="longResponse"
  :final="isDone"
  smooth-streaming="auto"
  :fade="false"
  node-virtual="auto"
  :max-live-nodes="300"
/>

Performance benchmarks

The numbers below are illustrative targets for planning. Run pnpm benchmark:1.0 on your target devices before using performance numbers as release criteria, and use the 1.0 Benchmark Report for reproducible release-gate methodology.

| Content size | Without virtualization | With virtualization |
| --- | --- | --- |
| 10 KB (short chat) | ~200 DOM nodes, smooth | ~200 DOM nodes, smooth |
| 50 KB (long answer) | ~1000 DOM nodes, slight lag | ~300 live nodes, smooth |
| 100 KB (reasoning) | ~2500 DOM nodes, noticeable lag | ~300 live nodes, smooth |
| 1 MB (technical doc) | May freeze browser | ~500 live nodes, scrollable |

When virtualization matters

Enable virtualization when:

  • AI responses routinely exceed 20 KB
  • Users report laggy scrolling in long conversations
  • Responses contain multiple Mermaid diagrams or code blocks
  • The app runs on mobile devices with limited memory

Skip virtualization when:

  • Responses are always short (< 10 KB)
  • You want the simplest possible setup
  • The target device is a desktop with abundant memory
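
The two checklists above can be folded into a single decision helper. The sketch below is one possible encoding, not part of Markstream: the ResponseProfile type, its field names, and the shouldVirtualize function are hypothetical, and the treatment of the 10 to 20 KB range (left off unless another signal applies) is an assumption rather than something the guidance specifies.

```typescript
// Hypothetical helper that encodes the checklist as code.
// Field names and the handling of the 10-20 KB gap are assumptions.

interface ResponseProfile {
  byteLength: number;              // size of the Markdown response in bytes
  heavyBlockCount: number;         // Mermaid diagrams, code blocks, and similar
  lowMemoryDevice: boolean;        // e.g. a mobile device with limited memory
  laggyScrollingReported: boolean; // users have reported laggy scrolling
}

const LARGE_RESPONSE_BYTES = 20 * 1024; // "routinely exceed 20 KB"

function shouldVirtualize(profile: ResponseProfile): boolean {
  // Any single "enable" signal from the checklist is enough.
  if (profile.byteLength > LARGE_RESPONSE_BYTES) return true;
  if (profile.heavyBlockCount > 1) return true;
  if (profile.lowMemoryDevice) return true;
  if (profile.laggyScrollingReported) return true;
  // Otherwise keep the simpler, non-virtualized setup.
  return false;
}
```

The result could drive whether node-virtual is passed to the component at all. Treat the thresholds as starting points and adjust them after measuring on your own target devices.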

See also