Long AI responses
AI "reasoning" models and long-context LLMs can produce responses that run to tens or hundreds of kilobytes of Markdown. Traditional renderers struggle with this volume; Markstream's virtualization keeps the UI responsive.
The problem with long AI responses
When an AI produces a 100 KB Markdown response with code blocks, tables, and diagrams:
- DOM nodes explode: every paragraph, list item, and inline element becomes a DOM node
- Memory grows linearly: 100 KB of Markdown can produce thousands of DOM nodes
- Heavy blocks multiply: multiple Mermaid diagrams, code blocks, and math expressions all render simultaneously
- Scrolling degrades: the browser layout engine struggles with thousands of nodes
Markstream's virtualization
Markstream uses viewport-aware rendering to bound the number of live DOM nodes:
<MarkdownRender
  :content="longResponse"
  :final="isDone"
  node-virtual="auto"
  :max-live-nodes="200"
/>

- node-virtual: enables node-level virtualization inside this document
- max-live-nodes: maximum number of simultaneously rendered nodes (default depends on mode)
- Nodes outside the viewport are replaced with lightweight placeholders
- As the user scrolls, nodes enter and leave the viewport
virtual-scroll is an advanced protocol for outer timeline virtualizers (e.g. chat message lists). Most users should use node-virtual and max-live-nodes instead of enabling virtual-scroll directly.
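To make the idea of bounding live nodes concrete, here is a minimal sketch of the windowing step: given each node's position, pick the ones that intersect the viewport and cap the count. It illustrates the general technique only and is not Markstream's internal algorithm; NodeBox, LiveRange, and computeLiveRange are hypothetical names introduced for this example.

```typescript
// Hypothetical sketch of viewport windowing. Not Markstream's implementation.
interface NodeBox {
  top: number;    // offset from the top of the document, in px
  height: number; // measured or estimated height, in px
}

interface LiveRange {
  start: number; // index of the first live node (inclusive)
  end: number;   // index one past the last live node (exclusive)
}

function computeLiveRange(
  nodes: NodeBox[],
  scrollTop: number,
  viewportHeight: number,
  maxLiveNodes: number
): LiveRange {
  const viewportBottom = scrollTop + viewportHeight;
  let first = -1;
  let last = -1;

  nodes.forEach((node, index) => {
    const bottom = node.top + node.height;
    // A node is visible when its box overlaps the viewport interval.
    if (bottom > scrollTop && node.top < viewportBottom) {
      if (first === -1) first = index;
      last = index;
    }
  });

  // Nothing intersects the viewport: no live nodes.
  if (first === -1) return { start: 0, end: 0 };

  // Cap the visible run so the live set never exceeds the budget.
  const count = Math.min(last - first + 1, Math.max(0, maxLiveNodes));
  return { start: first, end: first + count };
}

// Example: ten 100px-tall nodes, a 300px viewport scrolled to 250px.
const exampleBoxes: NodeBox[] = [];
for (let i = 0; i < 10; i++) exampleBoxes.push({ top: i * 100, height: 100 });
const exampleRange = computeLiveRange(exampleBoxes, 250, 300, 200);
console.log(exampleRange);
```

Nodes outside the returned range would be the ones swapped for placeholders. A real virtualizer would also need to measure heights and usually renders a few extra nodes around the viewport to avoid flicker; both are omitted here for brevity.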
Viewport-aware heavy blocks
Mermaid diagrams, Monaco code blocks, and other heavy blocks stay idle while offscreen:
<MarkdownRender
  :content="longDoc"
  viewport-priority
  :defer-nodes-until-visible="true"
/>

- viewport-priority: heavy blocks render only when approaching the viewport
- defer-nodes-until-visible: nodes outside viewport use minimal resources
- As the user scrolls toward a Mermaid diagram, it activates and renders
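The "activates when approaching the viewport" behavior can be pictured as a small state transition: a heavy block stays idle until it comes within some margin of the viewport. The sketch below shows that decision under stated assumptions. HeavyBlock, nextBlockState, and the activationMargin parameter are hypothetical names, and keeping a block active once it has rendered is an assumption of this example, not documented Markstream behavior.

```typescript
// Hypothetical sketch of viewport-priority activation. Not Markstream's implementation.
type BlockState = "idle" | "active";

interface HeavyBlock {
  top: number;       // offset from the top of the document, in px
  height: number;    // block height, in px
  state: BlockState; // current render state
}

function nextBlockState(
  block: HeavyBlock,
  scrollTop: number,
  viewportHeight: number,
  activationMargin: number
): BlockState {
  // Assumption: once a heavy block has rendered, it is not torn down again.
  if (block.state === "active") return "active";

  // Extend the viewport by the margin on both sides to get the activation zone.
  const zoneTop = scrollTop - activationMargin;
  const zoneBottom = scrollTop + viewportHeight + activationMargin;
  const blockBottom = block.top + block.height;

  const nearViewport = blockBottom > zoneTop && block.top < zoneBottom;
  return nearViewport ? "active" : "idle";
}

// Example: a diagram 2000px down the page, checked at two scroll positions.
const exampleDiagram: HeavyBlock = { top: 2000, height: 400, state: "idle" };
console.log(nextBlockState(exampleDiagram, 0, 800, 500));
console.log(nextBlockState(exampleDiagram, 800, 800, 500));
```

In a browser, the same idea is commonly expressed with IntersectionObserver and its rootMargin option rather than manual scroll arithmetic.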
Modes and virtualization
Different renderer modes have different virtualization defaults:
| Mode | Virtualization | Best for |
|---|---|---|
| chat | Lightweight, no virtualization by default | Short to medium chat messages |
| docs | Virtualization enabled by default | Long technical documents |
| minimal | Same as chat, neutral name | Non-chat surfaces, lightweight |
For AI chat with potentially long responses, you can combine chat-mode pacing with docs-mode virtualization:
<MarkdownRender
  mode="chat"
  :content="longResponse"
  :final="isDone"
  smooth-streaming="auto"
  :fade="false"
  node-virtual="auto"
  :max-live-nodes="300"
/>

Performance benchmarks
The numbers below are illustrative targets for planning. Run pnpm benchmark:1.0 on your target devices before using performance numbers as release criteria, and use the 1.0 Benchmark Report for reproducible release-gate methodology.
| Content size | Without virtualization | With virtualization |
|---|---|---|
| 10 KB (short chat) | ~200 DOM nodes, smooth | ~200 DOM nodes, smooth |
| 50 KB (long answer) | ~1000 DOM nodes, slight lag | ~300 live nodes, smooth |
| 100 KB (reasoning) | ~2500 DOM nodes, noticeable lag | ~300 live nodes, smooth |
| 1 MB (technical doc) | May freeze browser | ~500 live nodes, scrollable |
When virtualization matters
Enable virtualization when:
- AI responses routinely exceed 20 KB
- Users report laggy scrolling in long conversations
- Responses contain multiple Mermaid diagrams or code blocks
- The app runs on mobile devices with limited memory
Skip virtualization when:
- Responses are always short (< 10 KB)
- You want the simplest possible setup
- The target device is a desktop with abundant memory
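If an app sees both short and long responses, the size guideline above can be turned into a small helper that switches virtualization on only past a threshold. This is a sketch under assumptions: virtualizationPropsFor, utf8ByteLength, and VirtualizationProps are hypothetical names, the 20 KB default mirrors the guideline above, and the 300-node budget is borrowed from the chat example earlier on this page.

```typescript
// Hypothetical helper. The names and defaults are assumptions for illustration.
interface VirtualizationProps {
  nodeVirtual?: "auto";
  maxLiveNodes?: number;
}

// Count UTF-8 bytes without relying on platform APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair encodes one 4-byte code point
        i++;        // skip the low surrogate
      } else {
        bytes += 3; // unpaired surrogate, counted like a replacement character
      }
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

function virtualizationPropsFor(
  content: string,
  thresholdBytes: number = 20 * 1024,
  maxLiveNodes: number = 300
): VirtualizationProps {
  // At or below the threshold, return nothing so the mode defaults apply.
  if (utf8ByteLength(content) <= thresholdBytes) return {};
  return { nodeVirtual: "auto", maxLiveNodes: maxLiveNodes };
}

// Example: a 1 KB reply keeps defaults, a 30 KB reply opts in.
const shortReply = new Array(1024 + 1).join("a");
const longReply = new Array(30 * 1024 + 1).join("a");
console.log(virtualizationPropsFor(shortReply));
console.log(virtualizationPropsFor(longReply));
```

The returned object could then be passed to the renderer, for example through Vue's object form of v-bind. Treat the threshold as a starting point and tune it against measurements on your target devices.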