# KaTeX Worker Performance Playbook
Looking for the Chinese version? See KaTeX Worker Performance Guide (ZH).
Question: is a Worker actually faster than rendering on the main thread?
Use this guide to decide when the Worker + cache pipeline is worth enabling.
## Short answer
Yes. A Worker backed by a cache easily wins in most real workloads.
Why:
- A cache hit skips the render almost entirely (~99% of the per-formula cost), and hit rates above 70% are common in real documents.
- The Worker keeps the main thread responsive, so scrolling/typing stays smooth even when formulas are heavy.
- Memory overhead is tiny -- roughly 10-50 KB for ~200 cached formulas.
## Quick performance comparisons
### Scenario 1 - single lightweight formula

- Direct render: ~2-5 ms
- Worker: ~3-7 ms (includes postMessage overhead)

Takeaway: the Worker is slightly slower, but the difference is negligible.

### Scenario 2 - single complex formula

- Direct render: ~20-50 ms (blocks the main thread)
- Worker: ~22-52 ms (main thread stays free)

Takeaway: UX improves because the page never freezes.

### Scenario 3 - repeated formula with cache

- Direct render: 5 ms x 10 renders = 50 ms
- Worker + cache: 5 ms + 0.01 ms x 9 hits = 5.09 ms

Takeaway: ~10x faster once cached.

### Scenario 4 - mixed real document

50 formulas with 35 duplicates:

- No cache: 250 ms (every formula rerenders)
- With cache: 75 ms (only 15 unique renders)
- Cache hit rate: 70%
- Speedup: ~3.3x (worked through in the sketch below)
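The arithmetic behind Scenario 4 can be packaged as a quick back-of-the-envelope helper. This is a minimal sketch (plain arithmetic, not a markstream-vue API); the 0.01 ms cache-hit cost is the figure used in Scenario 3:

```ts
// Estimate total render cost for a document: only unique formulas pay the
// full render price, duplicates are near-free cache hits.
function estimateDocumentCost(
  totalFormulas: number,
  uniqueFormulas: number,
  avgRenderMs: number,
  cacheHitMs = 0.01,
): number {
  const cacheHits = totalFormulas - uniqueFormulas
  return uniqueFormulas * avgRenderMs + cacheHits * cacheHitMs
}

estimateDocumentCost(50, 50, 5) // no cache: every formula renders -> 250 ms
estimateDocumentCost(50, 15, 5) // with cache: 15 unique renders   -> ~75 ms
```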
## How to benchmark

### 1. Use the built-in Vitest benchmark
```bash
pnpm install
pnpm test test/benchmark/katex-worker-vs-direct.test.ts
pnpm test test/benchmark/katex-worker-vs-direct.test.ts -- --reporter=verbose
```

### 2. Estimate the "switch to Worker" threshold
Compute how many unique formulas (`N`) you can render on the main thread before risking noticeable jank:
- Formula (sketched just below): `N ≈ floor(B / (R × (1 - H)))`
  - `B`: main-thread budget in ms (use 50 ms for "user sees a hitch" or 16.7 ms for one frame).
  - `R`: average time to render one unique formula, in ms.
  - `H`: cache hit rate (0-1). When first rendering a page, assume `H = 0`.
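A standalone sketch of that formula (assuming nothing beyond the definitions above; it is not the markstream-vue helper shown next):

```ts
// How many unique formulas fit in the main-thread budget?
// B: budget in ms, R: avg render time per unique formula in ms, H: hit rate.
function estimateMainThreadLimit(B: number, R: number, H: number): number {
  const missRate = Math.max(1 - H, 0.01) // guard against H = 1
  return Math.floor(B / (R * missRate))
}

estimateMainThreadLimit(50, 10, 0) // first paint, R = 10 ms -> 5 formulas
```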
Fast helpers:
```bash
node scripts/measure-katex-threshold.mjs
```

```ts
import { recommendNForSamples, recommendWorkerThreshold } from 'markstream-vue/utils/katex-threshold'
const exactN = recommendWorkerThreshold({ R: 10, H: 0, B: 50 })
const sampleBased = recommendNForSamples(['x', '\\sum_{i=1}^{n}', '\\int f(x) dx'], { H: 0, B: 50 })
```

Practical tips:
- Default to the "medium complexity" threshold.
- First paint: assume `B = 50` and `H = 0`. During scrolling or repeat renders, increase `N` because the cache hit rate climbs quickly (illustrated just below this list).
- If you detect lots of integrals/matrices, pick the conservative threshold (smaller `N`).
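To see how much the threshold relaxes as the cache warms up, here is the same formula evaluated at two hit rates (illustrative numbers: R = 10 ms per unique formula, B = 50 ms budget):

```ts
// Same formula as above, applied at two points in a session.
const n = (B: number, R: number, H: number) => Math.floor(B / (R * (1 - H)))

n(50, 10, 0)   // first paint, empty cache -> 5 unique formulas
n(50, 10, 0.7) // after scrolling, H = 0.7 -> 16 unique formulas
```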
### 3. Monitor live traffic
```ts
import { enablePerfMonitoring, getPerfReport } from 'markstream-vue/utils/performance-monitor'
enablePerfMonitoring()
setTimeout(() => {
  console.log(getPerfReport()) // print the collected metrics after 30 s
}, 30_000)
```

Browser console helpers:
```js
window.__katexPerfReport()
window.__katexPerfMonitor.exportMetrics()
```

### 4. Inspect Chrome DevTools
#### A. Performance panel
- Open DevTools -> Performance.
- Record while rendering formulas.
- Inspect:
  - Main lane -> watch `katex.renderToString`.
  - Worker lane -> ensure work moved off the main thread.
  - Long tasks (>50 ms) -> any red markers mean the main thread was blocked (see the sketch after this list).
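Long tasks can also be caught programmatically with the standard `PerformanceObserver` browser API (not part of markstream-vue), which reports the same >50 ms tasks DevTools flags in red:

```ts
// Log any task that blocks the main thread for more than 50 ms.
const longTaskObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries())
    console.warn(`Long task: ${entry.duration.toFixed(1)} ms`, entry)
})
longTaskObserver.observe({ entryTypes: ['longtask'] })
```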
#### B. Memory panel
- Take a Heap snapshot after rendering.
- Search for the cache `Map`.
- Check size (a quick estimation helper follows this list):
  - <1 MB -> no worries.
  - >5 MB -> lower `CACHE_MAX`.
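If you prefer a quick number over a full heap snapshot, a rough estimate works for any string-to-string cache `Map` you can reach from the console (markstream-vue does not expose the cache globally; this assumes you expose a debug handle yourself):

```ts
// Approximate footprint of a string -> string cache Map.
// Counts one byte per character, matching the rough estimates in this guide.
function estimateCacheBytes(cache: Map<string, string>): number {
  let bytes = 0
  for (const [key, html] of cache)
    bytes += key.length + html.length
  return bytes
}
```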
#### C. Performance Monitor
- Cmd/Ctrl + Shift + P -> "Show Performance Monitor".
- Watch CPU usage, JS heap, Frames while rendering.
## Decision matrix
### ✅ When to prefer Worker + cache
| Scenario | Rationale |
|---|---|
| Complex math (>10 ms) | Keeps UI responsive. |
| >5 formulas per page | Cache savings stack up. |
| Lots of repetitions | Cache hit rate skyrockets. |
| Smooth scrolling/typing required | No main-thread stalls. |
| Mobile devices | CPUs are weaker, so avoid blocking. |
### ⚠️ When direct render is fine
| Scenario | Rationale |
|---|---|
| Only trivial formulas | <5 ms each, Worker overhead similar. |
| SSR / Node.js | Worker API unavailable. |
| Single formula | Cache never pays off. |
| Extreme bundle constraints | Worker adds a small chunk. |
### 🎯 Recommended pipeline (already implemented)
```text
try Worker + cache
  -> on error / timeout: fall back to direct render
  -> on success: store the result back in the cache
```

Benefits (a condensed code sketch follows this list):
- ✅ Production-safe (there is always a fallback).
- ✅ Fast path takes advantage of caching.
- ✅ Progressive enhancement (Worker is optional).
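A condensed sketch of that pipeline, for orientation only: it assumes a worker-side render function named `renderKaTeXInWorker` (referenced later in this guide) that rejects on error or timeout, and uses a plain `Map` where the real client keeps its own cache.

```ts
import katex from 'katex'

// Assumed to reject on Worker error or timeout; the real client lives in
// katexWorkerClient.ts.
declare function renderKaTeXInWorker(formula: string): Promise<string>

const cache = new Map<string, string>() // stand-in for the client's cache

async function renderFormula(formula: string): Promise<string> {
  const hit = cache.get(formula)
  if (hit) return hit // fast path: a cache hit costs ~0.01 ms

  let html: string
  try {
    html = await renderKaTeXInWorker(formula) // off the main thread
  } catch {
    html = katex.renderToString(formula, { throwOnError: false }) // fallback
  }
  cache.set(formula, html) // store the result for future hits
  return html
}
```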
## Real-world measurements
### Render time by formula type
| Type | Example | Avg time | Worker benefit |
|---|---|---|---|
| Simple | `x = y` | 2-3 ms | Low (~1 ms overhead). |
| Medium | `\sum_{i=1}^{n}` | 5-10 ms | Medium (prevents frame drops). |
| Complex | `\int_{-\infty}^{\infty}` | 15-30 ms | High (avoids jank). |
| Matrix | `\begin{pmatrix}...` | 30-80 ms | Huge (main thread unusable otherwise). |
### Cache effectiveness
| Case | First render | Cache hit | Speedup |
|---|---|---|---|
| Variable `x` | 2 ms | 0.005 ms | 400x |
| Summation | 10 ms | 0.008 ms | 1250x |
| Complex integral | 30 ms | 0.01 ms | 3000x |
### Sample document (50 formulas, 15 unique)
| Strategy | Total time | Main-thread block | UX |
|---|---|---|---|
| No optimization | 250 ms | 250 ms | ⚠️ Noticeable hitching. |
| Worker only | 265 ms | 0 ms | ✅ Smooth but slower. |
| Worker + cache | 78 ms | 0 ms | ✅✅ Fast and smooth. |
### Memory footprint
- Input formula: ~30 bytes
- HTML output: ~150 bytes
- Expansion ratio: ~5x
- One cache entry: ~180 bytes (with key)
- 200 entries: ~36 KB

Conclusion: the memory cost is negligible.
## Optimization recipes
### 1. Tune cache size
```ts
// inside katexWorkerClient.ts
const CACHE_MAX = 500 // e.g. bump from 200 to 500 for more unique formulas
```
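For reference, a bounded `Map` cache can evict in insertion order once it crosses the limit. This is an illustrative sketch, not the exact eviction policy implemented in `katexWorkerClient.ts`:

```ts
const CACHE_MAX = 500
const cache = new Map<string, string>()

function cacheSet(key: string, html: string): void {
  // A Map iterates in insertion order, so the first key is the oldest entry.
  if (cache.size >= CACHE_MAX) {
    const oldest = cache.keys().next().value
    if (oldest !== undefined)
      cache.delete(oldest)
  }
  cache.set(key, html)
}
```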
### 2. Pre-render frequent formulas

```ts
import { setKaTeXCache } from 'markstream-vue/workers/katexWorkerClient'
const commonFormulas = ['x', 'y', 'E=mc^2', '\\sum_{i=1}^{n}']
for (const formula of commonFormulas) {
  requestIdleCallback(() => {
    // renderAndCache is assumed to be a small local helper that renders the
    // formula and stores the resulting HTML via setKaTeXCache.
    renderAndCache(formula)
  })
}
```

### 3. Use requestIdleCallback
```ts
if ('requestIdleCallback' in window) {
  requestIdleCallback(() => {
    renderKaTeXInWorker(formula)
  })
}
```
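Where `requestIdleCallback` is unavailable (notably older Safari), a small `setTimeout`-based stand-in keeps the same "defer until the browser is idle" intent; this is a sketch, not a markstream-vue utility:

```ts
// Prefer requestIdleCallback, fall back to a short setTimeout delay.
const scheduleIdle: (cb: () => void) => void =
  typeof requestIdleCallback === 'function'
    ? cb => requestIdleCallback(cb)
    : cb => { setTimeout(cb, 200) }

scheduleIdle(() => {
  // e.g. renderKaTeXInWorker(formula), as in the snippet above
})
```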
## Key takeaways

- Worker overhead is tiny (~1-2 ms).
- Cache hit rates >70 % are normal, so caching is the real win.
- Worker + cache + fallback is the optimal combo.
- Memory costs stay under ~100 KB even with aggressive caching.
- Users notice the smoother scrolling much more than the extra kilobytes.
## Final recommendation
Keep the existing Worker + cache + fallback architecture.
- ✅ Great performance (cache removes most work).
- ✅ Smooth UX (Worker isolates blocking work).
- ✅ Stable (fallback guarantees output).
- ✅ Memory friendly.
- ✅ Progressive enhancement friendly.
Nothing else needs changing -- the current design is already the sweet spot. 🎉