Parser — API Deep Dive
This page extracts the most important functions, types and helpers from packages/markdown-parser for a focused API reference.
Main functions
getMarkdown(msgId?, options?)
Create a configured markdown-it-ts instance.
Parameters:
msgId(string | optional): identifier for this instanceoptions(GetMarkdownOptions | optional): configuration options
Options include:
plugin: list of Markdown-it pluginsapply: functions to mutate theMarkdownItinstancei18n: translator map or functioncustomHtmlTags: custom HTML-like tags treated as common during streaming mid‑state handling (e.g.['thinking'])
parseMarkdownToStructure(content, md?, options?)
Parse a Markdown string into a streaming-friendly AST used by the renderer.
Parameters:
content(string): markdown to parsemd(MarkdownItCore, optional): a markdown-it-ts instance — created if not providedoptions(ParseOptions, optional): contains transform hooks described below
Tip for custom components: for simple HTML‑like tags such as
<thinking>, prefer the built‑incustomHtmlTags/custom-html-tagsallowlist so the parser emits custom nodes directly. UsepreTransformTokensonly when you need to reshapecontent/attrs. See custom component parsing for details.
Returns: ParsedNode[]
ParseOptions: final (end-of-stream mode)
In streaming contexts (SSE / AI chat), the parser intentionally emits some mid-state (loading) nodes to reduce flicker and keep rendering stable while content is incomplete, for example:
- Unclosed block math (starts with
$$but the closing$$hasn’t arrived yet) - Unclosed code fences (only the opening ``` has arrived)
When you know the message is complete, switch the parser to “final” mode:
const nodes = parseMarkdownToStructure(buffer, md, { final: true })What final: true does:
- Disables mid-state loading behavior for unclosed constructs (so trailing
$$...won’t produce perpetualmath_block/loadingnodes). - Treats EOF as an implicit closing fence so code blocks don’t get stuck in loading.
processTokens(tokens)
Converts raw markdown-it tokens into a processed token list ready for the AST phase.
parseInlineTokens(tokens, md)
Parse inline tokens into inline nodes usable by the renderer.
Configuration & helpers
setDefaultMathOptions(options)
Set global math rendering options for KaTeX. Example:
setDefaultMathOptions({
commands: ['infty', 'perp'],
escapeExclamation: true
})Heuristics and utilities
isMathLike(content)— heuristic to detect whether a string looks like mathfindMatchingClose(src, startIdx, open, close)— find matching delimiterparseFenceToken(token)— parse code fence intoCodeBlockNodenormalizeStandaloneBackslashT(content, options?)— normalize backslash-t sequences in math content
Parse hooks (advanced)
When calling parseMarkdownToStructure you can pass ParseOptions with these hooks:
preTransformTokens?: (tokens: MarkdownToken[]) => MarkdownToken[]— operate immediately after themarkdown-itparser runspostTransformTokens?: (tokens: MarkdownToken[]) => MarkdownToken[]— transform tokens after internal fixespostTransformNodes?: (nodes: ParsedNode[]) => ParsedNode[]— operate on the node tree before rendering
These hooks are also available via parseOptions prop on the MarkdownRender component (applies only when using content instead of nodes).
ParseOptions: requireClosingStrong
requireClosingStrong (boolean | optional) controls how the parser treats unmatched ** strong delimiters inside inline content. Default: true.
- true: The parser requires a matching closing
**to create a strong node. Unclosed**are left as plain text. This is the recommended, strict mode for non-interactive rendering and avoids incorrect strong parsing inside constructs like link text (for example,[**cxx](xxx)). - false: The parser allows mid-state/unfinished
**(useful for editor live-preview scenarios), which can produce a temporary strong node even when the closing**is missing.
Example — strict parsing (default):
import { parseMarkdownToStructure } from 'stream-markdown-parser'
const nodes = parseMarkdownToStructure('[**cxx](xxx)', undefined, { requireClosingStrong: true })
// the text `[**cxx](xxx)` will be preserved without creating a dangling strong nodeExample — editor-friendly parsing:
const nodes = parseMarkdownToStructure('[**cxx](xxx)', undefined, { requireClosingStrong: false })
// allows creating a temporary/"mid-state" strong node for live-edit previewsStray $$ delimiters (empty math)
Sometimes streams contain an accidental $$ sequence, e.g.
adasd $$ dasdsa
dasdsadIf $$ is misinterpreted as a math delimiter with empty content, the renderer can appear “stuck” because there is no math content to render. The parser now treats empty $$ as plain text to avoid perpetual loading spinners.
Streaming inline HTML mid-states (html_inline)
For streaming/chat/editor scenarios, inline HTML often arrives in partial chunks. The parser applies a conservative mid‑state strategy to reduce flicker while still rendering stable HTML as soon as possible.
Behavior:
- Suppress incomplete tags in text tokens until
>arrives. Examples of suppressed fragments:<span class="a", trailing<,</sp, trailing</. These fragments are not emitted as plain text nodes to avoid jitter. - Recognize complete open tags for common HTML tags (conservative allowlist). Once
<tag ...>is complete, it becomes anhtml_inlinenode. - If no explicit closing tag exists yet, the parser:
- Treats all following inline tokens as that tag’s inner content.
- Auto‑appends
</tag>tocontent/rawso the fragment is valid HTML for rendering. - Marks the node as
loading: trueandautoClosed: trueto indicate the source is still incomplete.
- When the real closing tag arrives,
autoCloseddisappears andloadingbecomesfalse.
Extending the allowlist: To apply the same mid‑state suppression for custom tags (for example <thinking>), pass customHtmlTags when creating the markdown instance:
const md = getMarkdown('chat', { customHtmlTags: ['thinking'] })Emitting custom nodes: If you want those tags to become custom node types (so setCustomComponents can map them directly), also pass customHtmlTags in ParseOptions or use the custom-html-tags prop on MarkdownRender (which wires this automatically):
const nodes = parseMarkdownToStructure(source, md, { customHtmlTags: ['thinking'] })Renderer note:
HtmlInlineNoderenders raw text only whenloading === trueandautoClosed !== true. Auto‑closed mid‑states still render HTML (viainnerHTML) but keeploading=truefor UX/state.
Types
A condensed list of exported types to reference in your code:
CodeBlockNode,GetMarkdownOptions,HeadingNode,ListItemNode,ListNode,MathOptions,ParagraphNode,ParsedNode,ParseOptions, etc.
Use import type { ParsedNode, CodeBlockNode } from 'stream-markdown-parser' in your TypeScript code.
Plugins & Defaults
This package includes a set of parsing helpers and convenience plugins for common flows (for example: footnotes, task checkboxes, sub/sup/mark). Note that emoji handling is no longer enabled by default — consumers who want emoji support should register the emoji plugin explicitly.
You can add custom plugins in several ways:
- Pass plugins to
getMarkdownvia thepluginoption. - Use
applyfunctions ingetMarkdownoptions to mutate the returnedMarkdownItinstance. - When using the
MarkdownRendercomponent, use thecustomMarkdownItprop to receive and mutate theMarkdownItinstance used for that renderer.
Example — enable emoji via the component prop:
<script setup lang="ts">
import type { MarkdownIt } from 'markdown-it-ts'
import { full as markdownItEmoji } from 'markdown-it-emoji'
import MarkdownRender from 'markstream-vue'
function enableEmoji(md: MarkdownIt) {
md.use(markdownItEmoji)
return md
}
</script>
<template>
<MarkdownRender :content="source" :custom-markdown-it="enableEmoji" />
</template>Examples
Use the playground to test custom‑tag parsing quickly. For simple tags like <thinking>, pass customHtmlTags (or custom-html-tags on the component) so the parser emits type: 'thinking' nodes automatically, then map them via setCustomComponents.
For full details and more examples, see packages/markdown-parser/README.md in the repository.
Try this — quick test of a parse option:
import { parseMarkdownToStructure } from 'markstream-vue'
const nodes = parseMarkdownToStructure('Hello **world**')
console.log(nodes)
// Render with <MarkdownRender :nodes="nodes" />Custom components and tag-like elements
Tag-like custom elements (for example <MyWidget ...>...</MyWidget>) produce complex html_block/inline token shapes that are often difficult to reconstruct reliably from the parsed AST using simple regex or string splicing. To reduce maintenance cost and avoid brittle post-processing, we recommend extracting those raw component strings before feeding the remaining content to the Markdown parser and rendering the extracted parts separately as Vue components.
Recommended approach:
- Pre-scan the incoming Markdown content and extract custom-component blocks into a small map keyed by an id (or placeholder).
- Replace the original markup in the Markdown string with a stable placeholder token (for example
[[CUSTOM:1]]). - Let the Markdown parser run on the placeholder-bearing content and then render the AST as usual.
- In your rendering layer, render placeholders by looking up the extracted component string and mounting the corresponding Vue component (or compile it via a lightweight renderer).
Benefits:
- Avoids brittle AST post-processing for nested/tag-like HTML.
- Keeps the Markdown parsing focused on Markdown semantics.
- Allows you to control hydration and scope for custom components separately.
Example (simple sketch):
// 1) extract custom tags
const extracted = new Map<string, string>()
let id = 1
const contentWithPlaceholders = source.replace(/<MyWidget[\s\S]*?<\/MyWidget>/g, (m) => {
const key = `[[CUSTOM:${id++}]]`
extracted.set(key, m)
return key
})
// 2) parse the markdown with placeholders
const nodes = parseMarkdownToStructure(contentWithPlaceholders)
// 3) render: when you encounter a placeholder node, mount your extracted component
// Example pseudo-Vue render logic:
// if (node.type === 'text' && extracted.has(node.content)) {
// return h(CustomWrapper, { raw: extracted.get(node.content) })
// }Thinking blocks and small inline-rich fragments -- If you only need a lightweight rendering for short "thinking" fragments (for example AI assistant thinking traces), the library's MarkdownRenderer (used internally by MarkdownRender) supports stream-friendly rendering hooks and is lighter to reuse than reassembling the AST into component trees yourself. Use parseOptions or preTransform hooks to identify thinking regions and render them with the lighter-weight renderer while keeping complex tag-like custom components extracted and rendered separately.
This hybrid approach minimizes fragile string-manipulation of the Markdown AST while giving you full control over custom component hydration and rendering scope.