**Scope:** Add support for embedding video files, audio files, YouTube, and Vimeo in Markdown documents using the standard image syntax ``.
## Background
The knowledge base currently renders Markdown via `League\CommonMark` with `html_input => 'strip'`, which removes raw HTML. This is a deliberate safety choice: the project is published as OSS and may be deployed in environments with multiple authors or untrusted input, so raw HTML passthrough is undesirable.
To migrate fixed pages from a previous WordPress site (which used `<video>` tags and YouTube/Vimeo embeds), Markdown needs a safe mechanism to express media embeds. The chosen approach extends the existing image syntax: when an `` URL points to a media resource, the rendered output becomes `<video>`, `<audio>`, or `<iframe>` instead of `<img>`.
## Goals
- Support embedding local video and audio files via `` syntax
- Support YouTube and Vimeo embeds via the same syntax
- Use privacy-enhanced embed modes (`youtube-nocookie.com`, Vimeo `?dnt=1`)
- Preserve existing image rendering and Wiki link behavior unchanged
- Maintain `html_input => 'strip'` for safety
- Provide unit-test coverage for URL parsing and rendering
## Non-Goals
- Custom attributes (width, autoplay, poster) — sizing handled via CSS only
- Other embed providers (Twitch, SoundCloud, Spotify, etc.)
-`og:video` OGP tags
- VTT subtitles / `<track>` elements
- Download cards for zip/binary files (a separate future task)
- Rerendering existing documents (a separate Artisan command may be added later)
The extension lives entirely in CommonMark's event-based AST modification layer. No changes are required to the existing Wiki link, GFM, or image rendering logic.
### Boundary Summary
- **Input:** Markdown string (unchanged)
- **Output:** HTML string — some `` produce `<video>`, `<audio>`, or `<iframe>` instead of `<img>`
Separating `MediaUrlResolver` from `MediaEmbedListener` isolates "URL parsing / HTML generation" from "AST manipulation." The former is pure and exhaustively testable; the latter is a thin glue layer. This keeps each unit single-purpose and easier to reason about.
**Principle:** Unrecognized URLs are not transformed. Exceptions are not thrown. Default CommonMark rendering handles the fallback.
### XSS hardening
All output URLs are passed through `htmlspecialchars($url, ENT_QUOTES, 'UTF-8')` before being embedded in HTML strings. Attack-vector analysis:
-`)` — does not match a media extension → `null` → CommonMark's `allow_unsafe_links => false` blocks `<img src="javascript:...">`
-`` — strict ID regex `[A-Za-z0-9_-]{11}` cannot extract from a URL containing `"` or `>` → `null` → default rendering, where CommonMark also escapes
-`` — trailing quote breaks extension matching at the path-cleaning step; even if it passed, `htmlspecialchars` would escape the output
The `'strip'` setting is preserved. All `HtmlInline` nodes — whether written by the user in Markdown source or inserted programmatically by an extension — go through `HtmlFilter::filter()`, which strips their content under `'strip'` mode. To emit the embed HTML safely without bypassing this policy, the extension introduces a custom node type:
-`MediaEmbedNode` extends `AbstractStringContainer` and deliberately does NOT implement `RawMarkupContainerInterface`
-`MediaEmbedNodeRenderer` returns the node's literal content directly, without invoking any HTML filter
- User-written `<script>` in Markdown source → produces `HtmlInline` → still stripped
-`<video>` / `<audio>` / `<iframe>` inserted by `MediaEmbedExtension` → produces `MediaEmbedNode` → output as intended
- The security boundary is "only the explicitly trusted node type bypasses filtering," and that node type is reachable only through `MediaEmbedListener` after `MediaUrlResolver` has classified the URL as a known media pattern.
-`<video>` / `<audio>` have no `alt` attribute → ignored
-`title` is preserved on `<video>` / `<audio>` as `title="..."` (optional)
- iframes ignore both (the YouTube/Vimeo player surfaces its own title)
VTT subtitles / `<track>` elements are out of scope.
### Multiple media in one paragraph
```markdown
 and 
```
Two `<video>` elements appear within the same `<p>`. `<video>` is phrasing content per the HTML spec, so this is valid. CSS can apply `display: block` if needed.
### Existing documents
Existing rows in `documents.rendered_html` may be stale after this change. Mitigation is left to the implementation phase — most likely a `docs:rerender` Artisan command (or a one-off `tinker` invocation) that re-saves each `Document` to trigger the existing render hook. This is **not part of the design scope** and should be tracked separately during implementation planning.