knowledge_base/docs/superpowers/specs/2026-05-09-media-embed-design.md
Yutaka Kurosaki def78d4754 Address final review: Vimeo regex boundary + spec accuracy
- Vimeo regex now rejects URLs like vimeo.com/123abc that were
  silently truncated to ID 123 and produced broken iframes. Negative
  lookahead (?![A-Za-z0-9]) ensures the captured digits are not
  followed by alphanumerics. Two false-positive test cases added.
- Spec corrected: HtmlInline nodes ARE filtered regardless of
  insertion path; the implementation uses a dedicated MediaEmbedNode
  + renderer to bypass the filter only for trusted programmatic embeds.
  Components list updated to include the two extra files.
- Plan Task 6 regex updated for consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:18:26 +09:00

Media Embed Design

Date: 2026-05-09
Status: Approved
Scope: Add support for embedding video files, audio files, YouTube, and Vimeo in Markdown documents using the standard image syntax ![](url).

Background

The knowledge base currently renders Markdown via League\CommonMark with html_input => 'strip', which removes raw HTML. This is a deliberate safety choice: the project is published as OSS and may be deployed in environments with multiple authors or untrusted input, so raw HTML passthrough is undesirable.

To migrate fixed pages from a previous WordPress site (which used <video> tags and YouTube/Vimeo embeds), Markdown needs a safe mechanism to express media embeds. The chosen approach extends the existing image syntax: when an ![](url) URL points to a media resource, the rendered output becomes <video>, <audio>, or <iframe> instead of <img>.

Goals

  • Support embedding local video and audio files via ![](url) syntax
  • Support YouTube and Vimeo embeds via the same syntax
  • Use privacy-enhanced embed modes (youtube-nocookie.com, Vimeo ?dnt=1)
  • Preserve existing image rendering and Wiki link behavior unchanged
  • Maintain html_input => 'strip' for safety
  • Provide unit-test coverage for URL parsing and rendering

Non-Goals

  • Custom attributes (width, autoplay, poster) — sizing handled via CSS only
  • Other embed providers (Twitch, SoundCloud, Spotify, etc.)
  • og:video OGP tags
  • VTT subtitles / <track> elements
  • Download cards for zip/binary files (a separate future task)
  • Rerendering existing documents (a separate Artisan command may be added later)

Architecture

Markdown input
   │
   ▼
CommonMarkParser
   │  (after parse)
   ▼
DocumentParsedEvent ───► MediaEmbedExtension listener
                          │
                          │  Walk Image nodes, classify URL:
                          │   ├─ video extension → <video>
                          │   ├─ audio extension → <audio>
                          │   ├─ YouTube URL    → <iframe> (nocookie)
                          │   ├─ Vimeo URL      → <iframe> (dnt)
                          │   └─ other          → leave unchanged (renders as <img>)
                          │
                          │  Replace matching node with MediaEmbedNode
                          ▼
HTML output (existing render flow unchanged)

The extension lives entirely in CommonMark's event-based AST modification layer. No changes are required to the existing Wiki link, GFM, or image rendering logic.

Boundary Summary

  • Input: Markdown string (unchanged)
  • Output: HTML string — some ![](...) produce <video>, <audio>, or <iframe> instead of <img>
  • Untouched: Wiki links, GFM extension, default image rendering, html_input => 'strip' policy

Components

New files

src/app/Markdown/MediaEmbedExtension.php

CommonMark ExtensionInterface implementation. Sole responsibility: register the listener.

  • Public API: register(EnvironmentBuilderInterface $env): void
  • Wires DocumentParsedEvent to MediaEmbedListener::handle

src/app/Markdown/MediaUrlResolver.php

Pure URL classification class with no external dependencies. Highly testable.

  • Public API: resolve(string $url): ?string
    • Returns the replacement HTML string if URL is a recognized media resource
    • Returns null if URL should fall through to default image rendering
  • Internal helpers:
    • detectVideo(string $url): ?string
    • detectAudio(string $url): ?string
    • detectYouTube(string $url): ?string
    • detectVimeo(string $url): ?string
  • Order: video → audio → YouTube → Vimeo → null

src/app/Markdown/MediaEmbedListener.php

Thin glue layer. Receives DocumentParsedEvent, walks the AST, and delegates URL classification to MediaUrlResolver.

  • Public API: handle(DocumentParsedEvent $event): void
  • For each Image node: call resolver; if non-null, replace node with a MediaEmbedNode

src/app/Markdown/MediaEmbedNode.php

Custom AST node that carries the pre-rendered embed HTML string.

  • Extends AbstractStringContainer
  • Does NOT implement RawMarkupContainerInterface — this is intentional so the node is not subject to HtmlFilter
  • Holds its literal content (the HTML string) for direct output by its renderer

src/app/Markdown/MediaEmbedNodeRenderer.php

Dedicated renderer for MediaEmbedNode.

  • Implements NodeRendererInterface
  • Returns the node's literal content directly, without invoking any HTML filter
  • This is the mechanism that allows trusted embed HTML to survive the html_input => 'strip' policy

Modified files

src/app/Models/Document.php: renderMarkdown()

Add a single line:

$converter->getEnvironment()->addExtension(new \App\Markdown\MediaEmbedExtension());

No other changes.

File-split rationale

Separating MediaUrlResolver from MediaEmbedListener isolates "URL parsing / HTML generation" from "AST manipulation." The former is pure and exhaustively testable; the latter is a thin glue layer. This keeps each unit single-purpose and easier to reason about.

Data Flow Specification

Input → Output reference

| Markdown input | Output HTML (key parts) |
| --- | --- |
| ![alt](/foo.png) | <img src="/foo.png" alt="alt"> (default, unchanged) |
| ![](/demo.mp4) | <video src="/demo.mp4" controls class="kb-video"></video> |
| ![](/audio.mp3) | <audio src="/audio.mp3" controls class="kb-audio"></audio> |
| ![](https://youtu.be/abc123XYZ_-) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe> |
| ![](https://www.youtube.com/watch?v=abc123XYZ_-&t=30s) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-?start=30" ...></iframe> |
| ![](https://www.youtube.com/shorts/abc123XYZ_-) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe> |
| ![](https://vimeo.com/123456789) | <iframe src="https://player.vimeo.com/video/123456789?dnt=1" ...></iframe> |
| ![](https://vimeo.com/123456789#t=30s) | <iframe src="https://player.vimeo.com/video/123456789?dnt=1#t=30s" ...></iframe> |

Extension matching (case-insensitive)

  • Video: mp4, webm, ogv, mov, m4v
  • Audio: mp3, wav, ogg, m4a

Matching is performed on the URL path only (after stripping ?query and #fragment) so signed CDN URLs with ?token=... are not misclassified.
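
The path-only matching described above could be sketched as follows. This is a minimal illustration, not the project's implementation: the function name classifyMediaExtension and its 'video' / 'audio' return values are hypothetical, and it relies only on PHP's built-in parse_url and pathinfo.

```php
<?php
declare(strict_types=1);

/**
 * Classify a URL as 'video', 'audio', or null from its path extension.
 * Hypothetical helper: the name and return values are illustrative only.
 */
function classifyMediaExtension(string $url): ?string
{
    // Extract only the path, so ?query and #fragment never take part in matching.
    // parse_url() yields false for seriously malformed URLs and null if there is no path.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null;
    }

    // Lowercase the extension so /demo.MP4 and /demo.mp4 are treated alike.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    if (in_array($ext, ['mp4', 'webm', 'ogv', 'mov', 'm4v'], true)) {
        return 'video';
    }
    if (in_array($ext, ['mp3', 'wav', 'ogg', 'm4a'], true)) {
        return 'audio';
    }

    return null;
}
```

With this sketch, a signed URL such as https://example.com/path/demo.mp4?token=abc is still classified as video, while /photo.jpg and /foo fall through to null.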

YouTube URL recognition

The video ID is the strict pattern [A-Za-z0-9_-]{11}. Recognized URL forms:

| Pattern | Example |
| --- | --- |
| youtu.be/{id} | https://youtu.be/abc123XYZ_- |
| youtube.com/watch?v={id} | https://www.youtube.com/watch?v=abc123XYZ_- |
| youtube.com/shorts/{id} | https://www.youtube.com/shorts/abc123XYZ_- |
| youtube.com/embed/{id} | https://www.youtube.com/embed/abc123XYZ_- |
| m.youtube.com/... | mobile variant of the above |
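
One possible way to extract the ID from the URL forms listed above is sketched here. The function name extractYouTubeId is hypothetical, and the exact host list (including the bare youtube.com host) is an assumption; the spec only fixes the URL shapes and the 11-character ID pattern.

```php
<?php
declare(strict_types=1);

/**
 * Return the 11-character YouTube video ID, or null if the URL is not a
 * recognized YouTube form. Hypothetical helper; the host list is an assumption.
 */
function extractYouTubeId(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';
    $id = null;

    if ($host === 'youtu.be') {
        // youtu.be/{id}
        $id = ltrim($path, '/');
    } elseif (in_array($host, ['youtube.com', 'www.youtube.com', 'm.youtube.com'], true)) {
        if ($path === '/watch') {
            // youtube.com/watch?v={id}
            parse_str($parts['query'] ?? '', $query);
            $id = $query['v'] ?? null;
        } elseif (preg_match('#^/(?:shorts|embed)/([^/]+)\z#', $path, $m) === 1) {
            // youtube.com/shorts/{id} and youtube.com/embed/{id}
            $id = $m[1];
        }
    }

    // Strict check: exactly 11 characters from [A-Za-z0-9_-], nothing else.
    if (!is_string($id) || preg_match('/^[A-Za-z0-9_-]{11}\z/', $id) !== 1) {
        return null;
    }

    return $id;
}
```

Matching on the parsed host rather than on a substring means a URL like https://example.com/youtu.be-fake/abc is rejected before the ID check runs.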

Timestamp normalization (first match wins; t preferred over start):

  • ?t=30s / ?t=30 / &t=1m20s → seconds → ?start=N
  • ?start=N → preserved
  • No timestamp → no ?start parameter
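
The normalization rules above might be implemented along these lines. Both function names are hypothetical, and two details are assumptions rather than spec: only the plain-seconds and NmNs forms shown above are parsed, and an unparseable t falls through to start.

```php
<?php
declare(strict_types=1);

/**
 * Convert the t / start query parameters to whole seconds, or null if absent.
 * Hypothetical helper; "t" is checked before "start".
 */
function youTubeStartSeconds(string $query): ?int
{
    parse_str($query, $params);

    foreach (['t', 'start'] as $key) {
        $value = $params[$key] ?? null;
        if (!is_string($value) || $value === '') {
            continue;
        }
        // Plain seconds: t=90 or start=30
        if (ctype_digit($value)) {
            return (int) $value;
        }
        // Minute/second notation: 30s, 2m, 1m20s
        if (preg_match('/^(?:(\d+)m)?(?:(\d+)s)?\z/', $value, $m) === 1) {
            return (int) ($m[1] ?? 0) * 60 + (int) ($m[2] ?? 0);
        }
        // Assumption: an unparseable value is skipped and the next key is tried.
    }

    return null;
}

/** Build the privacy-enhanced embed URL, adding ?start=N only when a timestamp exists. */
function youTubeEmbedSrc(string $id, ?int $start): string
{
    $src = 'https://www.youtube-nocookie.com/embed/' . $id;

    return $start === null ? $src : $src . '?start=' . $start;
}
```

For example, the query string v=abc123XYZ_-&t=1m20s yields 80 seconds, so the embed URL ends in ?start=80.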

Vimeo URL recognition

| Pattern | Example |
| --- | --- |
| vimeo.com/{id} | https://vimeo.com/123456789 |
| player.vimeo.com/video/{id} | https://player.vimeo.com/video/123456789 |

ID is digits only.

Timestamp:

  • #t=30s → preserved as #t=30s (Vimeo convention)
  • ?t=30s → converted to #t=30s
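
A rough sketch of the Vimeo rules follows. The function name vimeoEmbedSrc, the optional www. prefix, and the restriction of timestamps to the plain-seconds form (30 or 30s) are assumptions for illustration. The negative lookahead (?![A-Za-z0-9]) keeps a URL such as vimeo.com/123abc from being truncated to ID 123.

```php
<?php
declare(strict_types=1);

/**
 * Build the player.vimeo.com embed URL for a Vimeo link, or return null.
 * Hypothetical helper; the accepted hosts and timestamp forms are assumptions.
 */
function vimeoEmbedSrc(string $url): ?string
{
    // Digits only, and the digit run must not be followed by another alphanumeric.
    $pattern = '#^https?://(?:(?:www\.)?vimeo\.com|player\.vimeo\.com/video)/(\d+)(?![A-Za-z0-9])#i';
    if (preg_match($pattern, $url, $m) !== 1) {
        return null;
    }

    $parts = parse_url($url) ?: [];
    $time = null;

    if (preg_match('/^t=(\d+s?)\z/', $parts['fragment'] ?? '', $t) === 1) {
        // #t=30s is already in Vimeo's convention.
        $time = $t[1];
    } else {
        // ?t=30s is moved into the fragment.
        parse_str($parts['query'] ?? '', $query);
        $candidate = $query['t'] ?? null;
        if (is_string($candidate) && preg_match('/^\d+s?\z/', $candidate) === 1) {
            $time = $candidate;
        }
    }

    $src = 'https://player.vimeo.com/video/' . $m[1] . '?dnt=1';

    return $time === null ? $src : $src . '#t=' . $time;
}
```

Because the captured ID is digits only and the timestamp is limited to digits plus an optional s, nothing user-controlled outside those characters reaches the generated URL.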

iframe attribute template

<iframe src="..."
        width="560" height="315"
        loading="lazy"
        referrerpolicy="strict-origin-when-cross-origin"
        allow="autoplay; encrypted-media; picture-in-picture"
        allowfullscreen
        frameborder="0"
        class="kb-embed kb-embed-{provider}">
</iframe>

{provider} is youtube or vimeo. Class hooks let CSS introduce aspect-ratio control later.
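
As an illustration, the template above could be produced by a small helper like the one below. The name renderEmbedIframe is hypothetical, and rejecting unknown providers with an exception is an assumption made here to keep the class attribute safe; it is treated as a programmer error, not as URL-level error handling.

```php
<?php
declare(strict_types=1);

/**
 * Render the iframe markup for a provider embed.
 * Hypothetical helper; attribute values follow the template in this section.
 */
function renderEmbedIframe(string $src, string $provider): string
{
    // Assumption: only the two providers named in this spec are allowed.
    if (!in_array($provider, ['youtube', 'vimeo'], true)) {
        throw new InvalidArgumentException('Unknown embed provider: ' . $provider);
    }

    // Escape the URL before placing it inside a double-quoted attribute.
    $safeSrc = htmlspecialchars($src, ENT_QUOTES, 'UTF-8');

    return '<iframe src="' . $safeSrc . '"'
        . ' width="560" height="315"'
        . ' loading="lazy"'
        . ' referrerpolicy="strict-origin-when-cross-origin"'
        . ' allow="autoplay; encrypted-media; picture-in-picture"'
        . ' allowfullscreen'
        . ' frameborder="0"'
        . ' class="kb-embed kb-embed-' . $provider . '">'
        . '</iframe>';
}
```

This version emits the attributes on a single line; the multi-line layout in the template is for readability only.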

Resolution order

  1. Video extension → emit <video>, return
  2. Audio extension → emit <audio>, return
  3. YouTube → emit <iframe>, return
  4. Vimeo → emit <iframe>, return
  5. None match → return null; node renders as default <img>
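
The first-match-wins order above can be expressed as a simple detector chain. In this sketch, resolveInOrder and the two stand-in detectors are hypothetical; the real detectors would be the video, audio, YouTube, and Vimeo helpers, in that order.

```php
<?php
declare(strict_types=1);

/**
 * Run detectors in order and return the first non-null HTML string.
 *
 * @param list<callable(string): ?string> $detectors
 */
function resolveInOrder(string $url, array $detectors): ?string
{
    foreach ($detectors as $detect) {
        $html = $detect($url);
        if ($html !== null) {
            return $html; // first match wins; later detectors are skipped
        }
    }

    return null; // no match: the Image node is left alone and renders as <img>
}

// Stand-in detectors, keyed only on a single path suffix for brevity.
$video = fn (string $u): ?string => str_ends_with($u, '.mp4')
    ? '<video src="' . htmlspecialchars($u, ENT_QUOTES, 'UTF-8') . '" controls class="kb-video"></video>'
    : null;
$audio = fn (string $u): ?string => str_ends_with($u, '.mp3')
    ? '<audio src="' . htmlspecialchars($u, ENT_QUOTES, 'UTF-8') . '" controls class="kb-audio"></audio>'
    : null;

$detectors = [$video, $audio];
```

Keeping the order in one array makes the precedence explicit and easy to assert in a unit test.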

Error Handling and Edge Cases

| Case | Behavior | Reason |
| --- | --- | --- |
| parse_url failure | return null → default <img> | Fall back to CommonMark default |
| URL with no extension | return null → default <img> | Extension matching is path-suffix based |
| YouTube ID does not match [A-Za-z0-9_-]{11} | return null → default <img> | Strict matching avoids false positives |
| Vimeo ID is not digits | return null → default <img> | Same |
| Empty URL | return null | parse_url returns empty path |

Principle: Unrecognized URLs are not transformed. Exceptions are not thrown. Default CommonMark rendering handles the fallback.

XSS hardening

All output URLs are passed through htmlspecialchars($url, ENT_QUOTES, 'UTF-8') before being embedded in HTML strings. Attack-vector analysis:

  • ![](javascript:alert(1)): does not match a media extension → null → CommonMark's allow_unsafe_links => false blocks <img src="javascript:...">
  • ![](https://youtu.be/"><script>...): the strict ID regex [A-Za-z0-9_-]{11} cannot extract an ID from a URL containing " or > → null → default rendering, where CommonMark also escapes the URL
  • ![](/foo.mp4"): the trailing quote breaks extension matching at the path-cleaning step; even if it passed, htmlspecialchars would escape the output
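
To make the escaping step concrete, here is a minimal sketch of a video tag builder. The name renderVideoTag is hypothetical; the htmlspecialchars call and the output shape follow this spec.

```php
<?php
declare(strict_types=1);

/**
 * Render a <video> tag with the URL escaped for use inside a quoted attribute.
 * Hypothetical helper mirroring the output shape in the Data Flow Specification.
 */
function renderVideoTag(string $url): string
{
    // ENT_QUOTES escapes both " and ', so the value cannot break out of src="...".
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    return '<video src="' . $safe . '" controls class="kb-video"></video>';
}
```

A URL such as /a.mp4?x="><script> passes extension matching (the quote is in the query, not the path), and the escaped result keeps it inert inside the src attribute.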

Relation to html_input => 'strip'

The 'strip' setting is preserved. All HtmlInline nodes, whether written by the user in Markdown source or inserted programmatically by an extension, go through HtmlFilter::filter(), which strips their content under 'strip' mode. To emit the embed HTML without relaxing this policy for any HtmlInline content, the extension introduces a custom node type:

  • MediaEmbedNode extends AbstractStringContainer and deliberately does NOT implement RawMarkupContainerInterface
  • MediaEmbedNodeRenderer returns the node's literal content directly, without invoking any HTML filter

Therefore:

  • User-written <script> in Markdown source → produces HtmlInline → still stripped
  • <video> / <audio> / <iframe> inserted by MediaEmbedExtension → produces MediaEmbedNode → output as intended
  • The security boundary is "only the explicitly trusted node type bypasses filtering," and that node type is reachable only through MediaEmbedListener after MediaUrlResolver has classified the URL as a known media pattern.

alt and title

Markdown image syntax allows ![alt](url "title").

  • <video> / <audio> have no alt attribute → ignored
  • title is preserved on <video> / <audio> as title="..." (optional)
  • iframes ignore both (the YouTube/Vimeo player surfaces its own title)

VTT subtitles / <track> elements are out of scope.

Multiple media in one paragraph

![](/a.mp4) and ![](/b.mp4)

Two <video> elements appear within the same <p>. <video> is phrasing content per the HTML spec, so this is valid. CSS can apply display: block if needed.

Existing documents

Existing rows in documents.rendered_html may be stale after this change. Mitigation is left to the implementation phase — most likely a docs:rerender Artisan command (or a one-off tinker invocation) that re-saves each Document to trigger the existing render hook. This is not part of the design scope and should be tracked separately during implementation planning.

Testing Strategy

tests/Unit/Markdown/MediaUrlResolverTest.php

Pure-unit tests against MediaUrlResolver::resolve.

Video extensions (one case per extension):

  • /demo.mp4, /demo.webm, /demo.ogv, /demo.mov, /demo.m4v → <video> output
  • /demo.MP4 (uppercase) → recognized
  • https://example.com/path/demo.mp4?token=abc → query stripped, recognized

Audio extensions (one case per extension):

  • /clip.mp3, /clip.wav, /clip.ogg, /clip.m4a → <audio> output

YouTube (full URL pattern coverage):

  • https://youtu.be/dQw4w9WgXcQ
  • https://www.youtube.com/watch?v=dQw4w9WgXcQ
  • https://www.youtube.com/shorts/dQw4w9WgXcQ
  • https://www.youtube.com/embed/dQw4w9WgXcQ
  • https://m.youtube.com/watch?v=dQw4w9WgXcQ
  • Timestamps: ?t=30s, ?t=90, ?t=1m20s, ?start=30
  • Output contains youtube-nocookie.com

Vimeo:

  • https://vimeo.com/123456789
  • https://player.vimeo.com/video/123456789
  • Timestamps: #t=30s, ?t=30s
  • Output contains ?dnt=1

Fallback (returns null):

  • Normal images: /photo.jpg, /icon.svg
  • No extension: /foo
  • Invalid URL: empty string, javascript:alert(1), http://
  • Negative-match candidates: https://example.com/youtu.be-fake/abc (host mismatch)
  • Invalid YouTube ID: https://youtu.be/short (fewer than 11 characters), IDs containing special characters

XSS resilience:

  • https://youtu.be/abc"><script> → null (strict ID extraction fails)
  • Video URL containing " produces escaped output

tests/Unit/Markdown/MediaEmbedExtensionTest.php

Integrated unit tests through Document::renderMarkdown():

  • Default image survives unchanged: ![alt](/foo.png) → <img>
  • Video embed succeeds: ![](/foo.mp4) → <video>, no <img>
  • Mixed Markdown: image, video, YouTube, Vimeo coexist correctly
  • Wiki link coexistence: [[other-doc]] is unaffected
  • Multiple media in one paragraph: ![](/a.mp4) ![](/b.mp4) → two <video>
  • List item: - ![](/a.mp4) → <video> inside <li>

Test data convention

No fixture files. Test inputs are inline string literals so they remain greppable.

Running

docker compose exec php php artisan test --filter=MediaUrlResolverTest
docker compose exec php php artisan test --filter=MediaEmbedExtensionTest

composer test (full suite) must remain green.

Coverage target

No formal coverage measurement. The bar is: every URL pattern listed in the Data Flow Specification has at least one corresponding test case.

Open Items for Implementation Phase

These are deliberately deferred to the planning phase, not the design:

  • Whether to add a docs:rerender Artisan command for existing rows
  • CSS additions for .kb-video, .kb-audio, .kb-embed-* (likely a future task)
  • Updating CLAUDE.md to document the new media-embed convention