knowledge_base/docs/superpowers/specs/2026-05-09-media-embed-design.md
Yutaka Kurosaki 01a11328ec Add design spec for Markdown media embed extension
Approved design for extending image syntax `![](url)` to render videos,
audio, YouTube, and Vimeo embeds. Preserves html_input=>strip safety and
existing image/Wiki-link behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 10:21:10 +09:00

Media Embed Design

Date: 2026-05-09
Status: Approved
Scope: Add support for embedding video files, audio files, YouTube, and Vimeo in Markdown documents using the standard image syntax ![](url).

Background

The knowledge base currently renders Markdown via League\CommonMark with html_input => 'strip', which removes raw HTML. This is a deliberate safety choice: the project is published as OSS and may be deployed in environments with multiple authors or untrusted input, so raw HTML passthrough is undesirable.

To migrate fixed pages from a previous WordPress site (which used <video> tags and YouTube/Vimeo embeds), Markdown needs a safe mechanism to express media embeds. The chosen approach extends the existing image syntax: when an ![](url) URL points to a media resource, the rendered output becomes <video>, <audio>, or <iframe> instead of <img>.

Goals

  • Support embedding local video and audio files via ![](url) syntax
  • Support YouTube and Vimeo embeds via the same syntax
  • Use privacy-enhanced embed modes (youtube-nocookie.com, Vimeo ?dnt=1)
  • Preserve existing image rendering and Wiki link behavior unchanged
  • Maintain html_input => 'strip' for safety
  • Provide unit-test coverage for URL parsing and rendering

Non-Goals

  • Custom attributes (width, autoplay, poster) — sizing handled via CSS only
  • Other embed providers (Twitch, SoundCloud, Spotify, etc.)
  • og:video OGP tags
  • VTT subtitles / <track> elements
  • Download cards for zip/binary files (a separate future task)
  • Rerendering existing documents (a separate Artisan command may be added later)

Architecture

Markdown input
   │
   ▼
CommonMarkParser
   │  (after parse)
   ▼
DocumentParsedEvent ───► MediaEmbedExtension listener
                          │
                          │  Walk Image nodes, classify URL:
                          │   ├─ video extension → <video>
                          │   ├─ audio extension → <audio>
                          │   ├─ YouTube URL    → <iframe> (nocookie)
                          │   ├─ Vimeo URL      → <iframe> (dnt)
                          │   └─ other          → leave unchanged (renders as <img>)
                          │
                          │  Replace matching node with HtmlInline
                          ▼
HTML output (existing render flow unchanged)

The extension lives entirely in CommonMark's event-based AST modification layer. No changes are required to the existing Wiki link, GFM, or image rendering logic.

Boundary Summary

  • Input: Markdown string (unchanged)
  • Output: HTML string — some ![](...) produce <video>, <audio>, or <iframe> instead of <img>
  • Untouched: Wiki links, GFM extension, default image rendering, html_input => 'strip' policy

Components

New files

src/app/Markdown/MediaEmbedExtension.php

CommonMark ExtensionInterface implementation. Sole responsibility: register the listener.

  • Public API: register(EnvironmentBuilderInterface $env): void
  • Wires DocumentParsedEvent to MediaEmbedListener::handle

src/app/Markdown/MediaUrlResolver.php

Pure URL classification class with no external dependencies. Highly testable.

  • Public API: resolve(string $url): ?string
    • Returns the replacement HTML string if URL is a recognized media resource
    • Returns null if URL should fall through to default image rendering
  • Internal helpers:
    • detectVideo(string $url): ?string
    • detectAudio(string $url): ?string
    • detectYouTube(string $url): ?string
    • detectVimeo(string $url): ?string
  • Order: video → audio → YouTube → Vimeo → null

src/app/Markdown/MediaEmbedListener.php

Thin glue layer. Receives DocumentParsedEvent, walks the AST, and delegates URL classification to MediaUrlResolver.

  • Public API: handle(DocumentParsedEvent $event): void
  • For each Image node: call resolver; if non-null, replace node with HtmlInline

Modified files

src/app/Models/Document.php (renderMarkdown())

Add a single line:

$converter->getEnvironment()->addExtension(new \App\Markdown\MediaEmbedExtension());

No other changes.

File-split rationale

Separating MediaUrlResolver from MediaEmbedListener isolates "URL parsing / HTML generation" from "AST manipulation." The former is pure and exhaustively testable; the latter is a thin glue layer. This keeps each unit single-purpose and easier to reason about.

Data Flow Specification

Input → Output reference

| Markdown input | Output HTML (key parts) |
| --- | --- |
| ![alt](/foo.png) | <img src="/foo.png" alt="alt"> (default, unchanged) |
| ![](/demo.mp4) | <video src="/demo.mp4" controls class="kb-video"></video> |
| ![](/audio.mp3) | <audio src="/audio.mp3" controls class="kb-audio"></audio> |
| ![](https://youtu.be/abc123XYZ_-) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe> |
| ![](https://www.youtube.com/watch?v=abc123XYZ_-&t=30s) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-?start=30" ...></iframe> |
| ![](https://www.youtube.com/shorts/abc123XYZ_-) | <iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe> |
| ![](https://vimeo.com/123456789) | <iframe src="https://player.vimeo.com/video/123456789?dnt=1" ...></iframe> |
| ![](https://vimeo.com/123456789#t=30s) | <iframe src="https://player.vimeo.com/video/123456789?dnt=1#t=30s" ...></iframe> |

Extension matching (case-insensitive)

  • Video: mp4, webm, ogv, mov, m4v
  • Audio: mp3, wav, ogg, m4a

Matching is performed on the URL path only (after stripping ?query and #fragment) so signed CDN URLs with ?token=... are not misclassified.
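
The path-only matching described above could be sketched as follows. This is a minimal illustration, not the spec's implementation: the function name classifyByExtension and the two constants are hypothetical, and it returns a plain label rather than the final HTML.

```php
<?php
// Hypothetical helper (not named in the spec): classify a URL by the
// extension of its path component only.
const VIDEO_EXTENSIONS = ['mp4', 'webm', 'ogv', 'mov', 'm4v'];
const AUDIO_EXTENSIONS = ['mp3', 'wav', 'ogg', 'm4a'];

function classifyByExtension(string $url): ?string
{
    // PHP_URL_PATH yields the path without ?query or #fragment.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // malformed or empty URL: fall through to <img>
    }

    // Lowercase the extension so /demo.MP4 matches as well.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    if (in_array($ext, VIDEO_EXTENSIONS, true)) {
        return 'video';
    }
    if (in_array($ext, AUDIO_EXTENSIONS, true)) {
        return 'audio';
    }
    return null;
}
```

Because only the path is inspected, a URL such as https://example.com/path/demo.mp4?token=abc is still classified as video, while https://example.com/img.png?file=a.mp4 is not.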

YouTube URL recognition

The video ID is the strict pattern [A-Za-z0-9_-]{11}. Recognized URL forms:

| Pattern | Example |
| --- | --- |
| youtu.be/{id} | https://youtu.be/abc123XYZ_- |
| youtube.com/watch?v={id} | https://www.youtube.com/watch?v=abc123XYZ_- |
| youtube.com/shorts/{id} | https://www.youtube.com/shorts/abc123XYZ_- |
| youtube.com/embed/{id} | https://www.youtube.com/embed/abc123XYZ_- |
| m.youtube.com/... | mobile variant of the above |
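
One way to extract the ID from the URL forms in the table is sketched below. The function name extractYouTubeId and the exact host allow-list are assumptions made for illustration; the spec only fixes the ID pattern and the URL shapes.

```php
<?php
// Hypothetical helper: return the 11-character video ID, or null when the
// URL is not one of the recognized YouTube forms.
function extractYouTubeId(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';
    $candidate = null;

    if ($host === 'youtu.be') {
        $candidate = ltrim($path, '/');
    } elseif (in_array($host, ['youtube.com', 'www.youtube.com', 'm.youtube.com'], true)) {
        if ($path === '/watch') {
            parse_str($parts['query'] ?? '', $query);
            $candidate = $query['v'] ?? null;
        } elseif (preg_match('#\A/(?:shorts|embed)/([^/]+)\z#', $path, $m) === 1) {
            $candidate = $m[1];
        }
    }

    // Anchored match: the whole candidate must be exactly 11 allowed characters.
    if (is_string($candidate) && preg_match('/\A[A-Za-z0-9_-]{11}\z/', $candidate) === 1) {
        return $candidate;
    }
    return null;
}
```

Matching on the parsed host, rather than searching the whole URL for "youtu.be", is what rejects look-alikes such as https://example.com/youtu.be-fake/abc.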

Timestamp normalization (first match wins; t preferred over start):

  • ?t=30s / ?t=30 / &t=1m20s → seconds → ?start=N
  • ?start=N → preserved
  • No timestamp → no ?start parameter
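
The normalization rules above might look like this in code. Both function names are hypothetical, and falling back to start when t is present but unparseable is an assumption the spec does not settle.

```php
<?php
// Hypothetical helper: convert "30", "30s", or "1m20s" into whole seconds.
function parseYouTubeOffset(string $value): ?int
{
    if (preg_match('/\A\d+\z/', $value) === 1) {
        return (int) $value; // bare number of seconds
    }
    // Optional minutes part followed by an optional seconds part.
    if ($value !== '' && preg_match('/\A(?:(\d+)m)?(?:(\d+)s)?\z/', $value, $m) === 1) {
        return ((int) ($m[1] ?? 0)) * 60 + (int) ($m[2] ?? 0);
    }
    return null;
}

// Hypothetical helper: pick the start offset from a raw query string.
function youTubeStartSeconds(string $queryString): ?int
{
    parse_str($queryString, $query);
    foreach (['t', 'start'] as $key) { // t is checked before start
        if (isset($query[$key]) && is_string($query[$key])) {
            $seconds = parseYouTubeOffset($query[$key]);
            if ($seconds !== null) {
                return $seconds;
            }
        }
    }
    return null; // no timestamp: the embed URL gets no ?start parameter
}
```

A caller would append ?start=N to the embed URL only when the result is not null.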

Vimeo URL recognition

| Pattern | Example |
| --- | --- |
| vimeo.com/{id} | https://vimeo.com/123456789 |
| player.vimeo.com/video/{id} | https://player.vimeo.com/video/123456789 |

ID is digits only.

Timestamp:

  • #t=30s → preserved as #t=30s (Vimeo convention)
  • ?t=30s → converted to #t=30s
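
A sketch of the Vimeo rules, under stated assumptions: the function name vimeoEmbedUrl is hypothetical, www.vimeo.com is accepted alongside vimeo.com, and the timestamp value is restricted to digits and lowercase letters.

```php
<?php
// Hypothetical helper: build the privacy-enhanced player URL, or return null
// when the input is not a recognized Vimeo URL.
function vimeoEmbedUrl(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';

    if (in_array($host, ['vimeo.com', 'www.vimeo.com'], true)) {
        $pattern = '#\A/(\d+)/?\z#';        // vimeo.com/{id}
    } elseif ($host === 'player.vimeo.com') {
        $pattern = '#\A/video/(\d+)/?\z#';  // player.vimeo.com/video/{id}
    } else {
        return null;
    }

    if (preg_match($pattern, $path, $m) !== 1) {
        return null; // ID missing or not digits only
    }

    // Timestamp: prefer the #t=... fragment, otherwise accept ?t=...
    $offset = null;
    if (preg_match('/\At=([0-9a-z]+)\z/', $parts['fragment'] ?? '', $f) === 1) {
        $offset = $f[1];
    } else {
        parse_str($parts['query'] ?? '', $query);
        if (isset($query['t']) && is_string($query['t'])
            && preg_match('/\A[0-9a-z]+\z/', $query['t']) === 1) {
            $offset = $query['t'];
        }
    }

    $embed = 'https://player.vimeo.com/video/' . $m[1] . '?dnt=1';
    return $offset === null ? $embed : $embed . '#t=' . $offset;
}
```

Either timestamp form ends up as a #t=... fragment after ?dnt=1, matching the reference table.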

iframe attribute template

<iframe src="..."
        width="560" height="315"
        loading="lazy"
        referrerpolicy="strict-origin-when-cross-origin"
        allow="autoplay; encrypted-media; picture-in-picture"
        allowfullscreen
        frameborder="0"
        class="kb-embed kb-embed-{provider}">
</iframe>

{provider} is youtube or vimeo. Class hooks let CSS introduce aspect-ratio control later.
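
The template could be filled in by a small helper like the one below. The name renderEmbedIframe is hypothetical, and rejecting unknown providers with an exception is an illustrative choice for this internal helper rather than something the spec requires.

```php
<?php
// Hypothetical helper: render the iframe template for one provider.
function renderEmbedIframe(string $src, string $provider): string
{
    // Only the two providers from the spec are allowed into the class name.
    if (!in_array($provider, ['youtube', 'vimeo'], true)) {
        throw new InvalidArgumentException('Unknown embed provider: ' . $provider);
    }

    // The URL is the only dynamic value, so it is the only one escaped.
    $safeSrc = htmlspecialchars($src, ENT_QUOTES, 'UTF-8');

    return sprintf(
        '<iframe src="%s" width="560" height="315" loading="lazy"'
        . ' referrerpolicy="strict-origin-when-cross-origin"'
        . ' allow="autoplay; encrypted-media; picture-in-picture"'
        . ' allowfullscreen frameborder="0"'
        . ' class="kb-embed kb-embed-%s"></iframe>',
        $safeSrc,
        $provider
    );
}
```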

Resolution order

  1. Video extension → emit <video>, return
  2. Audio extension → emit <audio>, return
  3. YouTube → emit <iframe>, return
  4. Vimeo → emit <iframe>, return
  5. None match → return null; node renders as default <img>
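
The ordering can be expressed as a loop over detectors in which the first non-null result wins. In this sketch the function name resolveInOrder and the stand-in detectors are hypothetical; real detectors would emit the full markup described earlier.

```php
<?php
// Hypothetical sketch: run detectors in a fixed order and return the first
// non-null result.
function resolveInOrder(string $url, array $detectors): ?string
{
    foreach ($detectors as $detect) {
        $html = $detect($url);
        if ($html !== null) {
            return $html;
        }
    }
    return null; // nothing matched: the Image node is left to render as <img>
}

// Stand-in detectors for illustration only, listed in resolution order.
$detectors = [
    'video'   => fn (string $u): ?string => preg_match('/\.mp4\z/i', $u) === 1 ? '<video>' : null,
    'audio'   => fn (string $u): ?string => preg_match('/\.mp3\z/i', $u) === 1 ? '<audio>' : null,
    'youtube' => fn (string $u): ?string => strpos($u, 'https://youtu.be/') === 0 ? '<iframe>' : null,
];
```

Putting the extension checks first means a direct link to a media file is never sent through the provider detectors.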

Error Handling and Edge Cases

| Case | Behavior | Reason |
| --- | --- | --- |
| parse_url failure | return null → default <img> | Fall back to CommonMark default |
| URL with no extension | return null → default <img> | Extension matching is path-suffix based |
| YouTube ID does not match [A-Za-z0-9_-]{11} | return null → default <img> | Strict matching avoids false positives |
| Vimeo ID is not digits | return null → default <img> | Same |
| Empty URL | return null | parse_url returns empty path |

Principle: Unrecognized URLs are not transformed. Exceptions are not thrown. Default CommonMark rendering handles the fallback.

XSS hardening

All output URLs are passed through htmlspecialchars($url, ENT_QUOTES, 'UTF-8') before being embedded in HTML strings. Attack-vector analysis:

  • ![](javascript:alert(1)): does not match a media extension → null → CommonMark's allow_unsafe_links => false blocks <img src="javascript:...">
  • ![](https://youtu.be/"><script>...): the strict ID regex [A-Za-z0-9_-]{11} cannot match a candidate containing " or > → null → default rendering, where CommonMark also escapes
  • ![](/foo.mp4"): the trailing quote becomes part of the path, so the extension no longer matches; even if it did, htmlspecialchars would escape the output
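
As a small illustration of the escaping step, the sketch below renders a <video> tag for a URL that contains quotes. The function name renderVideoTag is hypothetical; the attribute layout follows the reference table.

```php
<?php
// Hypothetical helper: emit the <video> markup with the URL escaped for use
// inside a double-quoted HTML attribute.
function renderVideoTag(string $url): string
{
    $safeUrl = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<video src="' . $safeUrl . '" controls class="kb-video"></video>';
}

// A quote inside the path cannot close the src attribute early:
echo renderVideoTag('/a"onerror="x.mp4'), "\n";
// prints: <video src="/a&quot;onerror=&quot;x.mp4" controls class="kb-video"></video>
```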

Relation to html_input => 'strip'

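The 'strip' setting is preserved. CommonMark strips raw HTML written by users in source Markdown. The design assumes that HtmlInline nodes inserted programmatically by a registered extension are not stripped. This assumption should be verified early against the installed league/commonmark version: in 2.x the stock HtmlInlineRenderer applies the html_input filter to every HtmlInline node at render time, and if that holds here, the extension would need a dedicated node type with its own renderer instead. Assuming the inserted markup does reach the output: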

  • User-written <script> in Markdown source → still stripped
  • <video> / <iframe> inserted by MediaEmbedExtension → output as intended
  • The security boundary becomes "the extension is responsible for its own escaping," which is enforced by passing all dynamic URL fragments through htmlspecialchars.

alt and title

Markdown image syntax allows ![alt](url "title").

  • <video> / <audio> have no alt attribute → ignored
  • title is preserved on <video> / <audio> as title="..." (optional)
  • iframes ignore both (the YouTube/Vimeo player surfaces its own title)

VTT subtitles / <track> elements are out of scope.

Multiple media in one paragraph

![](/a.mp4) and ![](/b.mp4)

Two <video> elements appear within the same <p>. <video> is phrasing content per the HTML spec, so this is valid. CSS can apply display: block if needed.

Existing documents

Existing rows in documents.rendered_html may be stale after this change. Mitigation is left to the implementation phase — most likely a docs:rerender Artisan command (or a one-off tinker invocation) that re-saves each Document to trigger the existing render hook. This is not part of the design scope and should be tracked separately during implementation planning.

Testing Strategy

tests/Unit/Markdown/MediaUrlResolverTest.php

Pure-unit tests against MediaUrlResolver::resolve.

Video extensions (one case per extension):

  • /demo.mp4, /demo.webm, /demo.ogv, /demo.mov, /demo.m4v → <video> output
  • /demo.MP4 (uppercase) → recognized
  • https://example.com/path/demo.mp4?token=abc → query stripped, recognized

Audio extensions (one case per extension):

  • /clip.mp3, /clip.wav, /clip.ogg, /clip.m4a → <audio> output

YouTube (full URL pattern coverage):

  • https://youtu.be/dQw4w9WgXcQ
  • https://www.youtube.com/watch?v=dQw4w9WgXcQ
  • https://www.youtube.com/shorts/dQw4w9WgXcQ
  • https://www.youtube.com/embed/dQw4w9WgXcQ
  • https://m.youtube.com/watch?v=dQw4w9WgXcQ
  • Timestamps: ?t=30s, ?t=90, ?t=1m20s, ?start=30
  • Output contains youtube-nocookie.com

Vimeo:

  • https://vimeo.com/123456789
  • https://player.vimeo.com/video/123456789
  • Timestamps: #t=30s, ?t=30s
  • Output contains ?dnt=1

Fallback (returns null):

  • Normal images: /photo.jpg, /icon.svg
  • No extension: /foo
  • Invalid URL: empty string, javascript:alert(1), http://
  • Negative-match candidates: https://example.com/youtu.be-fake/abc (host mismatch)
  • Invalid YouTube ID: https://youtu.be/short (fewer than 11 characters), IDs containing special characters

XSS resilience:

  • https://youtu.be/abc"><script> → null (strict ID extraction fails)
  • Video URL containing " produces escaped output

tests/Unit/Markdown/MediaEmbedExtensionTest.php

Integrated unit tests through Document::renderMarkdown():

  • Default image survives unchanged: ![alt](/foo.png) → <img>
  • Video embed succeeds: ![](/foo.mp4) → <video>, no <img>
  • Mixed Markdown: image, video, YouTube, Vimeo coexist correctly
  • Wiki link coexistence: [[other-doc]] is unaffected
  • Multiple media in one paragraph: ![](/a.mp4) ![](/b.mp4) → two <video>
  • List item: - ![](/a.mp4) → <video> inside <li>

Test data convention

No fixture files. Test inputs are inline string literals so they remain greppable.

Running

docker compose exec php php artisan test --filter=MediaUrlResolverTest
docker compose exec php php artisan test --filter=MediaEmbedExtensionTest

composer test (full suite) must remain green.

Coverage target

No formal coverage measurement. The bar is: every URL pattern listed in the Data Flow Specification has at least one corresponding test case.

Open Items for Implementation Phase

These are deliberately deferred to the planning phase, not the design:

  • Whether to add a docs:rerender Artisan command for existing rows
  • CSS additions for .kb-video, .kb-audio, .kb-embed-* (likely a future task)
  • Updating CLAUDE.md to document the new media-embed convention