Add design spec for Markdown media embed extension

Approved design for extending image syntax `![](url)` to render videos,
audio, YouTube, and Vimeo embeds. Preserves html_input=>strip safety and
existing image/Wiki-link behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yutaka Kurosaki
2026-05-09 10:21:10 +09:00
parent 85f67871fa
commit 01a11328ec
@@ -0,0 +1,311 @@
# Media Embed Design
**Date:** 2026-05-09
**Status:** Approved
**Scope:** Add support for embedding video files, audio files, YouTube, and Vimeo in Markdown documents using the standard image syntax `![](url)`.
## Background
The knowledge base currently renders Markdown via `League\CommonMark` with `html_input => 'strip'`, which removes raw HTML. This is a deliberate safety choice: the project is published as OSS and may be deployed in environments with multiple authors or untrusted input, so raw HTML passthrough is undesirable.
To migrate static pages from a previous WordPress site (which used `<video>` tags and YouTube/Vimeo embeds), Markdown needs a safe mechanism to express media embeds. The chosen approach extends the existing image syntax: when an `![](url)` URL points to a media resource, the rendered output becomes `<video>`, `<audio>`, or `<iframe>` instead of `<img>`.
## Goals
- Support embedding local video and audio files via `![](url)` syntax
- Support YouTube and Vimeo embeds via the same syntax
- Use privacy-enhanced embed modes (`youtube-nocookie.com`, Vimeo `?dnt=1`)
- Preserve existing image rendering and Wiki link behavior unchanged
- Maintain `html_input => 'strip'` for safety
- Provide unit-test coverage for URL parsing and rendering
## Non-Goals
- Custom attributes (width, autoplay, poster) — sizing handled via CSS only
- Other embed providers (Twitch, SoundCloud, Spotify, etc.)
- `og:video` OGP tags
- VTT subtitles / `<track>` elements
- Download cards for zip/binary files (a separate future task)
- Rerendering existing documents (a separate Artisan command may be added later)
## Architecture
```
Markdown input
    │
    ▼
CommonMarkParser
    │ (after parse)
    ▼
DocumentParsedEvent ───► MediaEmbedExtension listener
    │                        Walk Image nodes, classify URL:
    │                        ├─ video extension → <video>
    │                        ├─ audio extension → <audio>
    │                        ├─ YouTube URL     → <iframe> (nocookie)
    │                        ├─ Vimeo URL       → <iframe> (dnt)
    │                        └─ other           → leave unchanged (renders as <img>)
    │                        Replace matching node with HtmlInline
    ▼
HTML output (existing render flow unchanged)
```
The extension lives entirely in CommonMark's event-based AST modification layer. No changes are required to the existing Wiki link, GFM, or image rendering logic.
### Boundary Summary
- **Input:** Markdown string (unchanged)
- **Output:** HTML string — some `![](...)` produce `<video>`, `<audio>`, or `<iframe>` instead of `<img>`
- **Untouched:** Wiki links, GFM extension, default image rendering, `html_input => 'strip'` policy
## Components
### New files
#### `src/app/Markdown/MediaEmbedExtension.php`
CommonMark `ExtensionInterface` implementation. Sole responsibility: register the listener.
- Public API: `register(EnvironmentBuilderInterface $env): void`
- Wires `DocumentParsedEvent` to `MediaEmbedListener::handle`
#### `src/app/Markdown/MediaUrlResolver.php`
Pure URL classification class with no external dependencies. Highly testable.
- Public API: `resolve(string $url): ?string`
  - Returns the replacement HTML string if URL is a recognized media resource
  - Returns `null` if URL should fall through to default image rendering
- Internal helpers:
  - `detectVideo(string $url): ?string`
  - `detectAudio(string $url): ?string`
  - `detectYouTube(string $url): ?string`
  - `detectVimeo(string $url): ?string`
- Order: video → audio → YouTube → Vimeo → null
#### `src/app/Markdown/MediaEmbedListener.php`
Thin glue layer. Receives `DocumentParsedEvent`, walks the AST, and delegates URL classification to `MediaUrlResolver`.
- Public API: `handle(DocumentParsedEvent $event): void`
- For each `Image` node: call resolver; if non-null, replace node with `HtmlInline`
### Modified files
#### `src/app/Models/Document.php` — `renderMarkdown()`
Add a single line:
```php
$converter->getEnvironment()->addExtension(new \App\Markdown\MediaEmbedExtension());
```
No other changes.
### File-split rationale
Separating `MediaUrlResolver` from `MediaEmbedListener` isolates "URL parsing / HTML generation" from "AST manipulation." The former is pure and exhaustively testable; the latter is a thin glue layer. This keeps each unit single-purpose and easier to reason about.
## Data Flow Specification
### Input → Output reference
| Markdown input | Output HTML (key parts) |
|---|---|
| `![alt](/foo.png)` | `<img src="/foo.png" alt="alt">` *(default, unchanged)* |
| `![](/demo.mp4)` | `<video src="/demo.mp4" controls class="kb-video"></video>` |
| `![](/audio.mp3)` | `<audio src="/audio.mp3" controls class="kb-audio"></audio>` |
| `![](https://youtu.be/abc123XYZ_-)` | `<iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe>` |
| `![](https://www.youtube.com/watch?v=abc123XYZ_-&t=30s)` | `<iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-?start=30" ...></iframe>` |
| `![](https://www.youtube.com/shorts/abc123XYZ_-)` | `<iframe src="https://www.youtube-nocookie.com/embed/abc123XYZ_-" ...></iframe>` |
| `![](https://vimeo.com/123456789)` | `<iframe src="https://player.vimeo.com/video/123456789?dnt=1" ...></iframe>` |
| `![](https://vimeo.com/123456789#t=30s)` | `<iframe src="https://player.vimeo.com/video/123456789?dnt=1#t=30s" ...></iframe>` |
### Extension matching (case-insensitive)
- Video: `mp4`, `webm`, `ogv`, `mov`, `m4v`
- Audio: `mp3`, `wav`, `ogg`, `m4a`
Matching is performed on the URL **path** only (after stripping `?query` and `#fragment`) so signed CDN URLs with `?token=...` are not misclassified.
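The path-only matching rule can be sketched as follows. This is a minimal illustration, not the approved implementation: the function name `classifyMediaExtension` and its `'video'`/`'audio'` return values are hypothetical, chosen only to show how `parse_url` keeps query strings and fragments out of the extension check.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not named in the spec): returns 'video', 'audio', or null.
function classifyMediaExtension(string $url): ?string
{
    $videoExts = ['mp4', 'webm', 'ogv', 'mov', 'm4v'];
    $audioExts = ['mp3', 'wav', 'ogg', 'm4a'];

    // PHP_URL_PATH excludes ?query and #fragment; false/null/'' means no usable path.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null;
    }

    // Case-insensitive comparison on the final path extension only.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (in_array($ext, $videoExts, true)) {
        return 'video';
    }
    if (in_array($ext, $audioExts, true)) {
        return 'audio';
    }

    return null;
}
```

With this approach, `https://example.com/path/demo.mp4?token=abc` is classified as video, while `/photo.jpg?file=clip.mp3` is not classified as audio because the `.mp3` appears only in the query string.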
### YouTube URL recognition
The video ID is the strict pattern `[A-Za-z0-9_-]{11}`. Recognized URL forms:
| Pattern | Example |
|---|---|
| `youtu.be/{id}` | `https://youtu.be/abc123XYZ_-` |
| `youtube.com/watch?v={id}` | `https://www.youtube.com/watch?v=abc123XYZ_-` |
| `youtube.com/shorts/{id}` | `https://www.youtube.com/shorts/abc123XYZ_-` |
| `youtube.com/embed/{id}` | `https://www.youtube.com/embed/abc123XYZ_-` |
| `m.youtube.com/...` | mobile variant of the above |
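One way to recognize the URL forms in the table above is to check the host against an allowlist and then apply the strict ID pattern to the path or the `v` parameter. The sketch below is an assumption-laden illustration: the function name `extractYouTubeId` and the exact host allowlist (`youtube.com`, `www.youtube.com`, `m.youtube.com`, `youtu.be`) are hypothetical and may differ from the final implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the 11-character video ID, or null if the URL
// is not one of the recognized YouTube forms.
function extractYouTubeId(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';
    $id = '[A-Za-z0-9_-]{11}'; // strict ID pattern from the spec

    if ($host === 'youtu.be') {
        return preg_match('#^/(' . $id . ')\z#', $path, $m) === 1 ? $m[1] : null;
    }

    // Assumed host allowlist; anything else is rejected outright.
    if (!in_array($host, ['youtube.com', 'www.youtube.com', 'm.youtube.com'], true)) {
        return null;
    }

    if ($path === '/watch') {
        parse_str($parts['query'] ?? '', $query);
        $v = $query['v'] ?? null;

        return is_string($v) && preg_match('/^' . $id . '\z/', $v) === 1 ? $v : null;
    }

    return preg_match('#^/(?:shorts|embed)/(' . $id . ')\z#', $path, $m) === 1 ? $m[1] : null;
}
```

Anchoring the pattern at both ends means a path with extra characters, such as a quote or angle bracket after the ID, yields `null` rather than a partial match.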
Timestamp normalization (first match wins; `t` preferred over `start`):
- `?t=30s` / `?t=30` / `&t=1m20s` → seconds → `?start=N`
- `?start=N` → preserved
- No timestamp → no `?start` parameter
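The normalization rules above could be implemented roughly as below. Both function names are hypothetical. Two behaviors are assumptions not stated in the spec: an hours component (`1h2m3s`) is accepted, and an unparseable `t` falls through to `start` instead of aborting.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: converts a `t` or `start` value to whole seconds, or null.
function youTubeTimestampToSeconds(string $value): ?int
{
    if ($value === '') {
        return null;
    }
    // Plain integer: already seconds (e.g. "90").
    if (preg_match('/^\d+\z/', $value) === 1) {
        return (int) $value;
    }
    // Unit form such as "30s", "1m20s", or (assumed) "1h2m3s".
    if (preg_match('/^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?\z/', $value, $m) !== 1) {
        return null;
    }

    // Unmatched groups are '' or absent; both cast to 0.
    return ((int) ($m[1] ?? 0)) * 3600 + ((int) ($m[2] ?? 0)) * 60 + (int) ($m[3] ?? 0);
}

// Hypothetical helper: picks the start offset from parsed query parameters,
// preferring `t` over `start`.
function youTubeStartSeconds(array $query): ?int
{
    foreach (['t', 'start'] as $key) {
        if (isset($query[$key]) && is_string($query[$key])) {
            $seconds = youTubeTimestampToSeconds($query[$key]);
            if ($seconds !== null) {
                return $seconds;
            }
        }
    }

    return null; // no timestamp: the embed URL gets no ?start parameter
}
```

For example, `?t=1m20s` resolves to 80 seconds, so the embed URL would end in `?start=80`.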
### Vimeo URL recognition
| Pattern | Example |
|---|---|
| `vimeo.com/{id}` | `https://vimeo.com/123456789` |
| `player.vimeo.com/video/{id}` | `https://player.vimeo.com/video/123456789` |
ID is digits only.
Timestamp:
- `#t=30s` → preserved as `#t=30s` (Vimeo convention)
- `?t=30s` → converted to `#t=30s`
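A possible shape for the Vimeo mapping is sketched below. The function name `vimeoEmbedUrl` is hypothetical. Accepting `www.vimeo.com` and restricting timestamp values to digits plus `h`, `m`, and `s` are assumptions made for the illustration, not requirements from the spec.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: maps a Vimeo page or player URL to a player URL with
// dnt=1, or returns null if the URL is not a recognized Vimeo form.
function vimeoEmbedUrl(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';

    if (in_array($host, ['vimeo.com', 'www.vimeo.com'], true)) { // www is an assumption
        $pattern = '#^/(\d+)\z#';
    } elseif ($host === 'player.vimeo.com') {
        $pattern = '#^/video/(\d+)\z#';
    } else {
        return null;
    }

    if (preg_match($pattern, $path, $m) !== 1) {
        return null; // ID must be digits only
    }

    $embed = 'https://player.vimeo.com/video/' . $m[1] . '?dnt=1';

    // A #t=... fragment is kept; a ?t=... query value is moved into the fragment.
    parse_str($parts['query'] ?? '', $query);
    $timestamp = null;
    if (preg_match('/^t=([0-9hms]+)\z/', $parts['fragment'] ?? '', $fm) === 1) {
        $timestamp = $fm[1];
    } elseif (isset($query['t']) && is_string($query['t'])
        && preg_match('/^[0-9hms]+\z/', $query['t']) === 1) {
        $timestamp = $query['t'];
    }

    return $timestamp === null ? $embed : $embed . '#t=' . $timestamp;
}
```

Because the ID and timestamp are both validated against narrow character classes before being concatenated, the returned URL contains no user-controlled characters outside digits and `h`/`m`/`s`.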
### iframe attribute template
```html
<iframe src="..."
width="560" height="315"
loading="lazy"
referrerpolicy="strict-origin-when-cross-origin"
allow="autoplay; encrypted-media; picture-in-picture"
allowfullscreen
frameborder="0"
class="kb-embed kb-embed-{provider}">
</iframe>
```
`{provider}` is `youtube` or `vimeo`. Class hooks let CSS introduce aspect-ratio control later.
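As an illustration of how the template might be filled in, the sketch below builds the iframe string with every dynamic value escaped. The function name `renderEmbedIframe` is hypothetical, and emitting the attributes on a single line (rather than the multi-line layout shown above) is a presentational choice of this example.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: renders the iframe template for 'youtube' or 'vimeo'.
function renderEmbedIframe(string $src, string $provider): string
{
    // Escape every dynamic value before it reaches an attribute.
    $safeSrc = htmlspecialchars($src, ENT_QUOTES, 'UTF-8');
    $safeProvider = htmlspecialchars($provider, ENT_QUOTES, 'UTF-8');

    return '<iframe src="' . $safeSrc . '"'
        . ' width="560" height="315"'
        . ' loading="lazy"'
        . ' referrerpolicy="strict-origin-when-cross-origin"'
        . ' allow="autoplay; encrypted-media; picture-in-picture"'
        . ' allowfullscreen'
        . ' frameborder="0"'
        . ' class="kb-embed kb-embed-' . $safeProvider . '"></iframe>';
}
```

Calling `renderEmbedIframe('https://player.vimeo.com/video/123456789?dnt=1', 'vimeo')` would produce an element carrying the classes `kb-embed kb-embed-vimeo`.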
### Resolution order
1. Video extension → emit `<video>`, return
2. Audio extension → emit `<audio>`, return
3. YouTube → emit `<iframe>`, return
4. Vimeo → emit `<iframe>`, return
5. None match → return `null`; node renders as default `<img>`
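The first-match-wins order can be expressed as a simple loop over detectors. This is only a sketch: `resolveFirstMatch` is a hypothetical name, and the two closures are simplified placeholders standing in for the real `detectVideo` and `detectAudio` helpers.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: runs detectors in order and returns the first non-null result.
 *
 * @param array<int, callable(string): ?string> $detectors
 */
function resolveFirstMatch(string $url, array $detectors): ?string
{
    foreach ($detectors as $detect) {
        $html = $detect($url);
        if ($html !== null) {
            return $html; // first match wins; later detectors are not consulted
        }
    }

    return null; // caller leaves the Image node untouched, so it renders as <img>
}

// Placeholder detectors for illustration only; the real ones emit full tags.
$detectors = [
    static fn (string $url): ?string => preg_match('/\.mp4\z/i', $url) === 1 ? '<video>' : null,
    static fn (string $url): ?string => preg_match('/\.mp3\z/i', $url) === 1 ? '<audio>' : null,
];
```

Keeping the order in a single list makes it straightforward to add a provider later without touching the loop.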
## Error Handling and Edge Cases
| Case | Behavior | Reason |
|---|---|---|
| `parse_url` failure | return `null` → default `<img>` | Fall back to CommonMark default |
| URL with no extension | return `null` → default `<img>` | Extension matching is path-suffix based |
| YouTube ID does not match `[A-Za-z0-9_-]{11}` | return `null` → default `<img>` | Strict matching avoids false positives |
| Vimeo ID is not digits | return `null` → default `<img>` | Same |
| Empty URL | return `null` | `parse_url` returns empty path |
**Principle:** Unrecognized URLs are not transformed. Exceptions are not thrown. Default CommonMark rendering handles the fallback.
### XSS hardening
All output URLs are passed through `htmlspecialchars($url, ENT_QUOTES, 'UTF-8')` before being embedded in HTML strings. Attack-vector analysis:
- `![](javascript:alert(1))`: does not match a media extension → `null` → CommonMark's `allow_unsafe_links => false` blocks `<img src="javascript:...">`
- `![](https://youtu.be/"><script>...)`: the strict ID regex `[A-Za-z0-9_-]{11}` cannot extract an ID from a URL containing `"` or `>` → `null` → default rendering, where CommonMark also escapes the URL
- `![](/foo.mp4")`: the trailing quote breaks extension matching at the path-cleaning step; even if it passed, `htmlspecialchars` would escape the output
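The escaping step can be illustrated with a minimal `<video>` renderer. The function name `renderVideoTag` is hypothetical; the output format follows the Input → Output reference table.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: renders a <video> tag with the URL escaped for use
// inside a double-quoted attribute.
function renderVideoTag(string $url): string
{
    // ENT_QUOTES escapes both " and ', so the value cannot close the attribute.
    $safeUrl = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    return '<video src="' . $safeUrl . '" controls class="kb-video"></video>';
}
```

A URL such as `/a"onerror="x.mp4` would be emitted with each `"` replaced by `&quot;`, so it stays inside the `src` attribute instead of introducing a new one.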
### Relation to `html_input => 'strip'`
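The `'strip'` setting is preserved, so raw HTML written by users in source Markdown is still removed. The design assumes that `HtmlInline` nodes inserted programmatically by a registered extension are not stripped. This assumption must be verified against the installed `league/commonmark` version during implementation: if the `HtmlInline` renderer applies the `html_input` filter at render time, the replacement will need a dedicated node type with its own renderer instead. The intended result is: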
- User-written `<script>` in Markdown source → still stripped
- `<video>` / `<iframe>` inserted by `MediaEmbedExtension` → output as intended
- The security boundary becomes "the extension is responsible for its own escaping," which is enforced by passing all dynamic URL fragments through `htmlspecialchars`.
### `alt` and `title`
Markdown image syntax allows `![alt](url "title")`.
- `<video>` / `<audio>` have no `alt` attribute → ignored
- `title` is preserved on `<video>` / `<audio>` as `title="..."` (optional)
- iframes ignore both (the YouTube/Vimeo player surfaces its own title)
VTT subtitles / `<track>` elements are out of scope.
### Multiple media in one paragraph
```markdown
![](/a.mp4) and ![](/b.mp4)
```
Two `<video>` elements appear within the same `<p>`. `<video>` is phrasing content per the HTML spec, so this is valid. CSS can apply `display: block` if needed.
### Existing documents
Existing rows in `documents.rendered_html` may be stale after this change. Mitigation is left to the implementation phase — most likely a `docs:rerender` Artisan command (or a one-off `tinker` invocation) that re-saves each `Document` to trigger the existing render hook. This is **not part of the design scope** and should be tracked separately during implementation planning.
## Testing Strategy
### `tests/Unit/Markdown/MediaUrlResolverTest.php`
Pure-unit tests against `MediaUrlResolver::resolve`.
**Video extensions** (one case per extension):
- `/demo.mp4`, `/demo.webm`, `/demo.ogv`, `/demo.mov`, `/demo.m4v` → `<video>` output
- `/demo.MP4` (uppercase) → recognized
- `https://example.com/path/demo.mp4?token=abc` → query stripped, recognized
**Audio extensions** (one case per extension):
- `/clip.mp3`, `/clip.wav`, `/clip.ogg`, `/clip.m4a` → `<audio>` output
**YouTube** (full URL pattern coverage):
- `https://youtu.be/dQw4w9WgXcQ`
- `https://www.youtube.com/watch?v=dQw4w9WgXcQ`
- `https://www.youtube.com/shorts/dQw4w9WgXcQ`
- `https://www.youtube.com/embed/dQw4w9WgXcQ`
- `https://m.youtube.com/watch?v=dQw4w9WgXcQ`
- Timestamps: `?t=30s`, `?t=90`, `?t=1m20s`, `?start=30`
- Output contains `youtube-nocookie.com`
**Vimeo:**
- `https://vimeo.com/123456789`
- `https://player.vimeo.com/video/123456789`
- Timestamps: `#t=30s`, `?t=30s`
- Output contains `?dnt=1`
**Fallback (returns `null`):**
- Normal images: `/photo.jpg`, `/icon.svg`
- No extension: `/foo`
- Invalid URL: empty string, `javascript:alert(1)`, `http://`
- Negative-match candidates: `https://example.com/youtu.be-fake/abc` (host mismatch)
- Invalid YouTube ID: `https://youtu.be/short` (fewer than 11 characters), IDs containing special characters
**XSS resilience:**
- `https://youtu.be/abc"><script>` → `null` (strict ID extraction fails)
- Video URL containing `"` produces escaped output
### `tests/Unit/Markdown/MediaEmbedExtensionTest.php`
Integrated unit tests through `Document::renderMarkdown()`:
- Default image survives unchanged: `![alt](/foo.png)` → `<img>`
- Video embed succeeds: `![](/foo.mp4)` → `<video>`, no `<img>`
- Mixed Markdown: image, video, YouTube, Vimeo coexist correctly
- Wiki link coexistence: `[[other-doc]]` is unaffected
- Multiple media in one paragraph: `![](/a.mp4) ![](/b.mp4)` → two `<video>`
- List item: `- ![](/a.mp4)` → `<video>` inside `<li>`
### Test data convention
No fixture files. Test inputs are inline string literals so they remain greppable.
### Running
```bash
docker compose exec php php artisan test --filter=MediaUrlResolverTest
docker compose exec php php artisan test --filter=MediaEmbedExtensionTest
```
`composer test` (full suite) must remain green.
### Coverage target
No formal coverage measurement. The bar is: **every URL pattern listed in the Data Flow Specification has at least one corresponding test case.**
## Open Items for Implementation Phase
These are deliberately deferred to the planning phase, not the design:
- Whether to add a `docs:rerender` Artisan command for existing rows
- CSS additions for `.kb-video`, `.kb-audio`, `.kb-embed-*` (likely a future task)
- Updating `CLAUDE.md` to document the new media-embed convention