ingest-content
Use when building a media ingestion pipeline — processing raw video/audio/image files, extracting technical metadata, AI-driven scene classification, deduplication by perceptual hash, or organizing footage into an indexed library. Also use when setting up batch processing for large media collections or building a RAG-ready asset database.
| Model | Source |
|---|---|
| sonnet | pack: video |
Full Reference
Content Ingestion Pipeline
A systematic pipeline for transforming raw media files into a structured, searchable, AI-enriched library. It has five stages: intake → metadata extraction → AI classification → deduplication → indexing. Each stage can be run on its own or composed with the others.
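As a rough illustration of what composable stages can look like, the sketch below models each stage as an async function from asset to asset and chains any subset of them. The `Asset` shape, the `Stage` type, the `composeStages` helper, and the stub stages are all hypothetical names chosen for this example; they are not taken from the reference files.

```typescript
// Hypothetical asset record; the fields are assumptions for illustration only.
interface Asset {
  path: string;
  metadata?: Record<string, unknown>;
  tags?: string[];
  hash?: string;
}

// A stage takes an asset and resolves to an enriched copy of it.
type Stage = (asset: Asset) => Promise<Asset>;

// Chain stages left to right; any subset of stages can be composed.
function composeStages(...stages: Stage[]): Stage {
  return async (asset: Asset): Promise<Asset> => {
    let current = asset;
    for (const stage of stages) {
      current = await stage(current);
    }
    return current;
  };
}

// Stub stages standing in for the real work (ffprobe, Gemini, hashing, ...).
const extractMetadata: Stage = async (asset) => ({
  ...asset,
  metadata: { container: "placeholder" },
});
const classify: Stage = async (asset) => ({
  ...asset,
  tags: ["placeholder-tag"],
});

// Compose only the stages needed: metadata extraction, then classification.
const metadataAndTags = composeStages(extractMetadata, classify);
```

Because every stage shares one signature, a stage can be skipped, reordered, or run alone without touching the others.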
Quick Reference
| Item | Value |
|---|---|
| Supported video formats | .mp4, .mov, .avi, .mkv, .webm, .mxf |
| Supported audio formats | .mp3, .wav, .aac, .flac, .m4a |
| Supported image formats | .jpg, .jpeg, .png, .webp, .tiff, .raw |
| Metadata tool | fluent-ffmpeg + @ffprobe-installer/ffprobe |
| Image processing | sharp |
| AI classification model | gemini-2.5-flash via @google/genai |
| Dedup algorithm | dHash (difference hash) + Hamming distance |
| Near-duplicate threshold | Hamming distance < 5 |
| Batch concurrency | p-queue, default 4 |
| Database | Postgres + pgvector (HNSW index for embeddings) |
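To make the dedup row concrete, here is a minimal sketch of a 64-bit dHash and the Hamming-distance check with the threshold from the table. It assumes the image has already been reduced to a 9×8 grayscale buffer (72 bytes, row-major); the function names `dHash`, `hammingDistance`, and `isNearDuplicate` are hypothetical, and the hash is stored as 8 bytes (one per row) purely for convenience. The implementation in deduplication.md may differ.

```typescript
// Near-duplicate threshold from the quick reference: Hamming distance < 5.
const NEAR_DUPLICATE_THRESHOLD = 5;

// Compute a 64-bit difference hash from a 9x8 grayscale image.
// Each row yields 8 bits: a bit is 1 when a pixel is brighter than
// its right-hand neighbour.
function dHash(gray9x8: Uint8Array): Uint8Array {
  if (gray9x8.length !== 72) {
    throw new Error(`expected 72 grayscale bytes (9x8), got ${gray9x8.length}`);
  }
  const hash = new Uint8Array(8); // one byte per image row
  for (let row = 0; row < 8; row++) {
    let bits = 0;
    for (let col = 0; col < 8; col++) {
      const left = gray9x8[row * 9 + col];
      const right = gray9x8[row * 9 + col + 1];
      bits = (bits << 1) | (left > right ? 1 : 0);
    }
    hash[row] = bits;
  }
  return hash;
}

// Count the bits that differ between two hashes of equal length.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) {
    throw new Error("hash lengths differ");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = a[i] ^ b[i];
    while (diff !== 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

// Two assets count as near-duplicates when fewer than `threshold` bits differ.
function isNearDuplicate(
  a: Uint8Array,
  b: Uint8Array,
  threshold: number = NEAR_DUPLICATE_THRESHOLD,
): boolean {
  return hammingDistance(a, b) < threshold;
}
```

In a real pipeline the 72-byte input would likely come from sharp, for example by chaining `grayscale()`, `resize(9, 8, { fit: "fill" })`, `raw()`, and `toBuffer()` on a keyframe or image; that wiring is an assumption here and is left out so the sketch has no dependencies.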
Reference Index
| I want to… | File |
|---|---|
| Scan directories, validate file types, or extract video/image metadata with ffprobe/sharp | intake-and-metadata.md |
| Classify scene type, mood, and content tags from a video keyframe using Gemini Vision | ai-classification.md |
| Compute perceptual hashes, detect near-duplicates, or compare with Hamming distance | deduplication.md |
| Run the full batch pipeline, set up the Postgres schema, or write the library index | pipeline-and-indexing.md |
Usage: Read the reference file matching your current task from the index above. Each file is self-contained with code examples and inline gotchas.