take-classifier
Classifies raw video takes against storyboard requirements. Scores each take for quality, content match, and usability.
| Model |
|---|
| sonnet |
Full Agent Prompt
Analyzes raw footage transcripts against storyboard beat requirements and classifies every take. Produces scored classification data used by edit-planner to build the EDL.
Inputs Required
| File | Description |
|---|---|
| `storyboard.json` | Beat-by-beat requirements with expected spoken lines |
| `transcripts/*.json` | Per-clip Deepgram transcripts with word-level timing |
| `clip-manifest.json` | Technical metadata for each clip |
Classification Process
For each transcript:
- Read full transcript text and word-level timing
- Compare against each storyboard beat’s `spokenLines` using fuzzy text similarity
- Detect mess-ups: self-corrections, long pauses mid-sentence, restarts
- Detect silence takes: < 10 words total or > 80% silence
- Detect outtakes: content that doesn’t match any storyboard beat
- Assign classification: `good`, `mess_up`, `partial`, `silence`, `outtake`
- Score `good` and `partial` takes on four dimensions
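The steps above can be sketched in Python as follows. This is a minimal illustration, not the agent's actual logic: the word-level `difflib` comparison, the `good_threshold` and `match_threshold` values, the `id` field on each beat, the assumption that `spokenLines` is a list of strings, and the order in which the checks run are all assumptions made for the example.

```python
from __future__ import annotations

import re
from difflib import SequenceMatcher


def tokens(text: str) -> list[str]:
    """Lowercase words with punctuation removed."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(a: str, b: str) -> float:
    """Word-level fuzzy similarity in [0.0, 1.0]."""
    return SequenceMatcher(None, tokens(a), tokens(b), autojunk=False).ratio()


def best_beat(transcript: str, beats: list[dict]) -> tuple[str | None, float]:
    """Return (beat id, similarity) for the closest storyboard beat."""
    best_id, best_score = None, 0.0
    for beat in beats:
        # Assumed shape: {"id": str, "spokenLines": [str, ...]}
        score = similarity(transcript, " ".join(beat["spokenLines"]))
        if score > best_score:
            best_id, best_score = beat["id"], score
    return best_id, best_score


def classify(
    transcript: str,
    silence_ratio: float,
    beats: list[dict],
    has_mess_up: bool = False,
    good_threshold: float = 0.8,   # hypothetical cutoff for "good"
    match_threshold: float = 0.4,  # hypothetical cutoff below which nothing matches
) -> str:
    """Assign one of: good, mess_up, partial, silence, outtake."""
    if has_mess_up:
        return "mess_up"
    # Silence rule from the list above: < 10 words or > 80% silence.
    if len(tokens(transcript)) < 10 or silence_ratio > 0.8:
        return "silence"
    _, score = best_beat(transcript, beats)
    if score < match_threshold:
        return "outtake"
    return "good" if score >= good_threshold else "partial"
```

Comparing word tokens rather than raw characters keeps the score from being inflated by shared letters in otherwise unrelated sentences. A character-level or embedding-based measure would be an equally valid choice.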
Scoring Dimensions
| Dimension | Weight | Measurement |
|---|---|---|
| `content_match` | 50% | Fuzzy similarity to target spoken lines (0.0-1.0) |
| `delivery_quality` | 30% | Word confidence avg + pause pattern analysis (0.0-1.0) |
| `technical_quality` | 20% | Clip manifest: resolution, codec health, audio presence (0.0-1.0) |
| `usability` | composite | `content_match` × 0.5 + `delivery_quality` × 0.3 + `technical_quality` × 0.2 |
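As a worked illustration of the composite, the weighted sum can be written as a small Python function. The weights come from the table. The function name, the dictionary shape, and the range check are assumptions made for this sketch.

```python
WEIGHTS = {
    "content_match": 0.5,
    "delivery_quality": 0.3,
    "technical_quality": 0.2,
}


def usability(scores: dict) -> float:
    """Weighted sum of the three component scores, each expected in [0.0, 1.0]."""
    for name in WEIGHTS:
        value = scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())
```

For example, a take with `content_match` 0.9, `delivery_quality` 0.8, and `technical_quality` 1.0 scores 0.9 × 0.5 + 0.8 × 0.3 + 1.0 × 0.2 = 0.89.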
Mess-Up Detection Signals
| Signal | Classification |
|---|---|
| “sorry”, “let me start over”, “ugh” | `mess_up` |
| Same phrase repeated 2+ times | `mess_up` |
| > 3s silence mid-utterance | `mess_up` (unless beat expects a dramatic pause) |
| Transcript confidence < 0.60 | `partial` or `mess_up` |
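The signals in this table could be checked roughly as shown below. This is a sketch under stated assumptions: each timed word is taken to be a dict with `word`, `start`, `end`, and `confidence` keys, a "phrase" is approximated as a three-word n-gram, confidence is averaged over the clip, and the signal names and the `timed` helper are hypothetical.

```python
import re
from collections import Counter

VERBAL_CUES = ("sorry", "let me start over", "ugh")


def has_verbal_cue(text: str) -> bool:
    """True if any cue from the table appears as a whole word or phrase."""
    lowered = text.lower()
    return any(re.search(rf"\b{re.escape(cue)}\b", lowered) for cue in VERBAL_CUES)


def has_repeated_phrase(words: list, n: int = 3, min_repeats: int = 2) -> bool:
    """True if any n-word phrase occurs at least min_repeats times."""
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return any(count >= min_repeats for count in grams.values())


def has_long_gap(timed_words: list, max_gap: float = 3.0) -> bool:
    """True if more than max_gap seconds separate two consecutive words."""
    pairs = zip(timed_words, timed_words[1:])
    return any(nxt["start"] - cur["end"] > max_gap for cur, nxt in pairs)


def mess_up_signals(timed_words: list, expects_pause: bool = False) -> list:
    """Return the names of every signal that fired for one transcript."""
    signals = []
    words = [w["word"].lower() for w in timed_words]
    if has_verbal_cue(" ".join(words)):
        signals.append("verbal_cue")
    if has_repeated_phrase(words):
        signals.append("repeated_phrase")
    # A beat that calls for a dramatic pause exempts the silence rule.
    if not expects_pause and has_long_gap(timed_words):
        signals.append("long_pause")
    if timed_words:
        mean_conf = sum(w["confidence"] for w in timed_words) / len(timed_words)
        if mean_conf < 0.60:
            signals.append("low_confidence")
    return signals


def timed(text: str, gap: float = 0.1, confidence: float = 0.95) -> list:
    """Toy helper: give each word a 0.3 s duration with `gap` seconds between words."""
    out, t = [], 0.0
    for word in text.split():
        out.append({"word": word, "start": t, "end": t + 0.3, "confidence": confidence})
        t += 0.3 + gap
    return out
```

Returning the list of fired signals, rather than a single boolean, leaves the final choice between `partial` and `mess_up` to the caller, since the table allows either outcome for low confidence.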
Output
Produces classification data as structured output, which edit-planner reads to build the EDL. A classification summary is printed to the conversation for user review before proceeding.
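The exact output schema is not specified here. As a rough sketch of how a review summary might be derived from the classification data, the snippet below counts takes per label. The record shape (dicts with `clip` and `classification` keys) and the `summarize` function are hypothetical.

```python
from collections import Counter

LABELS = ("good", "partial", "mess_up", "silence", "outtake")


def summarize(classifications: list) -> str:
    """Build a short per-label count for user review."""
    counts = Counter(item["classification"] for item in classifications)
    lines = [f"{len(classifications)} takes classified"]
    for label in LABELS:
        lines.append(f"  {label}: {counts.get(label, 0)}")
    return "\n".join(lines)
```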