Directors on Directors: a RAG over Variety's interview series

Variety’s Directors on Directors is a long-running series of candid conversations between pairs of filmmakers about craft: how they work with actors, find a film’s rhythm, handle a scene that isn’t landing. I wanted to ask questions across all of the conversations at once (“what do directors say about the first day of a shoot?”) instead of watching dozens of interviews to find the through-lines.

So I built a retrieval-augmented generation (RAG) pipeline over the transcripts.

How it works

Conversations from 49 directors are chunked and embedded into a ChromaDB vector store, with Wikipedia articles added to cover the directors’ bios and filmographies.
LangChain orchestrates retrieval, and Gemini generates grounded answers with citations back to who said what.
A RAGAS eval harness scores retrieval and answer quality, so I can tune chunking, embeddings, and prompts against something measurable instead of vibes.
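
The three steps above can be sketched in plain Python. This is a minimal illustration under loud assumptions, not the code in the repo: the bag-of-words embed function stands in for a real embedding model, the in-memory list stands in for ChromaDB, build_prompt stands in for the LangChain and Gemini step, and the "Director A" and "Director B" lines are made-up placeholders rather than quotes from any transcript. All function names here are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    speaker: str  # kept as metadata so answers can cite who said what
    text: str


def chunk_transcript(turns, max_words=40):
    """Split each (speaker, utterance) turn into chunks of at most max_words words."""
    chunks = []
    for speaker, utterance in turns:
        words = utterance.split()
        for start in range(0, len(words), max_words):
            chunks.append(Chunk(speaker, " ".join(words[start:start + max_words])))
    return chunks


def embed(text):
    """Toy bag-of-words vector (assumption: a real pipeline calls an embedding model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]


def build_prompt(query, retrieved):
    """Number each retrieved chunk so the generator can cite it by index and speaker."""
    context = "\n".join(f"[{i + 1}] {c.speaker}: {c.text}" for i, c in enumerate(retrieved))
    return f"Answer using only the numbered sources and cite them.\n\n{context}\n\nQuestion: {query}"


# Placeholder turns, invented for illustration only.
turns = [
    ("Director A", "On the first day of a shoot I schedule an easy scene so the crew finds its rhythm."),
    ("Director B", "I rewrite dialogue with the actors in rehearsal until it sounds like them."),
    ("Director A", "In the edit I cut anything that slows the rhythm of the film."),
]
question = "what do directors say about the first day of a shoot?"
chunks = chunk_transcript(turns)
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Carrying the speaker on every chunk is what makes "who said what" citations possible later: the generator only ever sees text that is already labeled with its source.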

How well it works

On the current eval set, answers score 0.98 on faithfulness: they stay grounded in what the directors actually said instead of hallucinating. It’s weaker on comparison questions (“how do X and Y differ on directing actors?”), where retrieval has to pull the right passages from two different conversations at once, and context recall drops. Knowing exactly where it breaks is the point of having the evals. The next lever is improving retrieval for those multi-source questions.
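
To make the comparison-question failure concrete, here is a rough sketch of what context recall measures: the share of claims in a reference answer that the retrieved passages actually support. It is a simplification under stated assumptions. RAGAS uses an LLM to judge support, while this version uses a crude word-overlap threshold, and the claims and passages are invented placeholders, not transcript text or real eval data.

```python
import re


def tokens(text):
    """Lowercased word set for a crude overlap check."""
    return set(re.findall(r"[a-z']+", text.lower()))


def is_supported(claim, contexts, threshold=0.6):
    """Assumption: a claim counts as supported if one chunk covers enough of its words."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return False
    return any(len(claim_tokens & tokens(c)) / len(claim_tokens) >= threshold for c in contexts)


def context_recall(reference_claims, contexts):
    """Fraction of reference claims supported by the retrieved contexts."""
    supported = sum(is_supported(claim, contexts) for claim in reference_claims)
    return supported / len(reference_claims)


# A comparison question needs evidence from both conversations (placeholder text).
reference_claims = [
    "Director A gives actors very few notes",
    "Director B rehearses with actors for weeks",
]

# Retrieval that only surfaced passages from one conversation.
one_sided_contexts = [
    "Director A: I give actors very few notes on set.",
    "Director A: Too many notes make actors self-conscious.",
]

# Retrieval that surfaced a passage from each conversation.
balanced_contexts = [
    "Director A: I give actors very few notes on set.",
    "Director B: I rehearse with actors for weeks before we shoot.",
]

one_sided = context_recall(reference_claims, one_sided_contexts)
balanced = context_recall(reference_claims, balanced_contexts)
print(one_sided, balanced)
```

When every top-ranked passage comes from the same conversation, half of the reference answer has nothing to stand on, which is the pattern behind the recall drop on comparison questions.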

Why I built it

RAG is the building block I reach for most as a PM thinking about AI products, and the best way to understand its sharp edges (chunking, retrieval quality, grounding, evaluation) is to build a real one end to end. Doing it over something I actually care about, how filmmakers think, kept me honest about whether the answers were any good.

Find the GitHub repo here.