The logic for syncing video timestamps to the transcript's clock times has been fun to figure out. It uses chapter markers and the times in the transcript to estimate the pre-roll. I'm grabbing audio snippets from YT and running them through Whisper to match them against the transcript text for the final offset calc.
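
A rough sketch of what the offset math could look like in Python, not the actual code. It assumes the transcript carries `HH:MM:SS` clock times, that each chapter marker can be paired with the transcript time for the same moment, and that Whisper has already turned an audio snippet into plain text. All function names and sample values here are made up for illustration.

```python
from difflib import SequenceMatcher
from statistics import median


def clock_to_seconds(clock: str) -> int:
    """Convert an 'HH:MM:SS' clock time to seconds since midnight."""
    h, m, s = (int(part) for part in clock.split(":"))
    return h * 3600 + m * 60 + s


def estimate_offset(pairs):
    """Estimate (clock seconds - video seconds) from matched anchor points.

    pairs: list of (video_seconds, transcript_clock) for the same moment,
    e.g. a chapter marker and the transcript line it corresponds to.
    The median keeps one badly placed chapter marker from skewing the result.
    """
    return median(clock_to_seconds(clock) - video_s for video_s, clock in pairs)


def estimate_preroll(offset, transcript_start_clock: str):
    """Video seconds that elapse before the first transcript line."""
    return clock_to_seconds(transcript_start_clock) - offset


def best_match(snippet_text: str, transcript_lines):
    """Return the index of the transcript line most similar to the snippet.

    transcript_lines: list of (clock, text). snippet_text is assumed to be
    the text Whisper produced for a short audio clip.
    """
    snippet = snippet_text.lower()
    scores = [
        SequenceMatcher(None, snippet, text.lower()).ratio()
        for _, text in transcript_lines
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical sample data: chapter markers paired with transcript times.
anchor_pairs = [(95, "10:00:05"), (700, "10:10:12"), (1510, "10:23:40")]
rough_offset = estimate_offset(anchor_pairs)
preroll = estimate_preroll(rough_offset, "10:00:00")

# Hypothetical refinement: a snippet cut at video second 702, matched by text.
transcript = [
    ("10:00:05", "Call to order and roll call"),
    ("10:10:12", "Public comment on the budget item"),
    ("10:23:40", "Motion to adjourn the meeting"),
]
idx = best_match("public comment on the budget item", transcript)
final_offset = clock_to_seconds(transcript[idx][0]) - 702
```

The chapter markers give a coarse offset that narrows down where to cut snippets, and the text match then pins the offset to a specific transcript line. Plain string similarity is only a stand-in here; a real matcher would probably need to handle Whisper's transcription errors and snippets that span several transcript lines.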