Things I Learned Making the YOLOv5 Integration Video
A quick ramble about my experience with video production of machine learning content
Created on May 5|Last edited on May 5
I've been working for a few weeks on the YOLOv5 and Weights & Biases integration overview video, which was recently released. So, I thought I'd share a few things I've personally learned in the process of making it.
Combining voice over and talking head style
Most of the videos I've done to date (like on my YouTube channel) showed things on screen, with the webcam view of me in the corner trying to explain things.

Kinda like this
However, when trying to be concise and really wanting to get the point across in the densest fashion possible, I think it really does take writing a video script first.
An earlier YOLOv5 & W&B video I made was a scripted voice over. I really liked the style of it, but also wanted to add a little bit more of a personal touch to videos like these.
So, the actual new thing I tried was keeping in mind the parts that would become voice overs, talking heads, and animations while working on the video script in the first place.
And, most importantly, recording all of them separately. It may sound like overcomplicating things (and to be fair, it sort of is. Ugh, don't get me started on how long the editing process takes, haha).
But it does have upsides too.
For starters, it's really the only way that I found to make these overview-type videos while also having elements of talking into the camera.
Before, I would attempt to talk into the camera from a script (memorize a line, then say it, and repeat) for the whole video. And, it's hard enough to say long sentences about which parameters to pass after the train.py script once, let alone doing so many times on camera.
I also found that, though the whole approach is more work, it sort of does make it easier to get started (or at least to know what you're supposed to be working on at a particular point), as most tasks are now separated.
It looks something like this: research the topic, run tests, make an outline, write a script, record the voice parts, the talking head parts, the screen captures, edit the video, get feedback, improve the video, and so on.
Splitting complex tasks into smaller parts is a well-known way to avoid that initial resistance when starting to work on something big, so it was super interesting to see the effects of that in my case, even if only a little.
Honorable mentions
- Making an animation explaining Artifacts featuring really sick animal images from the COCO dataset
- Getting quicker and better at asking for and applying feedback. (It really does help to have wonderful human beings you work with to show the video drafts to.)
- Most important lesson: turns out there is actually a tool in Premiere Pro to select all of the video and audio clips going forward from a specific point. No idea how I got by before!
Shout-out to Cayla for pointing it out to me!
Conclusion
So, that was a quick ramble on a few new things I learned when making this video. And I am for sure continuing to learn more.
Happy to hear your thoughts/experiences making any sort of content (ML related or not) in the comments section.
And if you wanna learn about object detection with YOLOv5 and Weights & Biases, here's that video I made: