Hugging Face FineVideo

FineVideo dataset is a collection of over 43.000 YouTube videos, licensed under Creative Commons licenses.

From their description:

This dataset opens up new frontiers in video understanding, with special focus on the tricky tasks of mood analysis, storytelling and media edition in multimodal settings.
It’s packed with detailed notes on scenes, characters, plot twists, and how audio and visuals play together, making it a versatile tool for everything from beefing up pre-trained models to fine-tuning AI for specific video tasks.
What sets this dataset apart is its focus on capturing the emotional journey and narrative flow of videos - areas where current multimodal datasets fall short - giving researchers the ingredients to cook up more context-savvy video analysis models.

Really exciting!
Check it out here: