Cinemagram

Cinemantic Trajectories

A cinemagram is a visualization of the frames of a video, grouping frames by visuo-semantic similarity. Here are four different 2-D cinemagrams for the same video (with their 3-D cinemagrams below, showing the same 2-D coordinates with time sorted vertically):

2024-06-18.15-31-01.mp4

How this was made

I use the nomic-embed-vision-1.5 encoder to embed each video frame as a 768-dimensional vector, then compare a few ways of viewing those vectors in 2-D. Each 3-D plot uses the same data in the horizontal plane as its corresponding 2-D plot and adds time as the z-axis, running from bottom to top.
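As a rough illustration of how a 3-D plot relates to its 2-D counterpart, the sketch below lifts a list of 2-D coordinates into 3-D by appending each frame's timestamp as z. The function name `add_time_axis` and the `fps` parameter are hypothetical and not taken from main.py. It assumes the coordinates are in frame order and that frames were sampled at a constant rate.

```python
def add_time_axis(coords_2d, fps=1.0):
    """Return (x, y, t) triples, where t is the frame's time in seconds.

    Assumption: coords_2d[i] belongs to the i-th sampled frame, and frames
    were sampled at a constant rate of `fps` frames per second.
    """
    return [(x, y, i / fps) for i, (x, y) in enumerate(coords_2d)]
```

The x and y values pass through unchanged, so the horizontal plane of the 3-D plot matches the 2-D plot exactly.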

The plots show three different dimensionality reduction techniques for converting the 768-D vectors into 2-D vectors: PCA, t-SNE, and UMAP. Each technique has its own benefits and limitations, but each identifies at least some visuo-semantic structure in the video: PCA roughly finds a cluster representing the credits sequence, t-SNE finds a clear cluster representing duck, and UMAP finds clear clusters representing red guy and the kitchen.
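For intuition about the simplest of the three reductions, here is a minimal PCA sketch in plain Python that uses power iteration to find the top two principal components. It is an illustration only: `pca_2d` and its parameters are hypothetical names, and main.py may well rely on a library implementation instead. t-SNE and UMAP are not shown.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_2d(vectors, iterations=100, seed=0):
    """Project row vectors onto their top two principal components."""
    n, d = len(vectors), len(vectors[0])
    # Center the data so the components describe variance, not the mean.
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iterations):
            # Multiply by the (unnormalized) covariance matrix: X^T (X w).
            scores = [_dot(row, w) for row in centered]
            w = [sum(s * row[j] for s, row in zip(scores, centered))
                 for j in range(d)]
            # Deflate: remove any part along components already found.
            for c in components:
                overlap = _dot(w, c)
                w = [a - overlap * b for a, b in zip(w, c)]
            norm = math.sqrt(_dot(w, w))
            if norm == 0.0:
                raise ValueError("data has fewer than two directions of variance")
            w = [a / norm for a in w]
        components.append(w)

    # Coordinates of each frame in the plane spanned by the two components.
    return [[_dot(row, c) for c in components] for row in centered]
```

Passing the list of 768-dimensional frame embeddings returns one `[x, y]` pair per frame. Plain-Python loops are slow for long videos, so a library PCA is the more practical choice there.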

To convert a video into frames (frame_0001.png, frame_0002.png, etc.), I used ffmpeg, sampling one frame per second:

ffmpeg -i input.mp4 -vf fps=1 output_dir/frame_%04d.png
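The same step could be scripted from Python. The following is a sketch under assumptions: the helper names (`ffmpeg_frame_command`, `extract_frames`, `sorted_frames`) are hypothetical and not from main.py, and `ffmpeg` is assumed to be on the PATH. It creates the output directory first, since ffmpeg does not create it.

```python
import subprocess
from pathlib import Path


def ffmpeg_frame_command(video, out_dir, fps=1):
    """Build the argument list for the ffmpeg call shown above."""
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}",
            f"{out_dir}/frame_%04d.png"]


def sorted_frames(out_dir):
    """Return extracted frame paths in order (zero-padded names sort by time)."""
    return sorted(Path(out_dir).glob("frame_*.png"))


def extract_frames(video, out_dir, fps=1):
    """Run ffmpeg and return the resulting frame paths (hypothetical helper)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frame_command(video, out_dir, fps), check=True)
    return sorted_frames(out_dir)
```

For example, `extract_frames("input.mp4", "output_dir")` would run the command above and return the frame paths in chronological order, ready to be embedded one by one.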

mcembalest/cinemagram

Folders and files

Latest commit

History

Repository files navigation

Cinemagram

Cinemantic Trajectories

How this was made

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages