## Group members

- Jaykrishnan Gopalakrishna Pillai
- Filip Danielsson
- Filip Koňařík

## Goal

Apply MLOps practices to develop a model that generates a sequence of frames of someone playing the piano.

## Frameworks

For piano frame generation we will evaluate different generative models, such as denoising diffusion models and GANs (generative adversarial networks). An example of conditional image generation with diffusion models: https://github.com/TeaPearce/Conditional_Diffusion_MNIST
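As a rough illustration of the denoising-diffusion direction, below is a minimal sketch of one conditional DDPM training step in PyTorch. `eps_model` is a hypothetical noise-prediction network taking (noisy frame, timestep, conditioning vector); the schedule values are standard defaults, not project decisions.

```python
import torch
import torch.nn.functional as F

T = 1000  # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, 0)   # cumulative product, \bar{alpha}_t

def diffusion_loss(eps_model, x0, cond):
    """One training step: noise a clean frame x0 and train the model to predict that noise."""
    t = torch.randint(0, T, (x0.shape[0],))                  # random timestep per sample
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)              # broadcast over C, H, W
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise     # forward process q(x_t | x_0)
    return F.mse_loss(eps_model(x_t, t, cond), noise)        # epsilon-prediction loss
```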

## Data

We will extract training data from publicly available videos using https://github.com/uel/BP. The generative model can be conditioned on different types of data extracted by the library, for example hand placement, played notes, key locations, and the previous frame.
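The sketch below shows how such per-frame features could be combined into a single conditioning vector. The field names (`hand_landmarks`, `active_notes`, `key_boxes`) are illustrative assumptions, not the actual API of the extraction library.

```python
import numpy as np

def build_condition(hand_landmarks, active_notes, key_boxes):
    """Concatenate extracted per-frame features into one flat conditioning vector."""
    hands = np.asarray(hand_landmarks, dtype=np.float32).ravel()  # e.g. 2 hands x 21 points x (x, y)
    notes = np.zeros(88, dtype=np.float32)                        # one slot per piano key
    notes[active_notes] = 1.0                                     # mark currently played notes
    keys = np.asarray(key_boxes, dtype=np.float32).ravel()        # detected key bounding boxes
    return np.concatenate([hands, notes, keys])
```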

## Models

No existing model covers this specific task, so we will train a generative model from scratch. The data extraction step uses deep learning models for hand landmarking, keyboard detection, and piano transcription.
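For a sense of what the hand-landmarking step involves, here is a small example using MediaPipe; this is only an illustration of the technique, as the extraction library's actual internals may use a different stack.

```python
import cv2
import mediapipe as mp

# Detect hand landmarks in a single video frame with MediaPipe Hands.
with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
    frame = cv2.imread("frame.png")                               # hypothetical extracted frame
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            for lm in hand.landmark:                              # 21 landmarks per hand
                print(lm.x, lm.y, lm.z)                           # normalized coordinates
```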

## Code coverage

Code coverage: 51%