Conversation

@ztripez
Contributor

@ztripez ztripez commented Apr 27, 2025

Description

This PR introduces a new provider, Music Insights, designed to enhance Music Assistant with features based on audio embeddings and user interaction analysis. It leverages ChromaDB for vector storage and CLAP models (via the transformers library) for generating embeddings.

Current Features (Work-in-Progress):

  • Provider Setup: Basic configuration flow with presets for different hardware capabilities (CPU/GPU).
  • ChromaDB Integration: Sets up a persistent ChromaDB client within the MA data directory.
  • Text Embeddings: Generates text embeddings for tracks based on metadata (genre, artist, title, album, mood).
  • Semantic Search: Allows searching for tracks using natural language queries.
  • Similar Tracks: Finds tracks similar to a given track based on text embedding similarity.
  • User Interaction Tracking: Records basic track playback events (start, progress, scrobble) using a dedicated InsightScrobbler. Data is stored in a separate ChromaDB collection.
  • Library Sync: Automatically updates embeddings when tracks are added, updated, or deleted from the library.
  • Configuration Handling: Rebuilds embeddings if relevant configuration (model name, window size) changes.
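As a rough illustration of the text-embedding step described above: the track's metadata fields are flattened into a single text document before being passed to the CLAP text encoder. This is a sketch only; the field names and separator below are assumptions, not the provider's exact schema.

```python
def build_track_document(track: dict) -> str:
    """Flatten track metadata into one text string for text embedding.

    Field names ("title", "artist", "genres", ...) are illustrative,
    not necessarily what the Music Insights provider uses internally.
    """
    parts = [
        track.get("title", ""),
        track.get("artist", ""),
        track.get("album", ""),
        " ".join(track.get("genres", [])),
        " ".join(track.get("moods", [])),
    ]
    # Drop empty fields so missing metadata does not leave dangling separators.
    return " - ".join(p for p in parts if p)
```

The resulting string would then be fed to the CLAP text encoder (via the transformers library) to produce the vector stored in ChromaDB.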

TODOs:

  • [ ] Audio Embeddings: Currently only text embeddings are generated and used.
  • [ ] Recommendations: The core logic to analyze user interactions and generate personalized recommendations based on embeddings still needs to be implemented.
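For the recommendations TODO, one possible approach (not what the PR implements, just a sketch): average the embeddings of recently played tracks into a "taste" vector, then rank the library by cosine similarity against it. A dependency-free illustration:

```python
from math import sqrt


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def recommend(played: list[list[float]],
              library: dict[str, list[float]],
              k: int = 2) -> list[str]:
    """Rank library track ids by similarity to the mean of played embeddings.

    A real implementation would also exclude already-played tracks and
    weight recent plays more heavily; omitted here for brevity.
    """
    dim = len(played[0])
    taste = [sum(v[i] for v in played) / len(played) for i in range(dim)]
    ranked = sorted(library, key=lambda tid: cosine(taste, library[tid]), reverse=True)
    return ranked[:k]
```

In the actual provider this nearest-neighbor query would presumably go through ChromaDB rather than a Python loop.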

How to Test:

  1. Enable the music_insights provider in the MA settings.
  2. Choose a preset (or configure manually). Note that the first startup might take time to download the embedding model.
  3. Allow the initial embedding process to run (check logs for progress - currently only logs start/finish/errors).
  4. Use the search function with descriptive terms (e.g., "upbeat electronic music", "sad acoustic song").
  5. View a track and check the "Similar Tracks" section.
  6. Play some tracks and observe logs for interaction recording messages (debug level).
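For step 6, the interaction tracking can be pictured with this minimal stand-in for the PR's InsightScrobbler. The event names (start, progress, scrobble) come from the description above; the 50% scrobble threshold and all method/attribute names are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class InsightScrobblerSketch:
    """Illustrative stand-in, not the PR's actual InsightScrobbler."""

    scrobble_pct: float = 0.5  # assumed threshold, not confirmed by the PR
    events: list[tuple[str, str]] = field(default_factory=list)

    def on_start(self, track_id: str) -> None:
        self.events.append((track_id, "start"))

    def on_progress(self, track_id: str, position: float, duration: float) -> None:
        self.events.append((track_id, "progress"))
        # Emit a single "scrobble" event once playback passes the threshold.
        if (duration > 0
                and position / duration >= self.scrobble_pct
                and (track_id, "scrobble") not in self.events):
            self.events.append((track_id, "scrobble"))
```

In the provider, each recorded event would land in the dedicated ChromaDB collection mentioned in the feature list.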

This provider is still under active development, but this initial version lays the foundation for music discovery and recommendation features within Music Assistant.

```python
async def async_init(self) -> None:
    """Asynchronously initialize the embedding models."""
    # Run blocking model setup in a background task using a thread
    self.mass.create_task(asyncio.to_thread(self._setup_models))
```
Member


store the task in a variable if you want to cancel it on unload
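A sketch of that suggestion, with the class, attribute, and method names assumed rather than taken from the PR: keep the task handle so it can be cancelled when the provider unloads.

```python
import asyncio


class MusicInsightsSketch:
    """Minimal illustration of storing the setup task for later cancellation."""

    def __init__(self) -> None:
        self._setup_task: asyncio.Task | None = None

    async def async_init(self) -> None:
        # Store the handle instead of fire-and-forget, so unload() can cancel it.
        self._setup_task = asyncio.create_task(
            asyncio.to_thread(self._setup_models)
        )

    def _setup_models(self) -> None:
        import time
        time.sleep(0.01)  # stand-in for the slow, blocking model download/setup

    async def unload(self) -> None:
        # Cancel the background setup if it is still running on unload.
        if self._setup_task is not None and not self._setup_task.done():
            self._setup_task.cancel()
            try:
                await self._setup_task
            except asyncio.CancelledError:
                pass
```

Note that cancelling a task wrapping `asyncio.to_thread` only takes effect once the worker thread finishes; the blocking call itself is not interrupted.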

Comment on lines +155 to +156
```python
# waveform = None
# sample_rate = None
```
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need help with this part, ping me on Discord. It's relatively easy to get the audio stream in PCM.
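For context on what the commented-out `waveform` / `sample_rate` would hold: audio models like CLAP expect a float waveform, so raw signed 16-bit PCM (e.g. from an ffmpeg decode of the stream) has to be normalized to [-1.0, 1.0] first. A minimal conversion sketch, not taken from the PR:

```python
import struct


def pcm16_to_float(pcm: bytes) -> list[float]:
    """Convert signed 16-bit little-endian PCM bytes to floats in [-1.0, 1.0].

    The resulting list (plus the known sample rate) is the kind of
    waveform input an audio embedding model's processor expects.
    """
    n = len(pcm) // 2
    samples = struct.unpack(f"<{n}h", pcm)
    return [s / 32768.0 for s in samples]
```

In practice a numpy array would be used instead of a list, but the normalization step is the same.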

@OzGav
Contributor

OzGav commented Sep 9, 2025

@ztripez any more progress on this one?

