Transform Spotify listening data and lyrics into collage-style visual artwork for fans and artists.
Audivine converts musical identity into visuals. It ingests Spotify top tracks and lyrics, summarizes the narrative with an LLM, and generates collage-style artwork with a fine-tuned Stable Diffusion XL model. For artists, the same pipeline turns lyrics/metadata into marketing-ready visuals that tell the story of a song or release.
Custom collage-style dataset curated from Unsplash: 105 image–caption pairs. Captions were first generated with BLIP, then refined manually and with GPT-4o-mini.
SDXL (stabilityai/stable-diffusion-xl-base-1.0) fine-tuned via DreamBooth + LoRA; training on Google Colab A100 (~4 hours, ~3000 epochs).
GPT-4o-mini writes image prompts from lyrics and metadata; images are generated via Replicate (fine-tuned SDXL) or the Stability AI API and streamed back to the client.
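As a rough illustration of the prompt step, the sketch below turns an LLM-written narrative summary into a request object holding a positive and a negative prompt. The style suffix, the negative-prompt wording, the `ImageRequest` type, and the `max_chars` limit are all assumptions made for this example; the project's actual prompt text and payload shape are not documented here.

```python
from dataclasses import dataclass

# Hypothetical wording: the project's real style suffix and negative prompt are not given.
STYLE_SUFFIX = "collage-style artwork, layered cut-paper textures"
NEGATIVE_PROMPT = "violence, gore, weapons, text, watermark"


@dataclass(frozen=True)
class ImageRequest:
    """Provider-agnostic description of one image generation call."""
    prompt: str
    negative_prompt: str


def build_image_request(narrative_summary: str, max_chars: int = 400) -> ImageRequest:
    """Combine a narrative summary with the collage style cue."""
    # Collapse newlines and repeated spaces that LLM output often contains.
    summary = " ".join(narrative_summary.split())
    # Keep the prompt short; max_chars is an arbitrary illustrative limit.
    if len(summary) > max_chars:
        summary = summary[:max_chars].rstrip()
    return ImageRequest(
        prompt=f"{summary}, {STYLE_SUFFIX}",
        negative_prompt=NEGATIVE_PROMPT,
    )
```

Keeping the request independent of any one provider would let the same object be mapped onto either Replicate or the Stability AI API by a thin adapter.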
Spotify (top tracks) and LyricsOVH (lyrics) supply the data; a WebSocket backend runs on EC2. Negative prompts steer generation away from violent or otherwise undesired content.
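For orientation, here is a minimal sketch of how the two data-source requests might be addressed. It only builds URLs and makes no network calls. The endpoint paths reflect the public Spotify Web API and Lyrics.ovh API as commonly documented and should be checked against current docs; the function names and default values are assumptions for this example. The Spotify call additionally needs an OAuth bearer token with the `user-top-read` scope, which is not shown.

```python
from urllib.parse import quote, urlencode

# Endpoints as commonly documented; verify against the providers' current references.
SPOTIFY_TOP_TRACKS = "https://api.spotify.com/v1/me/top/tracks"
LYRICS_OVH_BASE = "https://api.lyrics.ovh/v1"


def top_tracks_url(limit: int = 10, time_range: str = "medium_term") -> str:
    """URL for the current user's top tracks (sent with a bearer token)."""
    query = urlencode({"limit": limit, "time_range": time_range})
    return f"{SPOTIFY_TOP_TRACKS}?{query}"


def lyrics_url(artist: str, title: str) -> str:
    """URL for one song's lyrics; path segments are fully percent-encoded."""
    # safe="" also encodes "/" so names like "AC/DC" stay in one path segment.
    return f"{LYRICS_OVH_BASE}/{quote(artist, safe='')}/{quote(title, safe='')}"
```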
Fetch top tracks → fetch lyrics → summarize narratives → generate prompts → create collage-style art.
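The flow above can be sketched as a single function that chains the five stages. Each stage is passed in as a callable so the sketch stays independent of any particular client library; the function name, the `Track` alias, and the decision to skip tracks without lyrics are assumptions for illustration, not the project's actual code.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Track = Tuple[str, str]  # (artist, title)


def run_pipeline(
    fetch_top_tracks: Callable[[], Iterable[Track]],
    fetch_lyrics: Callable[[str, str], Optional[str]],
    summarize: Callable[[List[str]], str],
    make_prompt: Callable[[str], str],
    generate_image: Callable[[str], bytes],
) -> bytes:
    """Top tracks -> lyrics -> narrative summary -> prompt -> image bytes."""
    lyrics: List[str] = []
    for artist, title in fetch_top_tracks():
        text = fetch_lyrics(artist, title)
        if text:  # assumed behavior: skip tracks whose lyrics were not found
            lyrics.append(text)
    if not lyrics:
        raise ValueError("no lyrics available for any top track")
    summary = summarize(lyrics)       # e.g. an LLM call
    prompt = make_prompt(summary)     # e.g. add the collage style cue
    return generate_image(prompt)     # e.g. a hosted SDXL endpoint
```

Injecting the stages this way also makes the flow easy to exercise with stub functions before any API keys are configured.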
Use lyrics + album artwork + form inputs to steer prompts toward marketing-ready visuals.
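One way the form inputs could steer a prompt is sketched below. The `ArtistBrief` fields (`release_title`, `mood`, `palette`) are hypothetical examples of form inputs, and album artwork handling is left out entirely; the real form and prompt template are not described in this summary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtistBrief:
    # All fields are hypothetical examples of what an artist form might collect.
    release_title: str
    mood: Optional[str] = None
    palette: Optional[str] = None


def artist_prompt(lyric_summary: str, brief: ArtistBrief) -> str:
    """Merge a lyric summary with optional form inputs into one prompt string."""
    parts = [
        f"collage-style artwork for '{brief.release_title}'",
        lyric_summary.strip(),
    ]
    if brief.mood:
        parts.append(f"{brief.mood} mood")
    if brief.palette:
        parts.append(f"{brief.palette} color palette")
    # Drop empty fragments so a blank summary does not leave a stray comma.
    return ", ".join(p for p in parts if p)
```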
Fine-tuned SDXL yields noticeably more collage-like, distinctive outputs than the naïve baseline.
10 participants rated style difference between models: avg 4.2/5 (higher = more different).
Baseline: ~5.8 s/image. Fine-tuned on Replicate: ~11 s/image, with the gap attributed to cold starts and minimal optimization.
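To put the two latency figures in perspective, the short calculation below derives the relative slowdown and the implied throughput. It uses only the approximate numbers reported above and assumes images are generated one at a time, which may not match the deployed setup.

```python
BASELINE_S = 5.8     # approximate seconds per image, baseline
FINE_TUNED_S = 11.0  # approximate seconds per image, fine-tuned on Replicate


def slowdown(baseline: float, candidate: float) -> float:
    """How many times longer the candidate takes per image."""
    return candidate / baseline


def images_per_minute(seconds_per_image: float) -> float:
    """Sequential throughput implied by a per-image latency."""
    return 60.0 / seconds_per_image


print(f"slowdown: {slowdown(BASELINE_S, FINE_TUNED_S):.1f}x")
print(f"baseline: {images_per_minute(BASELINE_S):.1f} images/min")
print(f"fine-tuned: {images_per_minute(FINE_TUNED_S):.1f} images/min")
```

On these figures the fine-tuned path takes roughly 1.9 times as long per image, or about 5.5 images per minute against about 10.3 for the baseline.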
Images sourced legally from Unsplash; user listening data is not persisted beyond the session. Generated artwork is intended for personal use or artist promotion, not resale.