We propose Amuse, a songwriting assistant that transforms multimodal (image, text, audio) inputs into
chord progressions that can be seamlessly incorporated into songwriters' creative process.
Abstract
Songwriting is often driven by multimodal inspirations, such as imagery, narratives, or existing music,
yet current music AI systems offer songwriters little support for incorporating these multimodal inputs into their creative processes.
We introduce Amuse, a songwriting assistant that transforms multimodal (image, text, or audio) inputs into chord progressions that can be seamlessly
incorporated into songwriters' creative process. A key feature of Amuse is its novel method for generating coherent chords that
are relevant to music keywords in the absence of datasets with paired examples of multimodal inputs and chords. Specifically, we
propose a method that leverages multimodal LLMs to convert multimodal inputs into noisy chord suggestions and uses a unimodal
chord model to filter the suggestions. A user study with songwriters shows that Amuse effectively supports transforming multimodal
ideas into coherent musical suggestions, enhancing users' agency and creativity throughout the songwriting process.
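The propose-then-filter idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy bigram table stands in for the unimodal chord model, the hard-coded `noisy` list stands in for suggestions returned by a multimodal LLM, and the fixed likelihood threshold is a simplified acceptance rule rather than the exact criterion used in the paper.

```python
import math

# Hypothetical toy bigram "chord model": P(next chord | current chord).
# It stands in for the unimodal chord model; the probabilities are made up.
TOY_BIGRAM = {
    ("C", "G"): 0.4, ("C", "Am"): 0.3, ("C", "F"): 0.3,
    ("G", "Am"): 0.5, ("G", "C"): 0.5,
    ("Am", "F"): 0.7, ("Am", "C"): 0.3,
    ("F", "C"): 0.6, ("F", "G"): 0.4,
}
UNSEEN_PROB = 1e-4  # floor for transitions the toy model has never seen


def avg_log_likelihood(chords):
    """Mean log-probability per chord transition under the toy bigram model."""
    pairs = list(zip(chords, chords[1:]))
    if not pairs:
        return float("-inf")
    total = sum(math.log(TOY_BIGRAM.get(p, UNSEEN_PROB)) for p in pairs)
    return total / len(pairs)


def filter_suggestions(candidates, threshold=-1.5):
    """Keep candidates the chord model finds plausible; reject the rest.

    The threshold is an arbitrary illustrative value (an assumption).
    """
    return [c for c in candidates if avg_log_likelihood(c) >= threshold]


# Hypothetical noisy suggestions, as a multimodal LLM might return them.
noisy = [
    ["C", "G", "Am", "F"],
    ["C", "F#dim", "Bb7", "E"],
    ["Am", "F", "C", "G"],
]
accepted = filter_suggestions(noisy)
```

In this sketch the second candidate is rejected because none of its transitions are known to the toy model, while the other two pass the threshold. A real system would replace the bigram table with a trained chord model and draw fresh candidates until enough are accepted.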
Songs Created by the Participants
We conducted a user study with 10 songwriters to evaluate Amuse's effectiveness in supporting songwriting with multimodal inspirations.
Participants created 8-bar choruses for two prompts: their favorite summer holiday memory and the beginning of an unexpected friendship.
Each participant wrote two songs, one with Amuse's assistance and the other without.
The following sound clips showcase the songs created by the participants.
For more details, please refer to Sections 7-8 of the paper.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Assist (use both Amuse and Aria)
Prompt: Write an 8-bar chorus about your favorite summer holiday memory.
Condition: Baseline (use Aria only)
Prompt: Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Listening Study Audio Samples
We conducted a listening study to assess the musical coherence and keyword relevance of chord progressions generated by our rejection-sampling-based chord generation method.
Participants evaluated pairs of chord progressions, with each pair generated by two of the following methods: LSTM Prior, GPT-4o, and Amuse (Ours).
Selected audio samples used in the listening study are listed here.
Further details can be found in Section 6.2 of the paper.
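As a rough illustration of the pairwise setup described above, the following Python sketch enumerates the possible method pairings and shows one simple way such comparisons could be summarized as per-method win rates. The `win_rates` helper and the example votes are hypothetical; the votes are made up and deliberately balanced, and the paper's actual analysis may differ.

```python
from collections import Counter
from itertools import combinations

METHODS = ["LSTM Prior", "GPT-4o", "Amuse (Ours)"]

# Every unordered pairing of two methods a listener could be asked to compare.
pairings = list(combinations(METHODS, 2))


def win_rates(votes):
    """Summarize pairwise votes as each method's share of wins.

    votes: list of (method_a, method_b, winner) tuples.
    Returns {method: wins / comparisons the method appeared in}.
    """
    wins, appearances = Counter(), Counter()
    for a, b, winner in votes:
        appearances[a] += 1
        appearances[b] += 1
        wins[winner] += 1
    return {m: wins[m] / appearances[m] for m in appearances}


# Made-up, deliberately balanced votes for illustration only;
# these are NOT results from the listening study.
example_votes = [
    ("LSTM Prior", "GPT-4o", "LSTM Prior"),
    ("GPT-4o", "Amuse (Ours)", "GPT-4o"),
    ("LSTM Prior", "Amuse (Ours)", "Amuse (Ours)"),
]
rates = win_rates(example_votes)
```

With three methods there are three possible pairings, so each listener judgment falls into one of them.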
Musical Coherence
Samples 1-5, each with one audio clip per method: LSTM Prior, GPT-4o, Amuse (Ours).
Keyword Relevance
One audio clip per method (LSTM Prior, GPT-4o, Amuse (Ours)) for each of the following keyword sets:
energetic, dance pop, disco
acoustic, folk, country
smooth, jazz, swing
bossa nova, latin jazz, samba
emotional, ballad, sad
Citation
@article{kim2024amuse,
title={Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations},
author={Kim, Yewon and Lee, Sung-Ju and Donahue, Chris},
year={2024},
journal={arXiv preprint arXiv:2412.18940},
}