Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations

Yewon Kim¹, Sung-Ju Lee¹, Chris Donahue²

¹KAIST ²Carnegie Mellon University

We propose Amuse, a songwriting assistant that transforms multimodal (image, text, audio) inputs into chord progressions that can be seamlessly incorporated into songwriters' creative process.

Abstract

Songwriting is often driven by multimodal inspirations, such as imagery, narratives, or existing music, yet songwriters remain unsupported by current music AI systems in incorporating these multimodal inputs into their creative processes. We introduce Amuse, a songwriting assistant that transforms multimodal (image, text, or audio) inputs into chord progressions that can be seamlessly incorporated into songwriters' creative process. A key feature of Amuse is its novel method for generating coherent chords that are relevant to music keywords in the absence of datasets with paired examples of multimodal inputs and chords. Specifically, we propose a method that leverages multimodal LLMs to convert multimodal inputs into noisy chord suggestions and uses a unimodal chord model to filter the suggestions. A user study with songwriters shows that Amuse effectively supports transforming multimodal ideas into coherent musical suggestions, enhancing users' agency and creativity throughout the songwriting process.
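The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the bigram chord model (a stand-in for the unimodal chord model), the candidate progressions (stand-ins for noisy multimodal-LLM output), and the fixed likelihood threshold are all assumptions made for illustration, and the acceptance rule used in the paper may differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for a chord dataset.
CORPUS = [
    ["C", "G", "Am", "F"],
    ["C", "Am", "F", "G"],
    ["F", "G", "C", "Am"],
    ["Am", "F", "C", "G"],
]

def train_bigram(corpus):
    """Count chord-to-chord transitions (stand-in for a unimodal chord model)."""
    pair_counts = Counter()
    prev_counts = Counter()
    vocab = set()
    for prog in corpus:
        vocab.update(prog)
        for a, b in zip(prog, prog[1:]):
            pair_counts[(a, b)] += 1
            prev_counts[a] += 1
    return pair_counts, prev_counts, vocab

def avg_log_likelihood(prog, model, alpha=0.1):
    """Mean log-probability per transition, with add-alpha smoothing."""
    pair_counts, prev_counts, vocab = model
    v = len(vocab) + 1  # one extra slot reserves mass for unseen chords
    total = 0.0
    for a, b in zip(prog, prog[1:]):
        p = (pair_counts[(a, b)] + alpha) / (prev_counts[a] + alpha * v)
        total += math.log(p)
    return total / max(len(prog) - 1, 1)

def filter_suggestions(candidates, model, threshold):
    """Keep only candidates the chord model scores at or above the threshold."""
    return [c for c in candidates if avg_log_likelihood(c, model) >= threshold]

model = train_bigram(CORPUS)
# Hypothetical noisy suggestions, as a multimodal LLM might return them.
candidates = [
    ["C", "G", "Am", "F"],
    ["C", "F#dim", "Bb7", "E"],
]
kept = filter_suggestions(candidates, model, threshold=-1.5)
print(kept)
```

In this sketch the LLM supplies keyword relevance and the chord model supplies coherence, so neither component needs paired multimodal-to-chord training data.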

Songs Created by the Participants

We conducted a user study with 10 songwriters to evaluate Amuse's effectiveness in supporting songwriting with multimodal inspirations. Participants created 8-bar choruses for two prompts: their favorite summer holiday memory and the beginning of an unexpected friendship. Each participant wrote two songs, one with Amuse's assistance and the other without. The following sound clips showcase the songs created by the participants. For more details, please refer to Sections 7-8 of the paper.

Condition	Prompt
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Baseline (use Aria only)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about your favorite summer holiday memory.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Baseline (use Aria only)	Write an 8-bar chorus about your favorite summer holiday memory.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Baseline (use Aria only)	Write an 8-bar chorus about your favorite summer holiday memory.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Assist (use both Amuse and Aria)	Write an 8-bar chorus about your favorite summer holiday memory.
Baseline (use Aria only)	Write an 8-bar chorus about the beginning of an unexpected friendship you had in your life.

Listening Study Audio Samples

We conducted a listening study to assess the musical coherence and keyword relevance of chord progressions generated by our rejection-sampling-based chord generation method. Participants evaluated pairs of chord progressions, with each pair generated by two of the following methods: LSTM Prior, GPT-4o, and Amuse (Ours). The tables below list selected audio samples used in the listening study. Further details can be found in Section 6.2 of the paper.
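As a small illustration of the pairwise design described above, the sketch below enumerates the method pairs a listener could be asked to compare. The function name `comparison_pairs` is hypothetical, and how the study actually ordered or sampled pairs is not specified here.

```python
from itertools import combinations

# The three methods named in the listening study.
METHODS = ["LSTM Prior", "GPT-4o", "Amuse (Ours)"]

def comparison_pairs(methods):
    """Return every unordered pair of distinct methods."""
    return list(combinations(methods, 2))

for first, second in comparison_pairs(METHODS):
    print(f"{first} vs. {second}")
```

With three methods this yields three unordered pairs.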

Musical Coherence

	LSTM Prior	GPT-4o	Amuse (Ours)
1
2
3
4
5

Keyword Relevance

Keywords	LSTM Prior	GPT-4o	Amuse (Ours)
energetic, dance pop, disco
acoustic, folk, country
smooth, jazz, swing
bossa nova, latin jazz, samba
emotional, ballad, sad

Citation

@inproceedings{kim2025amuse,
    author = {Kim, Yewon and Lee, Sung-Ju and Donahue, Chris},
    title = {Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations},
    year = {2025},
    isbn = {9798400713941},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3706598.3713818},
    doi = {10.1145/3706598.3713818},
    booktitle = {Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems},
    articleno = {187},
    numpages = {28},
    keywords = {Creativity Support Tool, Music, Songwriting, Human-AI Interaction, Machine Learning},
    location = {Yokohama, Japan},
    series = {CHI '25}
}