Make Longer Generative AI Music with Stable Audio 2.0

generative music, AI, Stability AI, music

With the release of Stable Audio 2.0, Stability AI promises to enable users to compose generative AI music with more than one movement.

  • Stable Audio 2.0 offers users the capability to generate three-minute-long songs.
  • Despite the extended length, the quality of the generated music still leaves much to be desired.

Stability AI has recently released its Stable Audio 2.0, promising users the ability to create full-length generative AI music.

The software allows users to generate three-minute-long songs, a significant improvement over its predecessor’s 90-second limit. With this update, the team hopes to satisfy users’ desire for longer compositions that match the duration of typical radio-friendly songs. Here’s a demonstration of the AI software’s capabilities, courtesy of Stability AI:

Users can upload their own audio samples, customize their projects using prompts, adjust prompt strength, modify uploaded audio, and even add sound effects, all through what is considered an intuitive interface. Unlike its competitors, Stability AI has made Stable Audio freely accessible to the public, democratizing AI-generated music creation.

However, the very users the company is courting were left unimpressed by the quality and coherence of the generative music, and even the team acknowledges it. Some describe the vocals as resembling whale sounds, pointing to a lack of human-like expression and emotion in the compositions. Stability AI admits there is still progress to be made before AI-generated music sounds authentic and soulful.

I agree with the public consensus. My prompt was pretty simple: “An eerie song about being lost in thought.” The generated piece was… Do you remember the 2010s fad where people believed musicians were hiding subliminal messages in their songs, messages that could only be heard by playing the tracks backward? Yeah, that’s what it sounded like.

Critics speculate that the root of this issue lies in the training data used for the model. In their blog post, the company wrote that “Like the 1.0 model, 2.0 is trained on data from AudioSparx consisting of over 800,000 audio files containing music, sound effects, and single-instrument stems, as well as corresponding text metadata.”

While Stability AI claims that artists had the option to opt out of having their material used for training, concerns remain regarding the diversity and representativeness of the dataset.

To address copyright concerns, Stability AI has partnered with Audible Magic to implement content recognition technology, ensuring that copyrighted material is not inadvertently included in user-generated content. The partnership underscores Stability AI’s stated commitment to ethical AI development and creator rights.


Inside Telecom provides you with an extensive list of content covering all aspects of the tech industry. Keep an eye on our Intelligent Tech sections to stay informed and up-to-date with our daily articles.