AI-generated songs are getting longer

Stability AI's audio generation model, Stable Audio 2.0, now lets users upload their own audio samples and transform them with prompts into AI-generated songs. Don't expect the results to win any music awards just yet, though.

The first version of Stable Audio, launched in September 2023, capped generations at 90 seconds for some paying users, which was enough only for short sound experiments. Stable Audio 2.0 raises that limit to a full three minutes, roughly the length of a typical radio-friendly song. Any audio users upload must be free of copyrighted material.

Unlike OpenAI's Voice Engine, which is available only to a select group of users, Stable Audio is free for the public to use on Stability AI's website and, soon, through its API.

Stability AI says one of the biggest improvements in Stable Audio 2.0 is that it can now create songs with a real structure: an intro, a progression, and an outro.

After experimenting with Stable Audio, it's clear the model needs considerable refinement before it produces anything close to good music. It did generate a track with recognizable Americana elements, but the vocals, which some have likened to whale sounds, leave plenty of room for improvement and even invite the half-serious worry that the model accidentally summoned something.

Stable Audio 2.0 also gives users more ways to customize a project: they can adjust the strength of the prompt and how much the uploaded audio is modified, and they can add sound effects like a roaring crowd or tapping keyboards.
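
For readers curious about what using the forthcoming API might look like, here is a minimal sketch of a request that combines an uploaded sample with a prompt, a prompt-strength setting, and a modification level. It is a hypothetical illustration only: the endpoint URL, the field names (prompt, prompt_strength, audio_strength, duration_seconds), and the bearer-token authentication are assumptions for the sake of example, not a published interface.

```python
# Hypothetical sketch only: the endpoint, field names, and auth scheme below
# are assumptions for illustration, not Stability AI's documented API.
import requests

API_URL = "https://api.stability.ai/v2/audio/generate"  # assumed endpoint
API_KEY = "YOUR_API_KEY"                                 # assumed bearer-token auth

with open("my_riff.wav", "rb") as audio_file:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"audio": audio_file},  # the user-uploaded sample to transform
        data={
            "prompt": "laid-back Americana with slide guitar",
            "prompt_strength": 0.7,   # assumed name: how strongly the prompt steers the output
            "audio_strength": 0.5,    # assumed name: how much the uploaded audio is modified
            "duration_seconds": 180,  # up to the new three-minute limit
        },
        timeout=120,
    )

response.raise_for_status()
with open("generated_song.mp3", "wb") as out:
    out.write(response.content)  # save the returned audio
```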

Even with these advances, AI-generated songs still feel hollow and a little strange. Colleagues have said the same after listening to similar efforts from companies like Meta and Google, which are also wrestling with the problem of soulless soundscapes.

Stability AI says Stable Audio was trained on data from AudioSparx, a library of more than 800,000 audio files, and that artists on AudioSparx were able to opt out of having their material used to train the model. Concerns about training on copyrighted content led Ed Newton-Rex, Stability AI's former VP for audio, to leave the company. For this release, Stability AI is working with Audible Magic, using its content recognition technology to keep copyrighted material off the platform.

Stable Audio 2.0 does get closer to producing something that resembles a conventional song, but there is still room to grow. Perhaps a future version will render vocals in a more discernible language and inch closer to sounding like real music.
