The Reality of Copyright Challenged by Gemini's New Feature: Mass-Producing "Authentic-Sounding Music" in 30 Seconds

"Transforming the 'vibe' you envision directly into sound." This experience is finally becoming a standard feature in chat apps.

Google has integrated Google DeepMind's music generation model "Lyria 3" into the Gemini app, allowing users to generate 30-second music tracks from text and images. The aim is not so much the "automatic creation of masterpieces" but rather a light and easily shareable entry point for adding background music to everyday messages and memories.


What's now possible: Instantly generating 30 seconds of "approximate sound"

There are three key points in this update.
The first is that "composition" happens entirely within the Gemini interface. There's no need to switch to another app: you simply open music generation from the tool menu, enter a prompt, and receive a 30-second track.

The second point is that inputs are not limited to "text only." In addition to specifying "genre," "mood," and "tempo" through text, there is also a pathway to create music based on photos or videos. For example, you can provide a photo of a dog during a hike and create a song with lyrics that match that atmosphere.


The third point is that lyrics and sharing come bundled in a single package. Lyria 3 can generate lyrics automatically without user input, and the finished song comes with cover art for sharing. Google positions this as a "fun and unique way to express oneself easily."


The service is available for those aged 18 and over and is offered in multiple languages, including Japanese. It will first be available on desktop and then gradually expand to mobile.


Will the "AI-generated feel" disappear? Lyria 3 emphasizes "realism" and "control"

Google highlights the ability to create "more realistic and complex music." Improvements include easier control over elements like style, vocals, and tempo, in addition to automatic lyric generation. In other words, the aim is less a lottery-like "gacha" and more output that aligns closely with the image you have in mind.


However, output is currently limited to 30 seconds. On social media, opinions are divided between "short, but sufficient for short videos and memes" and "the brevity might accelerate mass production." In the era of short videos, music often holds its value in the first few seconds rather than the full length. The 30-second format seems designed to target precisely that.


Impact on YouTube Shorts: Will pre-made BGM become the norm?

Google is also bringing Lyria 3 to YouTube's Dream Track to support the creation of soundtracks for short videos. If short AI music becomes "the final piece of video editing," it will undoubtedly change creators' production workflows.


Here, the "speed of generation" and "low language barrier" come into play. Trying multiple BGM options to match the tempo of a video is usually time-consuming, but if candidates can be generated just by conveying the mood in a chat, the number of trials will increase even outside professional settings. On social media, there are positive posts from a marketing and prototyping perspective, noting the ability to quickly create rough drafts.


The most contentious issue: Copyright and "What is the training data?"

Whenever AI music is discussed, copyright and training data inevitably come up. Google explicitly states that the aim is "not to mimic existing artists but for original expression," and explains that even if a specific artist's name is input, it will be interpreted as "vibe or mood." They also mention filters to check for similarities with existing content and a contact point for reporting rights violations.


On the other hand, outside media and industry observers point out that the details of the training sources have not been disclosed. Given the history of lawsuits and conflicts surrounding AI music, how much transparency Google provides will likely shape public perception.


Identifying "AI-created sound": The significance of SynthID and detection features

Another important aspect is "SynthID," which embeds identification information in generated music. Tracks created with Gemini are watermarked, and Gemini will expand its functionality to verify whether the audio was created by Google's AI. This extends detection capabilities from images and videos to audio.
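The details of how SynthID embeds its watermark have not been made public, but the general idea of an inaudible, key-based watermark can be illustrated with a classic spread-spectrum sketch. The code below is purely a conceptual toy, not SynthID's actual method: `embed_watermark` and `detect_watermark` are hypothetical names, and the "audio" is just random samples.

```python
import numpy as np

def embed_watermark(audio, key, strength=0.01):
    # Spread-spectrum toy: add a low-amplitude pseudo-random
    # signature derived from a secret key.
    rng = np.random.default_rng(key)
    signature = rng.standard_normal(len(audio))
    return audio + strength * signature

def detect_watermark(audio, key, threshold=0.003):
    # Correlate the audio with the same key-derived signature;
    # a high correlation suggests the watermark is present.
    rng = np.random.default_rng(key)
    signature = rng.standard_normal(len(audio))
    score = np.dot(audio, signature) / len(audio)
    return score > threshold

rng = np.random.default_rng(0)
clean = 0.1 * rng.standard_normal(48_000)   # one second of noise-like "audio"
marked = embed_watermark(clean, key=42)

print(detect_watermark(marked, key=42))  # True
print(detect_watermark(clean, key=42))   # False
```

Real systems like SynthID are far more sophisticated (robust to compression, pitch shifts, and re-recording), but the principle is the same: the mark is statistically detectable with the right key while remaining imperceptible to listeners.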


Reactions on social media are divided on this point. Supporters see it as reassuring, saying, "Labeling provides peace of mind," and "At least it prevents 'pretending to be human-made.'" Skeptics, however, question whether the watermark can be bypassed and argue that detection needs to become a general standard to be meaningful. There are also strong concerns that the nature of "30-second mass production" could become a hotbed for streaming fraud and content scams.


Social media reactions: Enthusiasm and apprehension grow simultaneously

What makes this topic symbolic is that "this looks fun!" and "this is scary" are trending at the same time.


Positive side (playfulness, expression, time-saving)

  • The idea of adding BGM to everyday events is intuitive and meme-friendly. As an example, Google demonstrates that even a playful theme like an R&B song about a sock's love can work.

  • From marketing and planning fields, the advantage of quickly creating rough audio drafts is highlighted, with the value being seen not as a "complete replacement" for professional use but as "prototyping."


Concerns (misuse, copyright, labeling)

  • In Reddit's AI music community, AI as a creative aid is broadly accepted, but many voices stress that the real problem is mass-producing tracks and passing them off as human-made work, or using them to deceive people for profit.

  • Industry media express concerns about the lack of transparency in training data, and there's a sentiment that merely advocating "responsible development" is not convincing enough.


Ultimately, the dividing line in reactions comes down to "who is this feature for?" As a tool for personal play and expression, it tends to be welcomed. But the moment it touches monetization on streaming platforms or enters existing music distribution channels, questions of rights, labeling, and misuse quickly become social problems.


What might happen next: Will music shift from a "creation" to a "generated mood"?

The integration of Lyria 3 symbolizes a shift from music being something "created and completed" to something "generated as needed." Short video BGMs, presentation jingles, personal anniversary soundtracks—such "sounds that only need value at the moment of consumption" are well-suited to generative AI.


On the other hand, if discussions on rights and transparency don't catch up, convenience could become fuel for backlash. Google's emphasis on SynthID and detection features likely reflects an awareness of this potential issue.


The 30-second "approximate music" could become either a trivial amusement or a major industrial clash. What Gemini's new feature truly questions may not be the future of music but rather "how much we can rewrite the rules of creation and distribution."



Source URL