Donmai

Alias ai-synthesized speech -> text-to-speech


BUR #28838 has been rejected.

create alias ai-synthesized_speech -> text-to-speech

I think TTS is the name with more precedent. Calling TTS "AI" is a rather new thing, and the term is generally used for the more modern, "realistic" methods, yet someone tagged it on post #7456137, which uses... Hatsune Miku.

I disagree with aliasing it to text-to-speech. While AI-powered text-to-speech software like ElevenLabs exists, there are also tools like so-vits-svc and RVC, which convert speech to speech. That is likely the software used for the linked video, with the Hatsune Miku clip as its base audio and a model trained on Aris' voice. It is also the software used for post #7851616 (which quite literally isn't text-to-speech, since she is singing over the original track).

EDIT: If we really need to alias this I think something like "synthesized speech" without the "AI" would work much better.


Narehate said:

If we really need to alias this I think something like "synthesized speech" without the "AI" would work much better.

I disagree with both the BUR and this suggestion. The tag was originally called synthesized_speech but I disambiguated it to the current name it has now because "synthesized" speech would include stuff made using Vocaloid, when the tag is clearly meant for speech generated using AI-powered speech-to-speech software. Having the "ai-" prefix makes that purpose explicit. If anything else has been tagged with it, the tag should be removed from those posts.

Damian0358 said:

I disagree with both the BUR and this suggestion. The tag was originally called synthesized_speech but I disambiguated it to the current name it has now because "synthesized" speech would include stuff made using Vocaloid, when the tag is clearly meant for speech generated using AI-powered speech-to-speech software.

I know what the tag is for. I'm just saying that if we really need to alias it to create an umbrella tag for synthesized speech, that's one option. But I do agree that it's not a good idea, especially considering how easy it would be to mistag posts with it.
