SynthV Wiki
This article is still a work in progress

This article is slowly being improved for the benefit of all visitors. Please bear with us while improvements are being made. We apologise for the inconvenience this may cause in the meantime.
(What's being worked on: About; Recording Scripts; etc.)

(Inspired by the VW page of the same name. Synthesizer V Studio has no built-in phonetic guide, and neither Dreamtonics nor Eclipsed Sounds has published one, so it seemed better to have this page.)

About

(We can talk about the language, the history, and the sheer diversity of the language) The Spanish language has only 5 vowel sounds and 18 consonants.[1] It also has 29 possible allophones and 841 theoretically possible combinations, of which only 521 are required to cover more than 99.99% of the occurrences within the language.[2]
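
The combination figures above can be checked with a little arithmetic. The sketch below assumes that a "combination" means an ordered pair of allophones; the source does not define the term, so this is only an interpretation. The numbers 29 and 521 are taken from the text.

```python
# Figures quoted in the paragraph above.
ALLOPHONES = 29
NEEDED_PAIRS = 521

# Assumption: a "combination" is an ordered pair of allophones.
possible_pairs = ALLOPHONES ** 2
share_needed = NEEDED_PAIRS / possible_pairs

print(possible_pairs)         # 841
print(f"{share_needed:.0%}")  # 62%
```

Under that reading, roughly three fifths of the theoretical pairs are enough to cover almost everything that actually occurs in the language.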

Synthesizer V and the Spanish language

(We can talk about the spelling, usage, and modifications of X-SAMPA, and other technical stuff) Research and development of Spanish language support for Synthesizer V Studio was conducted by Eclipsed Sounds, LLC with guidance from Dreamtonics Co., Ltd.[3][4]

Spanish recording scripts

(Yet to be known. The only certainty is that this will apply only to AI voice databases because, unlike Japanese, English, and Mandarin Chinese, there will be no Standard voice databases for this language.)

Notes on Accent

Despite the general belief that singers completely lose their accents when they sing, this is not always the case: an accent can still be heard in sung vocals.

However, many are led to believe this because there are several methods of training singers to disguise or otherwise hide their natural accents; they may even adopt an accent that isn't their own for singing. Examples include genres such as western or country, and Black music genres such as jazz or soul. Singing also uses different muscles from speech, which changes the air pressure and the way the throat moves. Genres such as opera are the most likely to make an accent appear almost entirely absent, thanks to the impact of the opera vibrato.[5][6]

Synthesizer V can capture an accent quite easily at times. This depends on the recording method used by the voice provider, the type of sound being recorded per sample (accent impact varies per sample and language) and, in the case of Standard voice databases, the overall number of samples that make up the voice database (the more samples, the greater the chance of an accent slipping in). (The type of voice database produced apparently also affects the accent: either "standard" concatenative, made by "chanting" scripts that contain almost all possible phoneme transitions at particular pitches and tempos, or AI-based, made by actually singing.)

(Also notable is the fact that Synthesizer V uses X-SAMPA for Spanish, albeit modified, which forces an American accent on the voice provider.)

Spanish is a language that can be affected by accents at times. For example, SAROS, the first Synthesizer V AI voice database to demonstrate Spanish support via Cross-Lingual Synthesis, was noted to sing with a northeast American accent,[7] with a few phonetic traits characteristic of northern Mexican dialects. However, while it is noteworthy that accents can exist in Spanish voice databases, this is not currently considered as problematic as it is with other languages. Generally, most Spanish speakers seem able to understand Spanish vocals well enough to use them with little trouble.

Native accented

American-Spanish Accented

  • SAROS: Confirmed to have been recorded with Spanish training material to enhance Cross-Lingual Singing Synthesis capabilities, and considered a bilingual vocalist in English and Spanish according to Eclipsed Sounds' character profile and product description.[8] SAROS's undisclosed voice provider, who grew up speaking Spanish at home, currently speaks Spanish with a northeast American accent.[7]

Latin-American-Spanish Accented

  • NOA: His Spanish demo from August 10, 2024 suggests he may have been recorded with Spanish training material to enhance Cross-Lingual Singing Synthesis capabilities. There is currently no official information regarding the Spanish proficiency of his voice provider, Ashe, or, if he is a native Spanish speaker, his accent. Tomo, who tuned the Spanish demo song, stated that phoneme editing was used to achieve some phonetic traits characteristic of Rioplatense Spanish (such as omitting or fully aspirating syllable-final S), as the original song the demo covered was by a Chilean artist.[9]

European-Spanish Accented

Currently, no native European-Spanish accented voice databases are known to exist.

Non-native accented

Synthesizer V Studio version 1.5.0 introduced the Cross-Lingual Singing Synthesis feature, which allows all AI voice databases to sing in all languages supported by the software, regardless of the language(s) in which they were recorded. Spanish was made officially available as an experimental feature with the release of Synthesizer V Studio version 1.11.0b1.[3][4][10] All voice databases recorded in languages other than Spanish are considered non-native accented, due to a deliberate decision to leave subtle accents in place.[11]

Custom Dictionaries

A Spanish pronunciation dictionary, using Japanese as the base language, was originally made available for SOLARIA and ASTERIAN, but it was announced that it would be discontinued upon the formal release of the Synthesizer V Studio update adding Spanish support.[3]

Phonetic System's Characteristics

Vowels

Glides

Consonants

Weak Allophones

Lenition, or Weakening, is a kind of sound change that alters the consonants, making them "softer" in some way. Lenition occurs especially often intervocalically (between vowels). In this position, lenition can be seen as a type of assimilation of the consonant to the surrounding vowels, in which features of the consonant that are not present in the surrounding vowels (e.g. obstruction, voicelessness) are gradually eliminated.

In Spanish, lenition has been an important phenomenon since the evolution from Latin, and it continues to affect some consonants, particularly the voiced plosives /b/, /d/, and /g/. Between vowels, these are realized as "softer" voiced fricative or approximant allophones.

voiced stop → continuant (fricative) ↔ approximant (spirant)
[b] voiced bilabial plosive → [β] voiced bilabial fricative ↔ [β̞] bilabial approximant
[d̪] voiced dental plosive → [ð] voiced dental fricative ↔ [ð̞] dental approximant
[g] voiced velar plosive → [ɣ] voiced velar fricative ↔ [ɣ˕] velar approximant

The "harsher" plosives generally appear at the beginning of words, after a nasal consonant like [m] or [n], and after a pause, while their "softer" allophones appear in all other contexts, especially between vowels.

As with the aspirated allophones of English, both versions can be interchanged without altering the meaning of the word; they vary only with the degree of stress and emphasis. Slow speech tends to favor the "harsher" plosives while fast speech tends to favor their "softer" allophones, because the former has more pauses and silences, which allow the plosives to be fully articulated, while the latter does not.[12]
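
The distribution described above can be sketched as a small function. This is only an illustration, not how Synthesizer V Studio itself chooses allophones. It uses the symbols from the Phonetic List on this page ([b]/[B] and [d]/[D]); that list shows no separate symbol for lenited /g/, so [g] is left unchanged. The input format (a list of words, each a list of symbols) and the names `lenite`, `LENITED` and `NASALS` are assumptions made for the example.

```python
# Plosive -> lenited symbol, following the Phonetic List on this page.
LENITED = {"b": "B", "d": "D"}
# Nasal symbols after which the plosive is kept.
NASALS = {"m", "n", "N", "J"}


def lenite(words):
    """Soften /b/ and /d/ wherever the plosive is not required.

    `words` is a list of words, each a list of phoneme symbols
    (a hypothetical input format chosen for this sketch).
    """
    result = []
    for word in words:
        new_word = []
        for index, symbol in enumerate(word):
            previous = word[index - 1] if index > 0 else None
            keeps_plosive = (
                previous is None                        # beginning of the word
                or previous in NASALS                   # after a nasal consonant
                or (symbol == "d" and previous == "l")  # /d/ after /l/, per the list
            )
            if symbol in LENITED and not keeps_plosive:
                new_word.append(LENITED[symbol])
            else:
                new_word.append(symbol)
        result.append(new_word)
    return result


print(lenite([["d", "e", "d", "o"]]))  # [['d', 'e', 'D', 'o']]
```

Because both variants are interchangeable, the output is best treated as a starting point that can be overridden by hand for slow or emphatic passages.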

Rhotic Consonants

The Spanish language is one of the few Indo-European languages with a clear distinction between the rhotic consonants /ɾ/, the alveolar tap (the "flapped D" of American English, known as "ere" in Spanish), and /r/, the alveolar trill (the rolled R, known as "erre" in Spanish).

The alveolar trill and the alveolar tap are in phonemic contrast word-internally between vowels but are otherwise in complementary distribution. In Spanish orthography, an alveolar trill between vowels is written with a double R ('rr'), while a single R between vowels is always an alveolar tap.
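
As a rough illustration of this distribution, the sketch below picks the Synthesizer V symbol ([rr] for the trill, [r] for the tap) for each R in a written word. The trill contexts come from the text above and from the [rr] entry of the Phonetic List (word-initial, after a nasal, /l/, /s/ or /θ/, or a written double R). The function name `classify_rhotics` and the letter-based approach are assumptions for the example; syllable-final R, which can go either way, is simply treated as a tap here.

```python
# Letters after which a single written R is trilled, per the Phonetic List
# (nasals, /l/, /s/, and "z", which stands for /θ/ or /s/ depending on dialect).
TRILL_TRIGGERS = set("nmlsz")


def classify_rhotics(word):
    """Return "rr" (trill) or "r" (tap) for each R in `word`, left to right."""
    word = word.lower()
    symbols = []
    i = 0
    while i < len(word):
        if word[i] != "r":
            i += 1
            continue
        if word.startswith("rr", i):
            symbols.append("rr")  # written double R: always a trill
            i += 2
        elif i == 0 or word[i - 1] in TRILL_TRIGGERS:
            symbols.append("rr")  # word-initial, or after n/m/l/s/z
            i += 1
        else:
            symbols.append("r")   # everything else is treated as a tap
            i += 1
    return symbols


print(classify_rhotics("caro"))       # ['r']
print(classify_rhotics("carro"))      # ['rr']
print(classify_rhotics("alrededor"))  # ['rr', 'r']
```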

Techniques

Phoneme Replacement

Spanish consonants contrast clearly at the beginning of a syllable. At the end of a syllable (coda position), however, the contrast between some consonants is much weaker, making them prone to assimilation or merging. Knowing this, it is possible to replace some phonemes with their respective allophones, changing the stress and pronunciation without altering the meaning of the word.

Voicing Assimilation

Nasal Assimilation

In syllable-final position, nasal consonants tend to assimilate to the place of articulation of the following consonant, even across a word boundary. Knowing this, it is possible to replace a nasal consonant with another that is more appropriate for its context.

Examples:
  • For the word Chancho ('Pig'), the input in the Synthesizer V Studio editor may be [ch a J][ch o] instead of [ch a n][ch o], because the /n/ should be palatalized in that context due to the influence of the following /tʃ/.
  • In the phrase Corazón Confundido ('Confused Heart'), it's possible to replace the [n] at the end of the first word with its velar counterpart [N] if the context allows the assimilation of the nasal consonant.
    [k o][r a][s o n][k o n][f u n][D i][D o] → [k o][r a][s o N][k o n][f u n][D i][D o]
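
The two examples above can be reproduced with a short sketch. It reads the bracketed syllable notation used on this page and replaces a syllable-final [n] according to the first symbol of the next syllable. The trigger sets are taken from the nasal entries of the Phonetic List, limited to symbols that appear there. The helper names `parse`, `render` and `assimilate_nasals` are made up for this example and are not part of Synthesizer V Studio.

```python
import re

# Following symbol -> replacement for a syllable-final [n].
ASSIMILATED_NASAL = {}
for symbol in ("b", "B", "p", "m"):     # labials: use [m]
    ASSIMILATED_NASAL[symbol] = "m"
for symbol in ("k", "g", "x"):          # velars: use [N]
    ASSIMILATED_NASAL[symbol] = "N"
for symbol in ("ch", "sh", "ll", "y"):  # palatals and postalveolars: use [J]
    ASSIMILATED_NASAL[symbol] = "J"


def parse(notation):
    """Turn "[ch a n][ch o]" into [["ch", "a", "n"], ["ch", "o"]]."""
    return [group.split() for group in re.findall(r"\[([^\]]*)\]", notation)]


def render(syllables):
    """Inverse of parse()."""
    return "".join("[" + " ".join(syllable) + "]" for syllable in syllables)


def assimilate_nasals(syllables):
    """Replace each syllable-final [n] to match the consonant that follows it."""
    result = [list(syllable) for syllable in syllables]
    for current, following in zip(result, result[1:]):
        if current and following and current[-1] == "n":
            current[-1] = ASSIMILATED_NASAL.get(following[0], "n")
    return result


print(render(assimilate_nasals(parse("[ch a n][ch o]"))))  # [ch a J][ch o]
```

Word boundaries are ignored on purpose, since the assimilation can apply across them; whether to keep a given replacement is still a matter of taste and context.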

Realization of the R

In coda (syllable-final) position, the contrast between the two Spanish R sounds is neutralized, which means the R can be realized as either a tap or a trill.

According to Eclipsed Sounds' Spanish Implementation Survey, the /r/ phoneme (realized as [rr]) requires a shaky pitch curve in order to sound correct, which is not always correctly generated in Synthesizer V Studio beta version 1.11.0b1. ES instructs users to add this "vibrato" pitch over the phoneme for the correct sound, or use the AI Retakes function until it sounds correct. ES will be focusing on improving this over time, but it should naturally improve as Synthesizer V Studio receives more Spanish language data in the future.[7]

Phonetic List

| Symbol | Classification | IPA symbol / name | Sample | Notes | Related phonemes |
|---|---|---|---|---|---|
| [a] | vowel | ä open central unrounded vowel | padre | | |
| [e] | vowel | e̞ mid front unrounded vowel | enero | | [i] (lowered) |
| [i] | vowel | i close front unrounded vowel | finca, mío | | [I] (glide/non-syllabic) |
| [o] | vowel | o̞ mid back rounded vowel | foco, oído | | [u] (lowered) |
| [u] | vowel | u close back rounded vowel | musa, dúo | | [U] (glide/non-syllabic) |
| [U] | semivowel | w~u̯ voiced labio-velar approximant | huevo, buitre, güiro, pausa, neutro | Used in rising diphthongs (glide+vowel) as well as falling diphthongs (vowel+glide). | [u] (syllabic) |
| [I] | semivowel | j~i̯ palatal approximant | amplio, ciudad, aire, muy | Used in rising diphthongs (glide+vowel) as well as falling diphthongs (vowel+glide). | [i] (syllabic) |
| [y] | semivowel | ʎ palatal lateral approximant | European & Andean dialects: llave, pollo | Encountered in European Spanish and some dialects along the Andes mountain range (inland Peru and the Colombian highlands, for example). | [ll] (yeísmo); [I] (delateralized) |
| [ll] | semivowel | ʝ voiced palatal fricative | ayuno; most Latin American dialects: llave, pollo | Also an allophone of /ʎ/ in most Latin American Spanish dialects (especially in Chile), due to a linguistic phenomenon called yeísmo. In the Rioplatense Spanish dialect (Argentina and Uruguay), it is realized instead as either /ʒ/ or /ʃ/ (zheísmo and sheísmo, respectively). Spanish as spoken in the United States and Puerto Rico may (but does not always), due to the influence of American English, affricate this further to /ʤ/. | [y] (lleísmo); [sh] (sheísmo) |
| [b] | consonant | b voiced bilabial plosive | bestia, embuste, vaca, envidia | At the beginning of the word, after a pause, or after a nasal consonant. | [p] (voiceless); [B] (lenited) |
| [B] | consonant | β~β̞ bilabial spirant | bebé, obtuso, vivir, curva | Lenited /b/. In the middle of a word, in all the cases where /b/ isn't used. | [b] (fortition) |
| [d] | consonant | d̪ voiced dental plosive | dedo, cuando, aldaba | At the beginning of the word, after a pause, after a nasal consonant, or after /l/. | [t] (voiceless); [D] (lenited) |
| [D] | consonant | ð~ð̞ dental spirant | dedo, arder, admirar | Lenited /d/. In the middle of a word, in all the cases where /d/ isn't used. | [d] (fortition) |
| [g] | consonant | ɡ voiced velar plosive | gato, lengua, guerra | At the beginning of the word, after a pause, or after a nasal consonant. | [k] (voiceless) |
| [k] | consonant | k voiceless velar plosive | caña, quise, kilo | | [g] (voiced) |
| [p] | consonant | p voiceless bilabial plosive | perro, apto | | [b] (voiced) |
| [t] | consonant | t̪ voiceless dental plosive | tuyo, traba | | [d] (voiced) |
| [l] | consonant | l alveolar lateral approximant | lana, principal | | |
| [rr] | consonant | r alveolar trill | rumbo, carro, honra, alrededor, disruptivo, Azrael | At the beginning of the word or after a nasal consonant, /l/, /s/ or /θ/. Between vowels only if it is written as a double R. | [r] (lenited) |
| [r] | consonant | ɾ alveolar tap | caro, bravo, Amor eterno | | [rr] (trilled) |
| [m] | consonant | m bilabial nasal | mamá, campo, invertir | Also an allophone of /n/ in front of labial consonants. | [n] (delabialized) |
| [n] | consonant | n alveolar nasal | nido, sin | Contains various allophones: /n/ at the beginning of a word or after a pause; /ɲ/ or /nʲ/ before palatals like /ʎ/ or /ʝ/, or before postalveolars like /ʧ/, /ʃ/, /ʒ/, or /ʤ/; /ŋ/ before velars like /x/, /k/, /g/, or /ɣ/; /n̪/ before dentals like /d̪/, /ð/, or /t̪/. | [m] (labialized); [N] (velarized); [J] (palatalized) |
| [N] | consonant | ŋ velar nasal | ángulo, palanca; European and northern Mexican dialects: manjar | Syllable-final allophone of /n/ in front of velars like /x/, /k/, /g/, or /ɣ/. | [n] (develarized) |
| [J] | consonant | ɲ palatal nasal | ñandú, enyesar | Also an allophone of /n/ in front of palatals like /ʎ/ or /ʝ/, or before postalveolars like /ʧ/, /ʃ/, /ʒ/, or /ʤ/. | [n] (depalatalized) |
| [f] | consonant | f voiceless labiodental fricative | fase, café | | |
| [s] | consonant | s voiceless alveolar sibilant | casa, xilófono; American and Latin American dialects only: cerro, cima, zumo, paz | | [C] (ceseo; dentalized or lisped) |
| [C] | consonant | θ voiceless dental fricative | European dialects only: cerro, cima, zumo, paz | Used only in European Spanish; unused in American or Latin American Spanish in favor of /s/ (seseo). To enable this in Synthesizer V Studio by default, the "Use European pronunciation" checkbox must be ticked after picking Spanish as the singing language. | [D] (voiced); [t] (th-stopping); [f] (th-fronting); [s] (seseo or th-alveolarization) |
| [sh] | consonant | ʃ voiceless postalveolar sibilant | Xela, shopping (English loanword), Wáshington | Deaffricated variation of /ʧ/ in some dialects. Allophone of /ʝ/ and /ʎ/ in Rioplatense dialects (especially among the youth in Buenos Aires, Argentina), due to sheísmo. | [ch] (affricated) |
| [ch] | consonant | ʧ voiceless postalveolar affricate | chancho | | [t] (deaffricated) |
| [x] | consonant | x voiceless velar fricative | jamón, reloj, género, México | | |
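
For quick reference, the list above can be turned into a lookup table that converts the bracketed notation used on this page into an approximate IPA transcription. This is only a convenience sketch: where the list gives two variants for a symbol (for example w~u̯), the first one is used, and the names `SYMBOL_TO_IPA` and `to_ipa` are invented for the example.

```python
import re

# Synthesizer V symbol -> IPA symbol, copied from the Phonetic List above.
# Where the list gives two variants, only the first one is kept.
SYMBOL_TO_IPA = {
    "a": "ä", "e": "e̞", "i": "i", "o": "o̞", "u": "u",
    "U": "w", "I": "j", "y": "ʎ", "ll": "ʝ",
    "b": "b", "B": "β", "d": "d̪", "D": "ð", "g": "ɡ",
    "k": "k", "p": "p", "t": "t̪", "l": "l", "rr": "r", "r": "ɾ",
    "m": "m", "n": "n", "N": "ŋ", "J": "ɲ",
    "f": "f", "s": "s", "C": "θ", "sh": "ʃ", "ch": "ʧ", "x": "x",
}


def to_ipa(notation):
    """Convert e.g. "[s i n]" to IPA, separating syllables with dots.

    Raises KeyError for any symbol that is not in the list above.
    """
    syllables = re.findall(r"\[([^\]]*)\]", notation)
    return ".".join(
        "".join(SYMBOL_TO_IPA[symbol] for symbol in syllable.split())
        for syllable in syllables
    )


print(to_ipa("[ch a J][ch o]"))
```

Unknown symbols raise an error rather than passing through silently, which makes typos in a phoneme string easier to spot.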

References