SSMD Syntax Reference ===================== This page provides a complete reference for SSMD markup syntax. Text and Emphasis ----------------- SSMD supports all four SSML emphasis levels for fine-grained control over speech emphasis. Moderate Emphasis ~~~~~~~~~~~~~~~~~ Use single asterisks for moderate (default) emphasis: .. code-block:: python ssmd.to_ssml("This is *important*") # → This is important Strong Emphasis ~~~~~~~~~~~~~~~ Use double asterisks for strong emphasis: .. code-block:: python ssmd.to_ssml("This is **very important**") # → This is very important Reduced Emphasis ~~~~~~~~~~~~~~~~ Use single underscores for reduced (subtle) emphasis: .. code-block:: python ssmd.to_ssml("This is _less important_") # → This is less important No Emphasis ~~~~~~~~~~~ Use explicit annotation syntax for no emphasis (rarely used): .. code-block:: python ssmd.to_ssml('[monotone reading]{emphasis="none"}') # → monotone reading .. note:: The "none" emphasis level is rarely needed in practice. It explicitly instructs the TTS engine to speak without any emphasis, which can be useful for robotic or monotone speech effects. Breaks and Pauses ----------------- Time-Based Breaks ~~~~~~~~~~~~~~~~~ Specify duration in milliseconds or seconds using `...` followed by a time value: .. code-block:: python ssmd.to_ssml("Wait ...500ms please") # → Wait please ssmd.to_ssml("Wait ...2s please") # → Wait please .. note:: Bare `...` (without a time or strength code) is NOT treated as a break. It will be preserved as literal ellipsis in your text. Strength-Based Breaks ~~~~~~~~~~~~~~~~~~~~~~ Use strength codes for semantic pauses: .. code-block:: python ssmd.to_ssml("Hello ...n world") # none ssmd.to_ssml("Hello ...w world") # weak (x-weak) ssmd.to_ssml("Hello ...c world") # comma (medium) ssmd.to_ssml("Hello ...s world") # sentence (strong) ssmd.to_ssml("Hello ...p world") # paragraph (x-strong) Strength codes: * ``n`` - none * ``w`` - weak (x-weak) * ``c`` - comma (medium) * ``s`` - sentence (strong) * ``p`` - paragraph (x-strong) Paragraphs ---------- Blank lines separate paragraphs: .. code-block:: python text = """ This is the first paragraph. Still in first paragraph. This is the second paragraph. """ ssmd.to_ssml(text) # → This is the first paragraph. # Still in first paragraph. # This is the second paragraph. Headings -------- Use hash marks for headings (configurable): .. code-block:: python from ssmd import Document text = """ # Main Title Content here. ## Subtitle More content. """ doc = Document(text, config={ 'heading_levels': { 1: [('emphasis', 'strong'), ('pause', '500ms')], 2: [('emphasis', 'moderate')] } }) ssml = doc.to_ssml() Annotations ----------- Annotations use the format ``[text]{key="value"}`` where annotations can be: Language Codes ~~~~~~~~~~~~~~ Specify language with ISO codes: .. code-block:: python # Auto-complete to full locale ssmd.to_ssml('[Bonjour]{lang="fr"}') # → Bonjour # Explicit locale ssmd.to_ssml('[Hello]{lang="en-GB"}') # → Hello Common language codes: * ``en`` → en-US * ``fr`` → fr-FR * ``de`` → de-DE * ``es`` → es-ES * ``it`` → it-IT * ``ja`` → ja-JP * ``zh`` → zh-CN * ``ru`` → ru-RU Voice Selection ~~~~~~~~~~~~~~~ SSMD supports two ways to specify voices: **inline annotations** for short phrases and **block directives** for longer passages (ideal for dialogue and scripts). Inline Voice Annotations ^^^^^^^^^^^^^^^^^^^^^^^^^ Perfect for short voice changes within a sentence: .. code-block:: python # Simple voice name ssmd.to_ssml('[Hello]{voice="Joanna"}') # → Hello # Cloud TTS voice (e.g., Google Wavenet, AWS Polly) ssmd.to_ssml('[Hello]{voice="en-US-Wavenet-A"}') # → Hello # Language and gender attributes ssmd.to_ssml('[Bonjour]{voice-lang="fr-FR" gender="female"}') # → Bonjour # All attributes (language, gender, variant) ssmd.to_ssml('[Text]{voice-lang="en-GB" gender="male" variant="1"}') # → Text Voice attributes: * ``voice="NAME"`` - Voice name (e.g., Joanna, en-US-Wavenet-A) * ``voice-lang="LANG"`` - Language code (e.g., en-GB) * ``gender="GENDER"`` - male, female, or neutral * ``variant="NUMBER"`` - Variant number for tiebreaking Voice Directives (Block Syntax) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Perfect for dialogue, podcasts, and scripts with multiple speakers: .. code-block:: python script = """
Welcome to Tech Talk! I'm Sarah, and today we're diving into the fascinating world of text-to-speech technology. ...s
And I'm Michael! We've got an amazing episode lined up. The advances in neural TTS have been incredible lately. ...s
So what are we covering today?
""" ssmd.to_ssml(script) # Each voice directive creates a separate voice block in SSML Voice directives support all voice attributes: .. code-block:: python # Language and gender multilingual = """
Bonjour! Comment allez-vous aujourd'hui?
Hello there! Lovely weather we're having.
¡Hola! ¿Cómo estás?
""" Voice directive features: * Use ``
`` block syntax * Supports all attributes: language, gender, variant * Applies to all text until the next directive or paragraph break * Automatically detected on SSML→SSMD conversion for long voice blocks * Much more readable than inline annotations for dialogue Mixing inline and directive syntax: .. code-block:: python # Block directive for main speaker, inline for interruptions text = """
Hello everyone, [but wait!]{voice="michael"} Michael interrupts...
Sorry, I had to jump in there!
""" Phonetic Pronunciation ~~~~~~~~~~~~~~~~~~~~~~ IPA (International Phonetic Alphabet) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python ssmd.to_ssml('[tomato]{ph="təˈmeɪtoʊ"}') # → tomato ssmd.to_ssml('[hello]{ipa="həˈloʊ"}') # → hello X-SAMPA (Extended Speech Assessment Methods Phonetic Alphabet) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python ssmd.to_ssml('[dictionary]{sampa="dIkS@n@ri"}') # → dictionary Substitution (Aliases) ~~~~~~~~~~~~~~~~~~~~~~ Replace text with alternative pronunciation: .. code-block:: python ssmd.to_ssml('[H2O]{sub="water"}') # → H2O ssmd.to_ssml('[AWS]{sub="Amazon Web Services"}') # → AWS ssmd.to_ssml('[NATO]{sub="North Atlantic Treaty Organization"}') Say-As Interpretations ~~~~~~~~~~~~~~~~~~~~~~ Control how text is interpreted: .. code-block:: python # Telephone number ssmd.to_ssml('[+1-555-0123]{as="telephone"}') # Date with format ssmd.to_ssml('[31.12.2024]{as="date" format="dd.mm.yyyy"}') # Say-as with detail attribute (verbosity control) ssmd.to_ssml('[123]{as="cardinal" detail="2"}') # → 123 ssmd.to_ssml('[12/31/2024]{as="date" format="mdy" detail="1"}') # → 12/31/2024 # Spell out characters ssmd.to_ssml('[NASA]{as="character"}') # Number types ssmd.to_ssml('[123]{as="cardinal"}') # one hundred twenty-three ssmd.to_ssml('[1st]{as="ordinal"}') # first ssmd.to_ssml('[123]{as="digits"}') # one two three ssmd.to_ssml('[3.14]{as="fraction"}') # three point one four # Time ssmd.to_ssml('[14:30]{as="time"}') # Expletive (censored/beeped) ssmd.to_ssml('[damn]{as="expletive"}') Supported interpret-as values: * ``character`` - Spell out * ``cardinal`` - Number * ``ordinal`` - First, second, etc. * ``digits`` - Individual digits * ``fraction`` - Decimal numbers * ``unit`` - Measurements * ``date`` - Dates * ``time`` - Time values * ``telephone`` - Phone numbers * ``address`` - Street addresses * ``expletive`` - Censored words The ``detail`` attribute (1-2) controls verbosity level and is platform-specific. Higher values generally provide more detailed pronunciation. Prosody (Voice Control) ------------------------ Use prosody annotations with explicit key/value pairs: .. code-block:: python ssmd.to_ssml('[loud]{volume="loud"}') ssmd.to_ssml('[slow]{rate="slow"}') ssmd.to_ssml('[high]{pitch="high"}') ssmd.to_ssml('[loud and fast]{volume="loud" rate="fast"}') Scale-Based Values (1-5) ~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python ssmd.to_ssml('[extra loud]{volume="5"}') ssmd.to_ssml('[extra fast]{rate="5"}') ssmd.to_ssml('[extra high]{pitch="5"}') Scale mapping: * Volume: 0=silent, 1=x-soft, 2=soft, 3=medium, 4=loud, 5=x-loud * Rate: 1=x-slow, 2=slow, 3=medium, 4=fast, 5=x-fast * Pitch: 1=x-low, 2=low, 3=medium, 4=high, 5=x-high Relative Values ~~~~~~~~~~~~~~~ .. code-block:: python # Decibels for volume ssmd.to_ssml('[louder]{volume="+6dB"}') ssmd.to_ssml('[quieter]{volume="-3dB"}') # Percentages for rate and pitch ssmd.to_ssml('[faster]{rate="+20%"}') ssmd.to_ssml('[slower]{rate="-10%"}') ssmd.to_ssml('[higher]{pitch="+15%"}') ssmd.to_ssml('[lower]{pitch="-5%"}') Audio Files ----------- Basic Audio ~~~~~~~~~~~ .. code-block:: python # With description ssmd.to_ssml('[doorbell]{src="https://example.com/sounds/bell.mp3"}') # → # No description ssmd.to_ssml('[]{src="beep.mp3"}') # → Audio with Fallback ~~~~~~~~~~~~~~~~~~~ .. code-block:: python ssmd.to_ssml('[cat purring]{src="cat.ogg" alt="Sound file not loaded"}') # → The fallback text is spoken if the audio file can't be played. Advanced Audio Attributes ~~~~~~~~~~~~~~~~~~~~~~~~~~ SSMD supports advanced audio control through SSML attributes: Audio Clipping ^^^^^^^^^^^^^^ Play a portion of an audio file by specifying start and end times: .. code-block:: python ssmd.to_ssml('[music]{src="song.mp3" clip="5s-30s"}') # → ssmd.to_ssml('[intro]{src="podcast.mp3" clip="0s-10s"}') # → Speed Control ^^^^^^^^^^^^^ Adjust playback speed using percentages: .. code-block:: python ssmd.to_ssml('[announcement]{src="speech.mp3" speed="150%"}') # → ssmd.to_ssml('[slow]{src="message.mp3" speed="80%"}') # → Repeat Audio ^^^^^^^^^^^^ Repeat audio playback a specific number of times: .. code-block:: python ssmd.to_ssml('[jingle]{src="ad.mp3" repeat="3"}') # → ssmd.to_ssml('[beep]{src="alert.mp3" repeat="5"}') # → Volume Adjustment ^^^^^^^^^^^^^^^^^ Control audio volume using decibel adjustment: .. code-block:: python ssmd.to_ssml('[alarm]{src="alert.mp3" level="+6dB"}') # → ssmd.to_ssml('[background]{src="music.mp3" level="-3dB"}') # → Combining Attributes ^^^^^^^^^^^^^^^^^^^^ Multiple audio attributes can be combined with fallback text: .. code-block:: python ssmd.to_ssml('[bg music]{src="music.mp3" clip="0s-10s" speed="120%" level="-3dB" alt="Fallback text"}') # → ssmd.to_ssml('[effect]{src="sound.mp3" clip="2s-5s" repeat="2" alt="Sound unavailable"}') # → .. note:: Audio attribute support varies by TTS platform. Amazon Polly and Google Cloud TTS support most of these features. Always test with your specific TTS engine. Markers ------- Markers create synchronization points for events: .. code-block:: python ssmd.to_ssml('I always wanted a @animal cat as a pet.') # → I always wanted a cat as a pet. ssmd.to_ssml('Click @here to continue.') # → Click to continue. Markers are removed when stripping to plain text: .. code-block:: python ssmd.to_text('Click @here now') # → Click now Extensions ---------- Platform-specific extensions allow you to use TTS features beyond standard SSML. Amazon Polly Extensions ~~~~~~~~~~~~~~~~~~~~~~~~ Amazon Polly provides effects like whispering and dynamic range compression: .. code-block:: python # Whisper effect ssmd.to_ssml('[secret message]{ext="whisper"}') # → secret message # Dynamic range compression (for voice over music) ssmd.to_ssml('[announcement]{ext="drc"}') # → announcement Google Cloud TTS Speaking Styles ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Google Cloud TTS supports speaking styles for Neural2 and Studio voices. You can configure these using SSMD's extension system: .. code-block:: python from ssmd import Document # Configure Google TTS styles as extensions doc = Document(config={ 'extensions': { 'cheerful': lambda text: f'{text}', 'calm': lambda text: f'{text}', 'empathetic': lambda text: f'{text}', 'apologetic': lambda text: f'{text}', 'firm': lambda text: f'{text}', } }) # Use styles in your content doc.add_sentence("[Welcome to our service!]{ext=\"cheerful\"}") doc.add_sentence("[We apologize for the inconvenience.]{ext=\"apologetic\"}") doc.add_sentence("[Please remain calm.]{ext=\"calm\"}") ssml = doc.to_ssml() # → # Welcome to our service! # We apologize for the inconvenience. # Please remain calm. # Available Google TTS speaking styles: * ``cheerful`` - Upbeat and positive tone * ``calm`` - Relaxed and soothing tone * ``empathetic`` - Understanding and compassionate tone * ``apologetic`` - Sorry and regretful tone * ``firm`` - Confident and authoritative tone * ``news`` - Professional news anchor tone (some voices) * ``conversational`` - Natural conversation tone (some voices) .. note:: Google TTS speaking styles are only supported by specific Neural2 and Studio voices. See the `Google Cloud TTS documentation `_ for voice compatibility. Custom Extensions ~~~~~~~~~~~~~~~~~ You can define your own extensions for any custom SSML tags your TTS platform supports: .. code-block:: python from ssmd import Document doc = Document(config={ 'extensions': { 'robotic': lambda text: f'{text}', 'echo': lambda text: f'{text}', } }) doc.add_sentence("[Hello]{ext=\"robotic\"}") doc.add_sentence("[world]{ext=\"echo\"}") For a complete Google TTS styles example, see ``examples/google_tts_styles.py``. Combining Multiple Annotations ------------------------------- Multiple annotations can be space-separated inside the braces: .. code-block:: python ssmd.to_ssml('[Bonjour]{lang="fr" volume="5" rate="2"}') # → Bonjour ssmd.to_ssml('[important]{volume="5" as="character"}') # → important Escaping -------- XML Special Characters ~~~~~~~~~~~~~~~~~~~~~~~ XML special characters are automatically escaped: .. code-block:: python ssmd.to_ssml('5 < 10 & 10 > 5') # → 5 < 10 & 10 > 5 Security ~~~~~~~~ All user input is automatically sanitized to prevent XML injection attacks. Special characters in both text content and annotation parameters are properly escaped: .. code-block:: python # Malicious input is safely escaped ssmd.to_ssml('[text]{sub="value"}') # → text The library ensures: - **XML validity**: Output is always valid, well-formed XML - **Injection prevention**: User input cannot break out of attribute values or inject tags - **Automatic escaping**: All special characters (``<``, ``>``, ``&``, ``"``, ``'``) are escaped You can safely use SSMD with untrusted user input in TTS applications. Literal Asterisks ~~~~~~~~~~~~~~~~~ To include literal asterisks without emphasis, escape them or use different patterns: .. code-block:: python # These won't be treated as emphasis ssmd.to_ssml('2 * 3 = 6') # → 2 * 3 = 6 ssmd.to_ssml('* list item') # → * list item