Parser API

The SSMD Parser provides an alternative to SSML generation by extracting structured data from SSMD text. This is useful when you need programmatic control over SSMD features or want to build custom TTS pipelines.

When to Use the Parser

Use the parser API when you need to:

  • Process SSMD features programmatically - Extract and handle features individually

  • Build custom TTS pipelines - Implement your own text-to-speech workflow

  • Handle text transformations - Process say-as, substitution, and phoneme conversions

  • Create multi-voice dialogue systems - Build voice-specific processing pipelines

  • Analyze SSMD content - Extract metadata and features without generating SSML

Overview

The parser extracts SSMD markup into structured segments, allowing you to process each feature individually instead of generating a complete SSML document.

 from ssmd import parse_paragraphs

 script = """
 <div voice="sarah">
 Hello! Call [+1-555-0123]{as="telephone"} for info.
 </div>

 <div voice="michael">
 Thanks *Sarah*!
 </div>
 """

 # Parse into structured paragraphs
for paragraph in parse_paragraphs(script):
    for sentence in paragraph.sentences:
        # Get voice configuration
        voice_name = sentence.voice.name if sentence.voice else "default"

        # Build complete text from segments
        full_text = ""
        for seg in sentence.segments:
            # Handle text transformations
            if seg.say_as:
                text = convert_say_as(seg.text, seg.say_as.interpret_as)
            elif seg.substitution:
                text = seg.substitution
            elif seg.phoneme:
                text = seg.text  # TTS engine handles phoneme
            else:
                text = seg.text
            full_text += text

        # Speak with TTS engine
        tts.speak(full_text, voice=voice_name)

Parser Functions

parse_paragraphs

Parse SSMD text into structured paragraphs with sentences and segments.

ssmd.parse_paragraphs(text: str, *, capabilities: TTSCapabilities | str | None = None, heading_levels: dict | None = None, extensions: dict | None = None, sentence_detection: bool = True, language: str = 'en', use_spacy: bool | None = None, model_size: str | None = None, parse_yaml_header: bool = False, strict_parse: bool = False) list[Paragraph][source]

Parse SSMD text into a list of Paragraphs.

This is the main parsing function. It handles: - Directive blocks (<div …> … </div>) - Paragraph and sentence splitting - All SSMD markup (emphasis, annotations, breaks, etc.)

Args:

text: SSMD markdown text capabilities: TTS capabilities for filtering (optional) heading_levels: Custom heading configurations extensions: Custom extension handlers sentence_detection: If True, split text into sentences language: Default language for sentence detection use_spacy: If True, use spaCy for sentence detection model_size: spaCy model size (“sm”, “md”, “lg”) parse_yaml_header: If True, parse YAML front matter and apply

heading/extensions config while stripping it from the body. If False, YAML front matter is preserved as plain text.

strict_parse: If True, strip unsupported features based on capabilities.

Returns:

List of Paragraph objects

Returns: List of Paragraph objects.

parse_sentences

Parse SSMD text into structured sentences with segments. This is a convenience wrapper that flattens the paragraphs returned by parse_paragraphs().

ssmd.parse_sentences(ssmd_text: str, *, capabilities: TTSCapabilities | str | None = None, include_default_voice: bool = True, sentence_detection: bool = True, language: str = 'en', model_size: str | None = None, spacy_model: str | None = None, use_spacy: bool | None = None, heading_levels: dict | None = None, extensions: dict | None = None, parse_yaml_header: bool = False, strict_parse: bool = False) list[Sentence][source]

Parse SSMD text into sentences (backward compatible API).

This is an alias for parse_paragraphs() with the old parameter names. Returned sentences include paragraph_index and sentence_index metadata.

Args:

ssmd_text: SSMD formatted text to parse capabilities: TTS capabilities or preset name include_default_voice: If False, exclude sentences without voice context sentence_detection: Enable/disable sentence splitting language: Language code for sentence detection model_size: Size of spacy model (sm/md/lg) spacy_model: Full spacy model name (deprecated, use model_size) use_spacy: Force use of spacy for sentence detection heading_levels: Custom heading configurations extensions: Custom extension handlers parse_yaml_header: If True, parse YAML front matter and apply

heading/extensions config while stripping it from the body. If False, YAML front matter is preserved as plain text.

strict_parse: If True, strip unsupported features based on capabilities.

Returns:

List of Sentence objects

Parameters:

  • ssmd_text (str): SSMD markdown text to parse

  • sentence_detection (bool): Split text into sentences (default: True)

  • include_default_voice (bool): Include text before first voice directive (default: True)

  • capabilities (TTSCapabilities | str): Filter features based on TTS engine support

  • language (str): Language code for sentence detection (default: "en")

  • model_size (str): spaCy model size - "sm", "md", "lg", "trf" (default: "sm")

  • spacy_model (str): Deprecated alias; model size is inferred from the name

  • use_spacy (bool): If False, use fast regex splitting instead of spaCy (default: True)

Returns: List of Sentence objects (alias: SSMDSentence). Each sentence includes paragraph_index and sentence_index metadata.

Example:

from ssmd import parse_sentences

sentences = parse_sentences("Hello *world*! This is great.")

for sent in sentences:
    print(f"Voice: {sent.voice.name if sent.voice else 'default'}")
    print(f"Segments: {len(sent.segments)}")
    for seg in sent.segments:
        print(f"  - {seg.text!r} (emphasis={seg.emphasis})")

Sentence Detection Configuration

Control how sentences are detected and split. SSMD uses phrasplit for intelligent sentence detection with optional spaCy support for maximum accuracy.

Fast Mode (Regex-Based, No spaCy Required)

The default mode uses fast regex-based splitting that works great for well-formatted text:

from ssmd import parse_sentences

# Fast regex splitting (works out-of-the-box, no spaCy needed)
sentences = parse_sentences(
    "Hello world. This is fast.",
    use_spacy=False
)

Auto-Detection (Recommended)

By default, SSMD auto-detects if spaCy is installed and uses it for better accuracy:

# Auto-detect: uses spaCy if installed, falls back to regex
sentences = parse_sentences("Hello. World.")
# Works without spaCy, better accuracy with spaCy

Model Size Selection

When spaCy is installed, choose different model sizes for quality vs. speed tradeoffs:

# Small model (fast, good accuracy) - DEFAULT
sentences = parse_sentences("Hello. World.")
# Uses: en_core_web_sm, fr_core_news_sm, etc.

# Medium model (better accuracy)
sentences = parse_sentences("Hello. World.", model_size="md")
# Uses: en_core_web_md, fr_core_news_md, etc.

# Large model (best accuracy)
sentences = parse_sentences("Hello. World.", model_size="lg")
# Uses: en_core_web_lg, fr_core_news_lg, etc.

# Transformer model (research-grade quality, slowest)
sentences = parse_sentences("Hello. World.", model_size="trf")
# Uses: en_core_web_trf, fr_dep_news_trf, etc.

Deprecated ``spacy_model`` Alias

The spacy_model parameter is retained for backward compatibility and only infers the model size from the name. Prefer model_size for clarity:

sentences = parse_sentences(
    "Technical text here.",
    spacy_model="en_core_web_lg"  # infers model_size="lg"
)

Multi-Language Support

The model_size parameter works across all spaCy-supported languages:

 script = """
 <div voice="fr-FR">
 Bonjour tout le monde!
 </div>

 <div voice="en-US">
 Hello everyone!
 </div>
 """

# Uses fr_core_news_md for French, en_core_web_md for English
sentences = parse_sentences(script, model_size="md")

Installation

SSMD works out-of-the-box with fast regex mode. For spaCy support:

# Install spaCy support
pip install "ssmd[spacy]"

# Install models for your languages
python -m spacy download en_core_web_sm  # English (small)
python -m spacy download en_core_web_md  # English (medium)
python -m spacy download en_core_web_lg  # English (large)
python -m spacy download fr_core_news_sm  # French

See the spaCy models documentation for a complete list of available models.

Performance Comparison

Mode

Speed

Accuracy

Size

Use Case

Regex

60x faster

85-90%

0 MB

Simple text, speed-critical

spaCy sm

Baseline

~95%

~30 MB

Balanced accuracy/performance

spaCy md

Slower

~97%

~100 MB

Better accuracy

spaCy lg

2x slower

~98%

~500 MB

Best accuracy

spaCy trf

10x slower

~99%+

~1 GB

Research, maximum quality

parse_segments

Parse SSMD text into segments without sentence grouping.

ssmd.parse_segments(ssmd_text: str, *, capabilities: TTSCapabilities | str | None = None, voice_context: VoiceAttrs | None = None) list[Segment][source]

Parse SSMD text into segments (backward compatible API).

Parameters:

  • text (str): SSMD text to parse

  • capabilities (TTSCapabilities | str): Filter features based on TTS engine support

  • voice_context (VoiceAttrs | None): Current voice context

Returns: List of Segment objects (alias: SSMDSegment)

Example:

from ssmd import parse_segments

segments = parse_segments('Call [+1-555-0123]{as="telephone"} now')

for seg in segments:
    if seg.say_as:
        print(f"Say-as: {seg.text!r} as {seg.say_as.interpret_as}")

Data Structures

Paragraph (alias: SSMDParagraph)

Represents a paragraph containing sentences.

class ssmd.Paragraph(sentences: list[Sentence] = <factory>)[source]

Bases: object

A paragraph containing sentences.

Paragraphs group sentences separated by blank lines in SSMD.

sentences: list[Sentence]
to_text() str[source]

Convert paragraph to plain text.

Returns:

Plain text with sentence text joined by spaces

property text: str

Plain text representation of the paragraph.

__iter__()[source]

Iterate over sentences.

__len__() int[source]

Return number of sentences in the paragraph.

__str__() str[source]

String representation returns plain text.

Attributes:

  • sentences (list[Sentence]): List of sentences in the paragraph

Sentence (alias: SSMDSentence)

Represents a complete sentence with voice context and segments.

class ssmd.Sentence(segments: list[Segment] = <factory>, voice: VoiceAttrs | None = None, language: str | None = None, prosody: ProsodyAttrs | None = None, is_paragraph_end: bool = False, paragraph_index: int = 0, sentence_index: int = 0, breaks_after: list[BreakAttrs] = <factory>)[source]

Bases: object

A sentence containing segments with directive context.

Represents a logical sentence unit that should be spoken together. Sentences are split on: - Directive changes (<div …> blocks) - Sentence boundaries (.!?) when sentence_detection=True - Paragraph breaks (

)

Attributes:

segments: List of segments in the sentence voice: Voice context for entire sentence (from <div voice=…> directives) language: Language directive for the sentence prosody: Prosody directive for the sentence is_paragraph_end: True if sentence ends with paragraph break paragraph_index: Zero-based paragraph index for this sentence sentence_index: Zero-based sentence index within the document breaks_after: Pauses after the sentence

segments: list[Segment]
voice: VoiceAttrs | None = None
language: str | None = None
prosody: ProsodyAttrs | None = None
is_paragraph_end: bool = False
paragraph_index: int = 0
sentence_index: int = 0
breaks_after: list[BreakAttrs]
to_ssml(capabilities: TTSCapabilities | None = None, extensions: dict | None = None, wrap_sentence: bool = False, warnings: list[str] | None = None) str[source]

Convert sentence to SSML.

Args:

capabilities: TTS engine capabilities for filtering extensions: Custom extension handlers wrap_sentence: If True, wrap content in <s> tag warnings: Optional list to collect warnings

Returns:

SSML string

to_ssmd() str[source]

Convert sentence to SSMD markdown.

Returns:

SSMD string

to_text() str[source]

Convert sentence to plain text.

Returns:

Plain text with all markup removed

property text: str

Get plain text content of the sentence.

Returns:

Plain text string

__str__() str[source]

String representation returns plain text.

__len__() int[source]

Return number of segments.

__iter__()[source]

Iterate over segments.

Attributes:

  • segments (list[Segment]): List of text segments making up the sentence

  • voice (VoiceAttrs | None): Voice configuration for this sentence

  • is_paragraph_end (bool): Whether this sentence ends a paragraph

  • paragraph_index (int): Zero-based paragraph index for this sentence

  • sentence_index (int): Zero-based sentence index within the document

  • breaks_after (list[BreakAttrs]): Breaks after the sentence

Segment (alias: SSMDSegment)

Represents a text segment with associated metadata and features.

class ssmd.Segment(text: str, emphasis: bool | str = False, prosody: ProsodyAttrs | None = None, language: str | None = None, voice: VoiceAttrs | None = None, say_as: SayAsAttrs | None = None, substitution: str | None = None, phoneme: PhonemeAttrs | None = None, audio: AudioAttrs | None = None, extension: str | None = None, breaks_before: list[BreakAttrs] = <factory>, breaks_after: list[BreakAttrs] = <factory>, marks_before: list[str] = <factory>, marks_after: list[str] = <factory>)[source]

Bases: object

A segment of text with SSMD features.

Represents a portion of text with specific formatting and processing attributes. Segments are the atomic units of SSMD content.

Attributes:

text: Raw text content emphasis: Emphasis level (True/”moderate”, “strong”, “reduced”, “none”, False) prosody: Volume, rate, pitch settings language: Language code for this segment voice: Voice settings for this segment say_as: Text interpretation hints substitution: Replacement text (alias) phoneme: IPA pronunciation audio: Audio file to play extension: Platform-specific extension name breaks_before: Pauses before this segment breaks_after: Pauses after this segment marks_before: Event markers before this segment marks_after: Event markers after this segment

text: str
emphasis: bool | str = False
prosody: ProsodyAttrs | None = None
language: str | None = None
voice: VoiceAttrs | None = None
say_as: SayAsAttrs | None = None
substitution: str | None = None
phoneme: PhonemeAttrs | None = None
audio: AudioAttrs | None = None
extension: str | None = None
breaks_before: list[BreakAttrs]
breaks_after: list[BreakAttrs]
marks_before: list[str]
marks_after: list[str]
to_ssml(capabilities: TTSCapabilities | None = None, extensions: dict | None = None, warnings: list[str] | None = None) str[source]

Convert segment to SSML.

Args:

capabilities: TTS engine capabilities for filtering extensions: Custom extension handlers

Returns:

SSML string

to_ssmd() str[source]

Convert segment to SSMD markdown.

Returns:

SSMD string

to_text() str[source]

Convert segment to plain text.

Returns:

Plain text with all markup removed

Attributes:

  • text (str): The text content of this segment

  • emphasis (bool | str): Emphasis level (True, "moderate", "strong", "reduced", "none")

  • prosody (ProsodyAttrs | None): Prosody attributes (volume, rate, pitch)

  • language (str | None): Language code (e.g., "fr-FR")

  • voice (VoiceAttrs | None): Inline voice settings for this segment

  • say_as (SayAsAttrs | None): Say-as interpretation

  • substitution (str | None): Substitution text

  • phoneme (PhonemeAttrs | None): Phonetic pronunciation (with ph and alphabet attributes)

  • audio (AudioAttrs | None): Audio file information

  • extension (str | None): Platform-specific extension name

  • breaks_before (list[BreakAttrs]): Pauses before this segment

  • breaks_after (list[BreakAttrs]): Pauses after this segment

  • marks_before (list[str]): Marker names before this segment

  • marks_after (list[str]): Marker names after this segment

VoiceAttrs

Voice configuration attributes.

class ssmd.VoiceAttrs(name: str | None = None, language: str | None = None, gender: Literal['male', 'female', 'neutral'] | None = None, variant: int | None = None)[source]

Bases: object

Voice attributes for TTS voice selection.

Attributes:

name: Voice name (e.g., “Joanna”, “en-US-Wavenet-A”) language: BCP-47 language code (e.g., “en-US”, “fr-FR”) gender: Voice gender variant: Variant number for disambiguation

name: str | None = None
language: str | None = None
gender: Literal['male', 'female', 'neutral'] | None = None
variant: int | None = None

Attributes:

  • name (str | None): Voice name (e.g., "sarah", "en-US-Wavenet-A")

  • language (str | None): Language code (e.g., "en-US")

  • gender (str | None): Gender ("male", "female", "neutral")

  • variant (int | None): Voice variant number

ProsodyAttrs

Prosody attributes for controlling volume, rate, and pitch.

class ssmd.ProsodyAttrs(volume: str | None = None, rate: str | None = None, pitch: str | None = None)[source]

Bases: object

Prosody attributes for volume, rate, and pitch control.

Attributes:
volume: Volume level (‘silent’, ‘x-soft’, ‘soft’, ‘medium’, ‘loud’,

‘x-loud’, or relative like ‘+10dB’)

rate: Speech rate (‘x-slow’, ‘slow’, ‘medium’, ‘fast’, ‘x-fast’,

or relative like ‘+20%’)

pitch: Pitch level (‘x-low’, ‘low’, ‘medium’, ‘high’, ‘x-high’,

or relative like ‘-5%’)

volume: str | None = None
rate: str | None = None
pitch: str | None = None

Attributes:

  • volume (str | None): Volume level (e.g., "x-loud", "+10dB")

  • rate (str | None): Speech rate (e.g., "fast", "120%")

  • pitch (str | None): Pitch level (e.g., "high", "+20%")

BreakAttrs

Pause/break attributes.

class ssmd.BreakAttrs(time: str | None = None, strength: str | None = None)[source]

Bases: object

Break/pause attributes.

Attributes:

time: Time duration (e.g., ‘500ms’, ‘2s’) strength: Break strength (‘none’, ‘x-weak’, ‘medium’, ‘strong’, ‘x-strong’)

time: str | None = None
strength: str | None = None

Attributes:

  • time (str | None): Break duration (e.g., "500ms", "2s")

  • strength (str | None): Break strength (e.g., "weak", "strong")

SayAsAttrs

Say-as interpretation attributes.

class ssmd.SayAsAttrs(interpret_as: str, format: str | None = None, detail: str | None = None)[source]

Bases: object

Say-as attributes for text interpretation.

Attributes:
interpret_as: Interpretation type (‘telephone’, ‘date’, ‘cardinal’,

‘ordinal’, ‘characters’, ‘expletive’, etc.)

format: Optional format string (e.g., ‘dd.mm.yyyy’ for dates) detail: Optional detail level (e.g., ‘2’ for verbosity)

interpret_as: str
format: str | None = None
detail: str | None = None

Attributes:

  • interpret_as (str): Interpretation type (e.g., "telephone", "date")

  • format (str | None): Format string (e.g., "mdy" for dates)

  • detail (str | None): Verbosity level (platform-specific)

PhonemeAttrs

Phonetic pronunciation attributes.

class ssmd.PhonemeAttrs(ph: str, alphabet: str = 'ipa')[source]

Bases: object

Phoneme pronunciation attributes.

Attributes:

ph: Phonetic pronunciation string alphabet: Phonetic alphabet (ipa or x-sampa)

ph: str
alphabet: str = 'ipa'

Attributes:

  • ph (str): Phonetic pronunciation string

  • alphabet (str): Phonetic alphabet ("ipa" or "x-sampa", default: "ipa")

AudioAttrs

Audio file attributes.

class ssmd.AudioAttrs(src: str, alt_text: str | None = None, clip_begin: str | None = None, clip_end: str | None = None, speed: str | None = None, repeat_count: int | None = None, repeat_dur: str | None = None, sound_level: str | None = None)[source]

Bases: object

Audio file attributes.

Attributes:

src: Audio file URL or path alt_text: Fallback text if audio cannot be played clip_begin: Start time for playback (e.g., “0s”, “500ms”) clip_end: End time for playback (e.g., “10s”, “5000ms”) speed: Playback speed as percentage (e.g., “150%”, “80%”) repeat_count: Number of times to repeat audio repeat_dur: Total duration for repetitions (e.g., “10s”) sound_level: Volume adjustment in dB (e.g., “+6dB”, “-3dB”)

src: str
alt_text: str | None = None
clip_begin: str | None = None
clip_end: str | None = None
speed: str | None = None
repeat_count: int | None = None
repeat_dur: str | None = None
sound_level: str | None = None

Attributes:

  • src (str): Audio file URL or path

  • alt_text (str | None): Alternative text if audio fails to load

  • clip_begin (str | None): Start time for playback (e.g., "5s", "500ms")

  • clip_end (str | None): End time for playback

  • speed (str | None): Playback speed as percentage (e.g., "150%")

  • repeat_count (int | None): Number of times to repeat audio

  • repeat_dur (str | None): Total duration for repetitions

  • sound_level (str | None): Volume adjustment in dB (e.g., "+6dB", "-3dB")

Usage Examples

Basic Parsing

Extract segments from simple text:

from ssmd import parse_segments

text = "Hello *world*! This is ...500ms great."
segments = parse_segments(text)

for seg in segments:
    print(f"Text: {seg.text!r}")
    if seg.emphasis:
        print("  Has emphasis")
    for brk in seg.breaks_after:
        print(f"  Break: {brk.time}")

Text Transformations

Handle say-as, substitution, and phoneme features:

from ssmd import parse_segments

 text = """
 Call [+1-555-0123]{as="telephone"} for info.
 [H2O]{sub="water"} is important.
 Say [tomato]{ipa="təˈmeɪtoʊ"} correctly.
 """

segments = parse_segments(text)

for seg in segments:
    if seg.say_as:
        print(f"Say-as: {seg.text!r} as {seg.say_as.interpret_as}")
    elif seg.substitution:
        print(f"Substitute: {seg.text!r}{seg.substitution!r}")
    elif seg.phoneme:
        print(f"Phoneme: {seg.text!r}{seg.phoneme.ph!r}")

Multi-Voice Dialogue

Process voice blocks separately:

from ssmd import parse_voice_blocks

 script = """
 <div voice="sarah">
 Hello! Call [+1-555-0123]{as="telephone"} for info.
 </div>

 <div voice="michael">
 Thanks *Sarah*!
 </div>
 """




 blocks = parse_voice_blocks(script)

for voice, text in blocks:
    if voice:
        print(f"{voice.name}: {text.strip()}")

Complete TTS Workflow

Build sentences from segments for TTS processing:

from ssmd import parse_sentences

script = """
<div voice="sarah">
Hello! Call [+1-555-0123]{as="telephone"} for info.
</div>

<div voice="michael">
Thanks *Sarah*!
</div>
"""

for sentence in parse_sentences(script):
    # Get voice
    voice_name = sentence.voice.name if sentence.voice else "default"

   # Build complete text
   full_text = ""
   metadata = []

   for seg in sentence.segments:
       # Handle transformations
       if seg.say_as:
           text = convert_say_as(seg.text, seg.say_as.interpret_as)
           metadata.append(f"say-as:{seg.say_as.interpret_as}")
       elif seg.substitution:
           text = seg.substitution
       elif seg.phoneme:
           text = seg.text
           metadata.append(f"phoneme:{seg.phoneme.ph}")
       else:
           text = seg.text

       full_text += text

       # Track emphasis
       if seg.emphasis:
           metadata.append("emphasis")

       # Track breaks
       for brk in seg.breaks_after:
           metadata.append(f"break:{brk.time}")

   # Speak with TTS engine
   print(f"[{voice_name}] {full_text}")
   if metadata:
       print(f"  Metadata: {', '.join(metadata)}")

Advanced Sentence Parsing

Control sentence detection and voice filtering:

from ssmd import parse_sentences

 text = """
 Welcome to the demo.

 This is a new paragraph.

 <div voice="sarah">
 Sarah speaks here.
 </div>
 """

 sentences = parse_sentences(
     text,
     sentence_detection=True,       # Split by sentences
     include_default_voice=True,    # Include text before voice directive
 )

for i, sent in enumerate(sentences, 1):
    voice_name = sent.voice.name if sent.voice else "(default)"
    text_content = "".join(seg.text for seg in sent.segments)
    para_marker = " [PARA_END]" if sent.is_paragraph_end else ""

    print(f"{i}. [{voice_name}] {text_content!r}{para_marker}")

TTS Engine Integration

Example integration with a TTS engine:

from ssmd import parse_sentences

class TTSEngine:
    def speak(self, text: str, voice: str = "default", **kwargs):
        """Speak text with given voice and parameters."""
        print(f"[TTS] Voice: {voice}, Text: {text}")
        # Your TTS implementation here
        pass

def process_ssmd_script(script: str, tts: TTSEngine):
    """Process SSMD script with TTS engine."""
    sentences = parse_sentences(script)

    for sentence in sentences:
        # Configure voice
        voice_config = {}
        if sentence.voice:
            if sentence.voice.name:
                voice_config["voice"] = sentence.voice.name
            if sentence.voice.language:
                voice_config["language"] = sentence.voice.language

        # Build text with transformations
        full_text = ""
        for seg in sentence.segments:
            if seg.say_as:
                # TTS engine handles say-as conversion
                text = handle_say_as(seg.text, seg.say_as)
            elif seg.substitution:
                text = seg.substitution
            elif seg.phoneme:
                text = seg.text  # Use phoneme for pronunciation
            else:
                text = seg.text

            full_text += text

        # Speak with TTS
        tts.speak(full_text, **voice_config)

 # Usage
 script = """
 <div voice="sarah">
 Hello! Today's date is [2024-01-15]{as="date" format="mdy"}.
 </div>

 <div voice="michael">
 Thank you for listening!
 </div>
 """

tts = TTSEngine()
process_ssmd_script(script, tts)

Capability Filtering

Filter features based on TTS engine capabilities:

from ssmd import parse_sentences

# Parse with pyttsx3 capabilities (limited SSML support)
sentences = parse_sentences(
    'Hello *world*! [Bonjour]{lang="fr"} everyone!',
    capabilities='pyttsx3'
)

# Unsupported features (emphasis, language) are filtered out
for sent in sentences:
    for seg in sent.segments:
        # seg.emphasis will be False (pyttsx3 doesn't support it)
        # seg.language will be None (pyttsx3 doesn't support it)
        print(seg.text)

Complete Demo

See examples/parser_demo.py for a comprehensive demonstration:

python examples/parser_demo.py

The demo includes:

  • Basic segment parsing

  • Text transformations (say-as, substitution, phoneme)

  • Voice block handling

  • Complete TTS workflow

  • Prosody and language annotations

  • Advanced sentence parsing

  • Mock TTS integration

See Also