TTS Utilities

Post-processing utilities for TTS applications. These functions operate on already-normalized text and are not called by normalize_text(). Use them as separate post-processing steps after normalization.

parse_sound_word_field

parse_sound_word_field(user_input)

Parse a user-provided sound word field into a list of (pattern, replacement) tuples.

Parameters:

| Name | Type | Default | Description |
|---|---|---|---|
| user_input | str | required | Multi-line string. Each line is either "pattern => replacement" or "pattern" (removal) |

Returns: list[tuple[str, str]] — List of (pattern, replacement) tuples. If no => separator is present, the replacement is an empty string (meaning the pattern will be removed entirely).

from revo_norm.tts_utils import parse_sound_word_field

# With replacements
result = parse_sound_word_field("[laughter] => haha\n[applause] => clap")
# [("[laughter]", "haha"), ("[applause]", "clap")]

# With removal (no =>)
result = parse_sound_word_field("[music]\n[breath]")
# [("[music]", ""), ("[breath]", "")]
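To make the line format concrete, here is a minimal sketch of how such a field could be parsed. The name parse_sound_words_sketch is hypothetical and this is not the library's implementation; it assumes that blank lines are skipped, that surrounding whitespace is stripped, and that only the first => on a line acts as the separator.

```python
def parse_sound_words_sketch(user_input: str) -> list[tuple[str, str]]:
    """Illustrative stand-in for parse_sound_word_field (hypothetical)."""
    pairs = []
    for line in user_input.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split on the first "=>" only; sep is "" when no separator is present.
        pattern, sep, replacement = line.partition("=>")
        pairs.append((pattern.strip(), replacement.strip() if sep else ""))
    return pairs


print(parse_sound_words_sketch("[laughter] => haha\n[music]"))
# [('[laughter]', 'haha'), ('[music]', '')]
```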

smart_remove_sound_words

smart_remove_sound_words(text, sound_words)

Remove or replace sound words (like [laughter], [applause]) from text. Handles possessives ('s), quoted occurrences, and cleans up resulting whitespace and punctuation artifacts.

Parameters:

| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text containing sound words |
| sound_words | list[tuple[str, str]] | required | List of (pattern, replacement) tuples, typically from parse_sound_word_field |

Returns: str — Text with sound words removed or replaced and whitespace cleaned up.

from revo_norm.tts_utils import parse_sound_word_field, smart_remove_sound_words

# Remove [laughter] and replace [applause]
words = parse_sound_word_field("[laughter]\n[applause] => clapping")
text = "Hello [laughter] world [applause] end"
result = smart_remove_sound_words(text, words)
# "Hello world clapping end"

# Removal with possessive handling
words = parse_sound_word_field("[laughter]")
text = "That was [laughter]'s best moment"
result = smart_remove_sound_words(text, words)
# "That was best moment"
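The behavior described above (possessive handling, quoted occurrences, whitespace and punctuation cleanup) can be approximated with a few regular expressions. The sketch below is an assumption about one reasonable approach, not the library's code. The name remove_sound_words_sketch and the exact cleanup rules are hypothetical.

```python
import re


def remove_sound_words_sketch(text, sound_words):
    """Illustrative stand-in for smart_remove_sound_words (hypothetical)."""
    for pattern, replacement in sound_words:
        if replacement:
            # Patterns are treated as literal text, not as regular expressions.
            text = text.replace(pattern, replacement)
            continue
        escaped = re.escape(pattern)
        # Remove a fully quoted occurrence, or the bare word plus an optional 's.
        removal = r"([\"'])" + escaped + r"\1|" + escaped + r"(?:'s)?"
        text = re.sub(removal, "", text)
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)  # no space before punctuation
    text = re.sub(r"([,;:])\1+", r"\1", text)     # collapse doubled separators
    text = re.sub(r"\s{2,}", " ", text)           # collapse runs of whitespace
    return text.strip()


print(remove_sound_words_sketch("Well, [laughter], okay", [("[laughter]", "")]))
# Well, okay
```

Treating patterns as literal strings keeps the brackets in [laughter] from being read as a regex character class.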

add_random_commas

add_random_commas(text, min_words=8, max_words=15)

Insert random commas into long text to create natural pauses in TTS output. The algorithm avoids inserting commas near existing punctuation, at sentence boundaries, or too close to the end of a sentence:

- Commas are only added after many words (min_words=8 by default, not every 2-5 words).
- No comma is added if there is already punctuation within 8 words.
- No comma is added near the end of a sentence.

Parameters:

| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text |
| min_words | int | 8 | Minimum number of words before considering a comma |
| max_words | int | 15 | Maximum number of words before forcing a comma |

Returns: str — Text with optional commas inserted for natural pauses.

from revo_norm.tts_utils import add_random_commas

text = "The quick brown fox jumps over the lazy dog and then runs across the field"
result = add_random_commas(text)
# Comma inserted after a natural pause point (exact position varies due to randomness)

# Short text — no commas inserted
short = "Hello world"
result = add_random_commas(short)
# "Hello world" (unchanged, fewer than min_words=8 words)
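The rules above can be sketched as a single pass over the words. This is only an illustration under stated assumptions, not the library's algorithm. The function name add_commas_sketch and the rng and tail_guard parameters are hypothetical, and "near the end of a sentence" is interpreted here as "fewer than tail_guard words before the next punctuation mark or the end of the text".

```python
import random

PUNCT = ",.;:!?"


def add_commas_sketch(text, min_words=8, max_words=15, tail_guard=3, rng=None):
    """Illustrative stand-in for add_random_commas (hypothetical)."""
    rng = rng if rng is not None else random.Random()
    words = text.split()
    out = []
    run = 0  # words seen since the last punctuation mark or inserted comma
    target = rng.randint(min_words, max_words)
    for i, word in enumerate(words):
        out.append(word)
        if word[-1] in PUNCT:
            # Existing punctuation already provides a pause: restart the count.
            run = 0
            target = rng.randint(min_words, max_words)
            continue
        run += 1
        upcoming = words[i + 1 : i + 1 + tail_guard]
        near_end = len(upcoming) < tail_guard or any(w[-1] in PUNCT for w in upcoming)
        if run >= target and not near_end:
            out[-1] = word + ","
            run = 0
            target = rng.randint(min_words, max_words)
    return " ".join(out)


print(add_commas_sketch("Hello world"))
# Hello world
```

Passing a seeded random.Random as rng makes the output reproducible, which is convenient in tests.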

split_text_by_words

split_text_by_words(text, max_chars=150)

Split text into chunks at word boundaries so that words are never cut in half. Each chunk contains complete words and is approximately max_chars characters long. A chunk will exceed this limit if that is necessary to avoid cutting a word.

Parameters:

| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text to split |
| max_chars | int | 150 | Target maximum characters per chunk |

Returns: list[str] — List of text chunks, each containing complete words.

from revo_norm.tts_utils import split_text_by_words

chunks = split_text_by_words("the secret passage under the cottage", 20)
# ["the secret passage", "under the cottage"]

chunks = split_text_by_words("Di halaman rumah nenek", 10)
# ["Di halaman", "rumah nenek"]
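Because max_chars is a target rather than a hard limit, callers with a strict downstream limit may want to inspect the chunks. The helper below is a hypothetical sketch (check_chunks is not part of the library). It confirms that no words were lost or split and lists any chunks longer than max_chars, using the output quoted in the second example above as literal data.

```python
def check_chunks(text, chunks, max_chars):
    """Return (words_preserved, oversized_chunks) for a list of chunks."""
    # Words must come back in the same order, with none cut in half.
    words_preserved = " ".join(chunks).split() == text.split()
    oversized = [chunk for chunk in chunks if len(chunk) > max_chars]
    return words_preserved, oversized


text = "Di halaman rumah nenek"
chunks = ["Di halaman", "rumah nenek"]  # output quoted in the example above
preserved, oversized = check_chunks(text, chunks, max_chars=10)
print(preserved)  # True
print(oversized)  # ['rumah nenek'] (11 characters, over the target of 10)
```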

Integration Example

These utilities are designed to be used as post-processing steps after normalize_text():

from revo_norm import normalize_text
from revo_norm.tts_utils import (
    add_random_commas,
    parse_sound_word_field,
    smart_remove_sound_words,
    split_text_by_words,
)

# Step 1: Normalize
text = "The API costs $5.50 and [laughter] runs on 5km of cable"
normalized = normalize_text(text, language="en")

# Step 2: Remove sound words
sound_words = parse_sound_word_field("[laughter]")
clean = smart_remove_sound_words(normalized, sound_words)

# Step 3: Add pauses for TTS
with_pauses = add_random_commas(clean)

# Step 4: Chunk for TTS synthesis
chunks = split_text_by_words(with_pauses, max_chars=200)