TTS Utilities
Post-processing utilities for TTS applications. These functions operate on already-normalized text and are not called by normalize_text(). Use them as separate post-processing steps after normalization.
parse_sound_word_field
parse_sound_word_field(user_input)
Parse a user-provided sound word field into a list of (pattern, replacement) tuples.
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| user_input | str | required | Multi-line string. Each line is either "pattern => replacement" or "pattern" (removal) |
Returns: list[tuple[str, str]] — List of (pattern, replacement) tuples. If no => separator is present, the replacement is an empty string (meaning the pattern will be removed entirely).
from revo_norm.tts_utils import parse_sound_word_field
# With replacements
result = parse_sound_word_field("[laughter] => haha\n[applause] => clap")
# [("[laughter]", "haha"), ("[applause]", "clap")]
# With removal (no =>)
result = parse_sound_word_field("[music]\n[breath]")
# [("[music]", ""), ("[breath]", "")]
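The parsing rule described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not the library's actual code: the name parse_field_sketch is invented here, and skipping blank lines and trimming surrounding whitespace are assumptions.

```python
from __future__ import annotations


def parse_field_sketch(user_input: str) -> list[tuple[str, str]]:
    """Hypothetical sketch of the parsing rule; not revo_norm's code."""
    pairs: list[tuple[str, str]] = []
    for line in user_input.splitlines():
        line = line.strip()
        if not line:
            # Assumption: blank lines carry no pattern and are skipped.
            continue
        # partition() splits on the first "=>" only and yields an empty
        # replacement when the separator is absent (the removal case).
        pattern, _, replacement = line.partition("=>")
        pairs.append((pattern.strip(), replacement.strip()))
    return pairs


print(parse_field_sketch("[laughter] => haha\n[music]"))
# [('[laughter]', 'haha'), ('[music]', '')]
```

Using partition rather than split keeps any later "=>" inside the replacement text, which seems the safer reading of the format.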
smart_remove_sound_words
smart_remove_sound_words(text, sound_words)
Remove or replace sound words (like [laughter], [applause]) from text. Handles possessives ('s), quoted occurrences, and cleans up resulting whitespace and punctuation artifacts.
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text containing sound words |
| sound_words | list[tuple[str, str]] | required | List of (pattern, replacement) tuples, typically from parse_sound_word_field |
Returns: str — Text with sound words removed or replaced and whitespace cleaned up.
from revo_norm.tts_utils import parse_sound_word_field, smart_remove_sound_words
# Remove sound words
words = parse_sound_word_field("[laughter]\n[applause] => clapping")
text = "Hello [laughter] world [applause] end"
result = smart_remove_sound_words(text, words)
# "Hello world clapping end"
# Removal with possessive handling
words = parse_sound_word_field("[laughter]")
text = "That was [laughter]'s best moment"
result = smart_remove_sound_words(text, words)
# "That was best moment"
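One way such a removal could work is with escaped regular expressions followed by a cleanup pass. The sketch below is an assumption about the approach, not the library's implementation: remove_sound_words_sketch is a made-up name, and the exact quote and punctuation handling in revo_norm may differ.

```python
import re


def remove_sound_words_sketch(text, sound_words):
    """Hypothetical sketch; the real function may handle more cases."""
    for pattern, replacement in sound_words:
        escaped = re.escape(pattern)  # treat "[laughter]" as literal text
        # Match the pattern either wrapped in matching quotes or bare,
        # optionally followed by a possessive 's.
        regex = re.compile(
            r"(?:([\"'])" + escaped + r"\1|" + escaped + r")(?:'s)?"
        )
        # A function replacement avoids backslash interpretation
        # in user-supplied replacement strings.
        text = regex.sub(lambda _m, r=replacement: r, text)
    # Cleanup: collapse whitespace runs, then drop spaces left
    # in front of punctuation.
    text = re.sub(r"\s+", " ", text)
    text = re.sub(r"\s+([,.!?;:])", r"\1", text)
    return text.strip()


words = [("[laughter]", ""), ("[applause]", "clapping")]
print(remove_sound_words_sketch("Hello [laughter] world [applause] end", words))
# Hello world clapping end
```

Escaping the pattern matters because sound words are typically written in square brackets, which would otherwise be read as a regex character class.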
add_random_commas
add_random_commas(text, min_words=8, max_words=15)
Insert random commas into long text to create natural pauses in TTS output. The function follows these rules:
- Only add a comma after many words (min_words=8 by default, not every 2-5 words).
- Do not add a comma if there is already punctuation within 8 words.
- Do not add a comma at a sentence boundary or near the end of a sentence.
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text |
| min_words | int | 8 | Minimum number of words before considering a comma |
| max_words | int | 15 | Maximum number of words before forcing a comma |
Returns: str — Text with optional commas inserted for natural pauses.
from revo_norm.tts_utils import add_random_commas
text = "The quick brown fox jumps over the lazy dog and then runs across the field"
result = add_random_commas(text)
# Comma inserted after a natural pause point (exact position varies due to randomness)
# Short text — no commas inserted
short = "Hello world"
result = add_random_commas(short)
# "Hello world" (unchanged, fewer than min_words=8 words)
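To make the rules concrete, here is a minimal sketch of one possible comma-insertion loop. It is not the library's algorithm: add_commas_sketch, the guard window of 3 words, the 30% insertion chance, and the rng parameter are all assumptions chosen for illustration. The sketch also normalizes whitespace to single spaces.

```python
import random

PUNCT = ",.;:!?"


def add_commas_sketch(text, min_words=8, max_words=15, guard=3, rng=None):
    """Hypothetical sketch of random pause insertion."""
    if rng is None:
        rng = random.Random()
    words = text.split()
    out = []
    since_pause = 0  # words since the last punctuation or inserted comma
    for i, word in enumerate(words):
        out.append(word)
        if word[-1] in PUNCT:
            since_pause = 0  # existing punctuation already gives a pause
            continue
        since_pause += 1
        upcoming = words[i + 1 : i + 1 + guard]
        # Assumption: stay away from the end of the text and from
        # punctuation that follows within the next `guard` words.
        if len(upcoming) < guard or any(w[-1] in PUNCT for w in upcoming):
            continue
        forced = since_pause >= max_words
        optional = since_pause >= min_words and rng.random() < 0.3
        if forced or optional:
            out[-1] = word + ","
            since_pause = 0
    return " ".join(out)


# A seeded generator makes the otherwise random result repeatable.
print(add_commas_sketch(" ".join(["word"] * 40), rng=random.Random(0)))
```

Passing a seeded random.Random instance is a convenient way to test code like this, since the comma positions are otherwise non-deterministic.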
split_text_by_words
split_text_by_words(text, max_chars=150)
Split text into chunks at word boundaries so that words are never cut in half. Each chunk contains complete words and is approximately max_chars characters long; a chunk will exceed this limit if that is necessary to avoid cutting a word.
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| text | str | required | Input text to split |
| max_chars | int | 150 | Target maximum characters per chunk |
Returns: list[str] — List of text chunks, each containing complete words.
from revo_norm.tts_utils import split_text_by_words
chunks = split_text_by_words("the secret passage under the cottage", 20)
# ["the secret passage", "under the cottage"]
chunks = split_text_by_words("Di halaman rumah nenek", 10)
# ["Di halaman", "rumah nenek"]
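The word-boundary idea can be illustrated with a strict greedy splitter. This is a simplified sketch under stated assumptions, not the library's algorithm: greedy_split_sketch is a hypothetical name, and it only lets a chunk exceed max_chars when a single word is longer than the limit. The documented output for the 10-character example keeps "rumah nenek" (11 characters) together, so the real function evidently applies a more lenient rule that this sketch does not reproduce.

```python
def greedy_split_sketch(text, max_chars=150):
    """Hypothetical strict-greedy variant; never cuts a word."""
    chunks = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if current and len(candidate) > max_chars:
            # The next word would overflow: close the current chunk.
            chunks.append(current)
            current = word
        else:
            # Fits, or the chunk is empty and the word must go somewhere
            # (an over-long single word becomes its own oversized chunk).
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(greedy_split_sketch("the secret passage under the cottage", 20))
# ['the secret passage', 'under the cottage']
print(greedy_split_sketch("Di halaman rumah nenek", 10))
# ['Di halaman', 'rumah', 'nenek']  (differs from the library's documented output)
```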
Integration Example
These utilities are designed to be used as post-processing steps after normalize_text():
from revo_norm import normalize_text
from revo_norm.tts_utils import (
    add_random_commas,
    parse_sound_word_field,
    smart_remove_sound_words,
    split_text_by_words,
)
# Step 1: Normalize
text = "The API costs $5.50 and [laughter] runs on 5km of cable"
normalized = normalize_text(text, language="en")
# Step 2: Remove sound words
sound_words = parse_sound_word_field("[laughter]")
clean = smart_remove_sound_words(normalized, sound_words)
# Step 3: Add pauses for TTS
with_pauses = add_random_commas(clean)
# Step 4: Chunk for TTS synthesis
chunks = split_text_by_words(with_pauses, max_chars=200)
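If the same sequence of steps is applied in several places, the text-to-text stages can be chained with a small helper. The run_steps function below is a hypothetical convenience, not part of revo_norm; the demo uses built-in string methods as stand-ins so that it runs without the library installed.

```python
from typing import Callable, Iterable


def run_steps(text: str, steps: Iterable[Callable[[str], str]]) -> str:
    """Apply each text -> text step in order (hypothetical helper)."""
    for step in steps:
        text = step(text)
    return text


# Stand-in steps; real usage would pass the revo_norm functions instead.
print(run_steps("  Hello ", [str.strip, str.lower]))
# hello
```

With the library, the steps might be a lambda wrapping smart_remove_sound_words(t, sound_words) followed by add_random_commas. Because split_text_by_words returns a list rather than a string, it would be called on the helper's result instead of being placed in the chain.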