Voice Fingerprinting
Flag character pairs whose dialogue patterns are nearly identical — they "sound" the same.
What It Does
Extracts dialogue attributed to named characters and builds a voice profile for each, measuring:
- Average sentence length
- Contraction rate (e.g., "can't" vs. "cannot")
- Question frequency
- Vocabulary overlap (Jaccard similarity)
Character pairs with near-identical profiles are flagged — they may be indistinguishable to readers.
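The four measurements above can be sketched in a few lines of Python. This is a minimal illustration, not the rule's actual implementation: the `VoiceProfile` and `build_profile` names, the contraction regex, and the choice to express contraction and question rates as a share of sentences are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Assumed heuristic: a word ending in n't, 's, 're, 've, 'll, 'd or 'm.
# Note that this also counts possessives such as "Sarah's".
CONTRACTION_RE = re.compile(r"\b\w+(?:n't|'(?:s|re|ve|ll|d|m))\b", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


@dataclass(frozen=True)
class VoiceProfile:
    avg_sentence_length: float  # mean words per sentence
    contraction_rate: float     # share of sentences containing a contraction
    question_rate: float        # share of sentences ending in "?"
    vocabulary: frozenset       # distinct lowercased words


def build_profile(sentences):
    """Build a profile from one character's attributed dialogue sentences."""
    if not sentences:
        raise ValueError("need at least one sentence")
    # Normalize curly apostrophes so "can’t" and "can't" are treated alike.
    cleaned = [s.replace("\u2019", "'").strip() for s in sentences]
    token_lists = [WORD_RE.findall(s.lower()) for s in cleaned]
    n = len(cleaned)
    return VoiceProfile(
        avg_sentence_length=sum(len(t) for t in token_lists) / n,
        contraction_rate=sum(1 for s in cleaned if CONTRACTION_RE.search(s)) / n,
        question_rate=sum(1 for s in cleaned if s.endswith("?")) / n,
        vocabulary=frozenset(w for t in token_lists for w in t),
    )
```

The vocabulary is kept as a set so that two profiles can later be compared with a Jaccard index (shared words divided by all words used by either character).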
Why It Matters
Every character should sound different. A teenager and a professor shouldn't use the same sentence length, vocabulary, and speech patterns. Distinct voices let readers identify speakers even without dialogue tags. When two characters "sound" identical, readability and characterization both suffer.
What Gets Flagged
Near-Identical Voice Profiles
Severity: Information
Example (flagged):
Voice fingerprint: Sarah and Marcus have near-identical dialogue patterns (avg length: 8 vs 9, contraction rate: 12% vs 14%, question rate: 20% vs 18%) — consider differentiating their voices
Why: Both characters use similar sentence lengths, contraction rates, and question frequencies. Without dialogue tags, a reader would struggle to tell who is speaking.
How to differentiate:
- Give one character longer, more formal sentences
- Have one character use contractions while the other avoids them
- Let one character ask more questions while the other makes declarations
- Use distinct vocabulary or speech patterns (slang, technical jargon, etc.)
Requirements
- At least 2 characters with 5+ attributed dialogue sentences each
- Character attribution is detected via dialogue tag patterns ("said Sarah", "Marcus asked", etc.)
- Pronouns (he, she, they, etc.) are never treated as character names, so dialogue tagged only with a pronoun ("she said") is not attributed to anyone
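As a rough sketch of how these requirements could fit together, the following Python attributes quoted dialogue by tag pattern, skips pronouns, and drops characters with too little dialogue. The regex, the verb list, the pronoun list, and the `attribute_dialogue` name are hypothetical; the rule's real patterns are not documented here and are likely broader.

```python
import re
from collections import defaultdict

# Assumed, deliberately small lists for illustration only.
PRONOUNS = {"he", "she", "they", "it", "i", "we", "you"}
SPEECH_VERBS = r"(?:said|asked|replied|answered|whispered|shouted)"

# Matches: "..." said Sarah   or   "..." Marcus asked
TAG_RE = re.compile(
    r'"([^"]+)"\s+(?:' + SPEECH_VERBS + r"\s+([A-Za-z]+)|([A-Za-z]+)\s+" + SPEECH_VERBS + r")\b"
)


def attribute_dialogue(text, min_sentences=5):
    """Map each named speaker to their dialogue sentences."""
    # Normalize curly quotes so one pattern covers both styles.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    by_speaker = defaultdict(list)
    for match in TAG_RE.finditer(text):
        name = match.group(2) or match.group(3)
        # Ignore pronouns and uncapitalized words such as "the".
        if name.lower() in PRONOUNS or not name[0].isupper():
            continue
        quote = match.group(1).rstrip(",").strip()
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", quote) if s]
        by_speaker[name].extend(sentences)
    # Keep only characters with enough dialogue to profile.
    return {n: s for n, s in by_speaker.items() if len(s) >= min_sentences}
```

A simple pattern like this can misread text with unbalanced or nested quotation marks, which is one reason untagged or unusually tagged dialogue may go unattributed.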
Configuration
No configuration options.
Technical Details
- Source: prose-craft
- Scope: Document-level (compares all character pairs)
- Method: Dialogue extraction via regex, profile comparison using weighted distance metric (sentence length, contraction rate, question rate, vocabulary Jaccard index)
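To make the comparison step concrete, here is a minimal Python sketch of a weighted distance over the four profile features, applied to every character pair. The weights, the `0.15` threshold, the normalization of the length gap, and the names `Profile`, `voice_distance`, and `flag_similar_pairs` are all assumptions; the actual values used by the rule are not stated in this document.

```python
from itertools import combinations
from typing import NamedTuple


class Profile(NamedTuple):
    avg_length: float
    contraction_rate: float  # 0.0 to 1.0
    question_rate: float     # 0.0 to 1.0
    vocabulary: frozenset


# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"length": 0.3, "contraction": 0.2, "question": 0.2, "vocabulary": 0.3}
THRESHOLD = 0.15


def jaccard(a, b):
    """Shared items divided by all items; 1.0 means identical sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def voice_distance(p, q):
    """Weighted distance in [0, 1]; 0 means indistinguishable profiles."""
    longest = max(p.avg_length, q.avg_length)
    length_gap = abs(p.avg_length - q.avg_length) / longest if longest else 0.0
    return (
        WEIGHTS["length"] * length_gap
        + WEIGHTS["contraction"] * abs(p.contraction_rate - q.contraction_rate)
        + WEIGHTS["question"] * abs(p.question_rate - q.question_rate)
        + WEIGHTS["vocabulary"] * (1.0 - jaccard(p.vocabulary, q.vocabulary))
    )


def flag_similar_pairs(profiles, threshold=THRESHOLD):
    """Compare every pair of characters and return those below the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if voice_distance(profiles[a], profiles[b]) < threshold
    ]
```

Because every pair is compared, the work grows quadratically with the number of profiled characters, which is consistent with the rule running once per document rather than per paragraph.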