Pick a demo to auto-fill the input, set the domain, and run Compare to see where semantic TM retrieval beats fuzzy matching.
Input Text
Results
Alternative Translations
TM Retrieval Comparison
Fuzzy = string similarity; Basic = semantic k‑NN; Advanced = semantic + factors (domain, length, context, terminology). Domain steers polysemy: with the domain set to nature, “bank” resolves to river bank (Slovak: breh rieky). Use Compare All to see all three side‑by‑side.
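To make the fuzzy baseline concrete, here is a minimal sketch of character-level matching using Python's standard difflib. It is an assumption that the demo's fuzzy mode behaves similarly; the function name and example sentences are illustrative only. It shows why a paraphrase with little shared wording scores far lower than a near-copy, which is the gap semantic retrieval is meant to close.

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a TM fuzzy match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Illustrative sentences, not taken from the demo dataset.
query = "How do I reset my password?"
near_copy = "How do I reset my passwords?"
paraphrase = "Steps to recover a forgotten login credential"

# The near-copy scores close to 1.0; the paraphrase scores much lower
# even though a human would treat it as the same request.
print(round(fuzzy_score(query, near_copy), 2))
print(round(fuzzy_score(query, paraphrase), 2))
```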
Mini Metrics (Curated Demo)
Compute metrics for the current dataset and settings.
About This Demo
Explainable Semantic TM (xTM) is a retrieval‑first, terminology‑aware, explainable translation memory. It uses semantic search (FAISS + multilingual E5 embeddings) to find the best existing, approved translation by meaning, and it shows why that translation was chosen.
It is not a generative MT system; when confidence is low, it returns [No translation] rather than inventing text.
The UI offers four modes: Basic (semantic k‑NN), Advanced (multi‑factor scoring with a “Why” panel), Hybrid (dense + BM25 fusion), and Compare All (fuzzy vs. Basic vs. Advanced, side by side).
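The text does not say how Hybrid mode fuses the dense and BM25 results. One common option is reciprocal rank fusion (RRF), sketched below purely as an assumption; the function name, the segment IDs, and the constant k=60 are illustrative and not taken from xTM.

```python
def rrf_fuse(dense_ranking, bm25_ranking, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in (dense_ranking, bm25_ranking):
        for rank, seg_id in enumerate(ranking, start=1):
            scores[seg_id] = scores.get(seg_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda seg_id: (-scores[seg_id], seg_id))

# Hypothetical TM segment IDs, best match first in each list.
dense = ["s3", "s1", "s2"]   # order from semantic k-NN
bm25 = ["s1", "s4", "s3"]    # order from lexical BM25
fused = rrf_fuse(dense, bm25)
print(fused)
```

Segments ranked well by both retrievers rise to the top, while a segment found by only one retriever still stays in the candidate list.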
What It Is / Isn’t
Is: a semantic “translation memory by meaning” that retrieves trusted translations from your bilingual corpus.
Isn’t: a generative translator; no text is invented, and guardrails return “[No translation]” when the match is uncertain.
Why it’s different: it goes beyond fuzzy character matching and uses embeddings to catch paraphrases and synonyms.
How It Works (Retrieval, not Generation)
Preprocess: normalize text for stable embeddings.
Embed: multilingual‑E5 with its retrieval prefixes (“query: ” / “passage: ”) → 768‑dim vectors.
Index: build a FAISS index (IndexFlatIP, exact inner‑product search) over the source embeddings.
Retrieve: k‑NN returns nearest source sentences and their targets.
Re‑rank: combine factors: semantic similarity, domain, length, context, and terminology (glossary bonuses/penalties governed by a strictness setting). Optionally, hybrid BM25 fusion and a cross‑encoder reranker are applied when enabled.
Guardrails: if final score < threshold → return “[No translation]”.
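The steps above can be sketched end to end in plain Python. This is a toy under stated assumptions, not the xTM implementation: hand-made 3-dim vectors stand in for the 768-dim E5 embeddings, an exhaustive inner-product loop stands in for FAISS IndexFlatIP, and the weights, threshold, length factor, and memory entries are all hypothetical. The context and terminology factors and the optional cross-encoder are left out for brevity.

```python
import math
import unicodedata

# Hypothetical settings; the demo's real weights and threshold are not stated.
WEIGHTS = {"semantic": 0.7, "domain": 0.2, "length": 0.1}
THRESHOLD = 0.6

# Illustrative memory entries with toy vectors instead of real embeddings.
MEMORY = [
    {"source": "bank account", "target": "bankový účet",
     "domain": "finance", "vec": [0.9, 0.1, 0.0]},
    {"source": "river bank", "target": "breh rieky",
     "domain": "nature", "vec": [0.1, 0.9, 0.0]},
]

def preprocess(text):
    """Unicode-normalize and collapse whitespace for stable embeddings."""
    return " ".join(unicodedata.normalize("NFKC", text).split())

def l2_normalize(vec):
    """Scale to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def knn_inner_product(query_vec, index_vecs, k):
    """Exhaustive inner-product search, as a flat IP index would do."""
    scored = [(sum(q * v for q, v in zip(query_vec, vec)), i)
              for i, vec in enumerate(index_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]

def length_factor(a, b):
    """1.0 for equal lengths, approaching 0 as the lengths diverge."""
    return min(len(a), len(b)) / max(len(a), len(b), 1)

def rerank(query, query_domain, hits, memory):
    """Weighted sum of semantic, domain, and length factors per candidate."""
    ranked = []
    for sim, idx in hits:
        entry = memory[idx]
        domain_match = 1.0 if entry["domain"] == query_domain else 0.0
        score = (WEIGHTS["semantic"] * sim
                 + WEIGHTS["domain"] * domain_match
                 + WEIGHTS["length"] * length_factor(query, entry["source"]))
        ranked.append((score, entry))
    ranked.sort(key=lambda pair: -pair[0])
    return ranked

def translate(query, query_vec, query_domain, memory, k=2):
    index_vecs = [l2_normalize(e["vec"]) for e in memory]
    hits = knn_inner_product(l2_normalize(query_vec), index_vecs, k)
    ranked = rerank(preprocess(query), query_domain, hits, memory)
    # Guardrail: refuse rather than return a low-confidence match.
    if not ranked or ranked[0][0] < THRESHOLD:
        return "[No translation]"
    return ranked[0][1]["target"]

print(translate("the bank of the stream", [0.2, 0.8, 0.0], "nature", MEMORY))
print(translate("unrelated sentence", [0.0, 0.0, 1.0], "legal", MEMORY))
```

With these toy values the first call retrieves the nature entry and the second falls below the threshold, so the guardrail returns “[No translation]”. In a real setup, the vectors would come from the embedding model and the search from the FAISS index.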
When To Use
Repetitive or regulated content with strict terminology (technology, legal, medical, support).
Teams with a bilingual memory who want retrieval by meaning, not string fuzziness.