Latvian Content Quality System: Keeping AI Text Natural
Multi-layer checking that keeps AI-generated Latvian text natural — free of anglicisms and the feel of machine translation.
The problem: AI writes fluently, but not always like a native
Modern language models write fluently. But in Latvian, fluency is not the same as naturalness. Often the text is grammatically correct yet carries the shadow of English: anglicisms, calques, sentence structures that sound translated. A reader, especially a business leader, senses it immediately: this doesn’t quite sound like our own language.
This is a real business issue, not just a matter of taste. A publication that sounds like machine translation sends a signal about quality, and in the Latvian business environment that signal is a bad one.
The approach: not one check, but layers
We didn’t expect a single check to catch everything. Instead we built several layers, each responsible for its own kind of problem. Together they form a sieve that is far harder for bad text to slip through than any single check.
The core idea is that different problems call for different tools. Forbidden words can be caught with a list, and spelling with a morphology check. Naturalness and style, however, are better judged by a language model that has been given a clear evaluation task. No layer is enough alone; together they are.
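As an illustration of the simplest of these tools, a list-based check might look like the minimal sketch below. The three entries in FORBIDDEN are hypothetical samples chosen for the example, not the project's actual list, and the function name find_forbidden is likewise an assumption.

```python
import re

# Hypothetical sample entries: anglicism -> Latvian alternative.
# Illustrative only; not the project's actual list.
FORBIDDEN = {
    "deadline": "termiņš",
    "feedback": "atgriezeniskā saite",
    "meeting": "sanāksme",
}

# One case-insensitive pattern that matches any listed term as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in FORBIDDEN) + r")\b",
    re.IGNORECASE,
)


def find_forbidden(text: str) -> list[tuple[str, str]]:
    """Return (found term, suggested alternative) pairs in order of appearance."""
    return [
        (match.group(0), FORBIDDEN[match.group(0).lower()])
        for match in _PATTERN.finditer(text)
    ]
```

A whole-word match like this would miss inflected or respelled borrowings, so a production list would probably need stems or lemmas rather than exact strings.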
How it works
The text passes through several layers in sequence.
- Forbidden-word check. The system catches terms and anglicisms that have a clear Latvian alternative.
- Morphology check. A spelling layer verifies word forms against a Latvian dictionary.
- Style and naturalness rating. A language model judges whether the text reads as if a native speaker wrote it and flags weak spots.
- Decision. If the text falls below the critical threshold, it doesn’t go on: it gets rewritten, not published.
This combination lets the system be both precise in the details and sensitive to the overall sound.
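The sequence above could be wired together roughly as follows. This is a sketch under stated assumptions, not the system's actual code: the names Issue, Report, run_layers and the three layer factories are invented for the example, the morphology layer uses a plain set of known word forms as a stand-in for a real Latvian dictionary, and the style layer takes any scoring function in place of the language-model judge. Treating every finding as critical is also an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Issue:
    layer: str      # which layer raised the issue
    message: str    # short description for the rewrite step
    critical: bool  # True if this issue alone should block publication


# A layer takes the text and returns the issues it found (possibly none).
Layer = Callable[[str], list[Issue]]


@dataclass
class Report:
    issues: list[Issue] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        """The text passes only if no layer reported a critical issue."""
        return not any(issue.critical for issue in self.issues)


def words(text: str) -> list[str]:
    # \w is Unicode-aware in Python 3, so Latvian letters such as "ā" match.
    return re.findall(r"\w+", text.lower())


def forbidden_layer(banned: set[str]) -> Layer:
    return lambda text: [
        Issue("forbidden-word", f"replace '{w}'", critical=True)
        for w in words(text) if w in banned
    ]


def morphology_layer(known_forms: set[str]) -> Layer:
    # Stand-in for a real dictionary lookup: anything not in the set is flagged.
    return lambda text: [
        Issue("morphology", f"unknown form '{w}'", critical=True)
        for w in words(text) if w not in known_forms
    ]


def style_layer(score: Callable[[str], float], threshold: float) -> Layer:
    # `score` stands in for the language-model judge; higher means more natural.
    def check(text: str) -> list[Issue]:
        value = score(text)
        if value < threshold:
            return [Issue("style", f"naturalness {value} below {threshold}", critical=True)]
        return []
    return check


def run_layers(text: str, layers: list[Layer]) -> Report:
    """Run every layer in order and collect all issues into one report."""
    report = Report()
    for layer in layers:
        report.issues.extend(layer(text))
    return report
```

The decision step then reduces to reading report.passed: if it is False, the collected issues can be handed to the rewrite step instead of the text going out.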
Results and lessons
The main gain is reliability. AI-generated Latvian text can be published with the knowledge that it has passed a check, rather than in the hope that it came out well this time.
First lesson: naturalness can’t be reduced to a single rule. The anglicism list can be extended endlessly, but there will always be a phrase that is formally clean and still sounds foreign. That’s exactly why a human or model judgment about the overall sound stays part of the system.
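One way the model judgment could be framed as a clear evaluation task is sketched below. Everything here is an assumption made for illustration: the prompt wording, the 0 to 10 scale, the JSON reply format, and the names Verdict and judge_naturalness. The model itself is passed in as a plain function, ask_model, so no particular provider or API is implied.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical evaluation prompt; doubled braces are literal braces for str.format.
PROMPT_TEMPLATE = (
    "You are a native Latvian editor. Rate how natural the text below sounds "
    "to a native reader on a scale from 0 to 10, and quote any phrases that "
    "sound translated. Answer with JSON only: "
    '{{"score": <number>, "weak_spots": [<string>, ...]}}\n\n'
    "Text:\n{text}"
)


@dataclass
class Verdict:
    score: float           # 0 (clearly translated) to 10 (fully natural)
    weak_spots: list[str]  # phrases the judge flagged


def judge_naturalness(text: str, ask_model: Callable[[str], str]) -> Verdict:
    """Send the evaluation prompt to the model and parse its JSON reply."""
    raw = ask_model(PROMPT_TEMPLATE.format(text=text))
    try:
        data = json.loads(raw)
        return Verdict(
            score=float(data["score"]),
            weak_spots=[str(spot) for spot in data.get("weak_spots", [])],
        )
    except (ValueError, KeyError, TypeError):
        # An unreadable reply is treated as a failing score, so a broken
        # judge can never wave a text through by accident.
        return Verdict(score=0.0, weak_spots=["model reply could not be parsed"])
```

Failing closed on an unparseable reply is a design choice in this sketch: it keeps the judge consistent with the idea that the check decides whether a text goes out.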
Second lesson: a quality check has to be a gate, not a suggestion. As long as a check only “recommends,” it gets ignored. Once it becomes a condition for publication, quality actually holds its level.
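The difference between a gate and a suggestion can be made concrete in code. In the sketch below, publishing is only reachable through the check, and a failing text is sent back for rewriting. The function names, the retry limit of three, and the QualityGateError exception are all assumptions for the example rather than details of the actual system.

```python
from typing import Callable


class QualityGateError(Exception):
    """Raised when no draft passes the check within the allowed attempts."""


def publish_through_gate(
    draft: str,
    passes_check: Callable[[str], bool],
    rewrite: Callable[[str], str],
    publish: Callable[[str], None],
    max_attempts: int = 3,
) -> str:
    """Publish the first version of the text that passes the check.

    There is no code path that calls `publish` on a text that failed:
    a failing draft is rewritten and checked again, and if every attempt
    fails the function raises instead of publishing.
    """
    text = draft
    for _ in range(max_attempts):
        if passes_check(text):
            publish(text)
            return text
        text = rewrite(text)
    raise QualityGateError(f"no draft passed the check in {max_attempts} attempts")
```

An advisory check would return its findings and let the caller decide; here the caller gets either a published text or an exception, which is what makes the check a condition rather than a recommendation.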
This project underpins our other systems that write in Latvian — a safety net without which automatic publishing would be too risky.