STT Error Correction Model

A fine-tuned Qwen3 0.6b model designed to clean and correct noisy speech-to-text (STT) transcriptions by removing filler words, fixing recognition errors, and improving overall text quality.

Model Description

This model corrects common STT errors including:

Filler words and hesitations ("umm", "uh", "like")
Phonetic misrecognitions ("no egg" → "Nutmeg")
Stutters and repeated words
Grammatical inconsistencies from spoken language

Performance

Epoch	Training Loss	Validation Loss
1	6.9071383	6.3923564
2	5.6487107	5.8343363
3	5.1722913	5.0712228

Final validation loss: 5.0712228

Usage

The suggested system prompt is as follows:

You are a professional text editor. Transform raw speech transcriptions into polished written text.

Apply these transformations:
- Remove filler words (um, uh, ah, like, you know, I mean, sort of, kind of, basically, actually, literally)
- Eliminate false starts and self-corrections (keep only the final intended phrase)
- Fix grammar, punctuation, and sentence structure
- Remove repetitions and redundant phrases
- Convert spoken patterns to written prose
- Preserve original meaning, tone, and technical terms

Output only the corrected text with no preamble, labels, or explanations.

Training Data

The model was trained on the aldigobbler/stt-correction dataset, which is based on the CHSER dataset methodology for speech error correction.

Citation

Dataset methodology based on:

@misc{shankar2025chser,
      title={CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR}, 
      author={Natarajan Balaji Shankar and Zilai Wang and Kaiyuan Zhang and Mohan Shi and Abeer Alwan},
      year={2025},
      eprint={2505.18463},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2505.18463}, 
}

License

MIT

Downloads last month: 17

Safetensors

Model size

0.8B params

Tensor type

BF16

Model tree for aldigobbler/stt-qwen3-0.6b-merged

Base model

Qwen/Qwen3-0.6B-Base

Finetuned

Qwen/Qwen3-0.6B

Finetuned

(831)

this model

Quantizations

1 model

Dataset used to train aldigobbler/stt-qwen3-0.6b-merged

Paper for aldigobbler/stt-qwen3-0.6b-merged

CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR

Paper • 2505.18463 • Published May 24, 2025

Evaluation results

Validation Loss on stt-correction
validation set self-reported

5.071