# ASTRA ATC Models
Fine-tuned models for Singapore military air traffic control, built for the ASTRA training simulator.
## Pipeline

```
Audio --> VAD (Silero) --> ASR (Whisper) --> Rule Formatter --> Display Text
```

- ASR output: `camel climb flight level zero nine zero`
- Display text: `CAMEL climb FL090`

The production pipeline uses a rule-based formatter (23 deterministic rules, <1 ms latency, 0 VRAM) instead of the LLM. The LLM is retained for reference.
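As a minimal sketch of how such deterministic rules could work, the snippet below implements two illustrative rules: collapsing spoken flight-level digits into `FLnnn` and uppercasing known callsigns. The rule set, word list, and callsign list here are assumptions for illustration, not the production formatter's 23 rules.

```python
import re

# Assumed spoken-digit vocabulary (ATC phraseology uses "niner" for 9).
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8",
          "niner": "9", "nine": "9"}

def format_flight_level(text: str) -> str:
    """Collapse 'flight level <spoken digits>' into FLnnn."""
    alt = "|".join(DIGITS)  # "niner" precedes "nine" so the longer form wins
    pattern = rf"flight level ((?:{alt})(?: (?:{alt}))*)"

    def repl(m):
        return "FL" + "".join(DIGITS[w] for w in m.group(1).split())

    return re.sub(pattern, repl, text)

def uppercase_callsign(text: str, callsigns=("camel", "ninja", "beetle")) -> str:
    """Uppercase known callsigns (a small illustrative subset)."""
    for cs in callsigns:
        text = re.sub(rf"\b{cs}\b", cs.upper(), text, flags=re.IGNORECASE)
    return text

print(format_flight_level(uppercase_callsign(
    "camel climb flight level zero nine zero")))
# -> CAMEL climb FL090
```

Because each rule is a pure string transform, the full chain stays deterministic and runs in well under a millisecond with no GPU memory.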
## Models
### ASR/
Fine-tuned for Singapore military ATC speech. Uses CTranslate2 float16 format for fast inference with faster-whisper.
| Metric | Value |
|---|---|
| WER | 0.66% |
| Base model | openai/whisper-large-v3 |
| Size | 2.9 GB |
| Training | Full fine-tune with enhanced VHF radio augmentation |
### LLM/

**Legacy.** Converts normalized ASR output into structured ATC display text. Superseded by the deterministic rule-based formatter; retained for reference.
| Metric | Value |
|---|---|
| Exact match | 100% (161/161) |
| Base model | unsloth/Qwen3-1.7B |
| Size | 3.3 GB |
## Architecture

```
Audio --> VAD (Silero) --> ASR (Whisper ct2) --> Post-processing --> Rule Formatter --> Display Text
```
| Component | Technology | Latency | VRAM |
|---|---|---|---|
| VAD | Silero VAD (ONNX) | ~50ms | <100 MB |
| ASR | Whisper Large v3 (CTranslate2) | ~500ms-2s | ~2 GB |
| Formatter | 23 deterministic rules | <1ms | 0 MB |
Total VRAM: ~2 GB (ASR only).
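The stage ordering above can be sketched with stub components. Everything below is a placeholder for illustration: the real system uses Silero VAD (ONNX) and Whisper via CTranslate2, and the formatter applies 23 rules rather than the string replacements shown here.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float

def vad(audio, sample_rate=16000):
    # Stub VAD: treat the whole buffer as one speech segment.
    return [Segment(0.0, len(audio) / sample_rate)]

def asr(audio, segment):
    # Stub ASR: return a canned transcript for the segment.
    return "camel climb flight level zero nine zero"

def format_display(text):
    # Stub for the 23-rule deterministic formatter.
    return text.replace("camel", "CAMEL").replace(
        "flight level zero nine zero", "FL090")

def pipeline(audio):
    # ASR runs only on VAD-detected speech, so silence costs no GPU time.
    return [format_display(asr(audio, seg)) for seg in vad(audio)]

print(pipeline([0.0] * 16000))
```

The key design point the sketch preserves is that VAD gates the expensive ASR stage: only detected speech segments reach the GPU, which is why total VRAM stays at roughly the ASR model's footprint.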
## Domain
Singapore military ATC covering:
- Airbases: Tengah (WSAT, runway 18/36), Paya Lebar (WSAP, runway 02/20)
- Aircraft: F-16C/D, F-15SG, C-130 Hercules
- Approaches: ILS, GCA, PAR, TACAN, DVOR/DME, VOR/DME, Visual Straight-in
- 100+ callsigns: CAMEL, NINJA, BEETLE, TAIPAN, MAVERICK, JAGUAR, LANCER, etc.
- Categories: departure, approach, handoff, maneuver, landing, emergency, ground, recovery, pilot reports, military-specific ops
## Training History

### ASR
| Run | WER | Base | Key Change |
|---|---|---|---|
| ct2_run5 | 0.48% | jacktol/whisper-large-v3-finetuned-for-ATC | Initial fine-tune |
| ct2_run6 | 0.40% | jacktol/whisper-large-v3-finetuned-for-ATC | +augmentation, weight decay |
| ct2_run7 | 0.24% | jacktol/whisper-large-v3-finetuned-for-ATC | Frozen encoder, +50 real recordings |
| ct2_run8 | 0.66% | openai/whisper-large-v3 | Full retrain from base, enhanced augmentation |
ct2_run8 retrains from the original openai/whisper-large-v3 base rather than the ATC fine-tune. Its benchmark WER is higher than ct2_run7's, but it generalises better to real-world ATC audio and is the model shipped in ASR/.
### LLM (Legacy)
| Run | Base | Accuracy | Key Change |
|---|---|---|---|
| llm_run3 | Qwen3-8B | 98.1% | QLoRA 4-bit, 871 examples |
| llm_run4 | Qwen3-1.7B | 100% | bf16 LoRA, 1,915 examples with ASR noise augmentation |
## Quick Start

### ASR

```python
from faster_whisper import WhisperModel

model = WhisperModel("./ASR", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)
text = " ".join(seg.text.strip() for seg in segments)
```
### Download

```bash
# Full repo (ASR + LLM)
huggingface-cli download aether-raid/astra-atc-models --local-dir ./models

# ASR only (recommended)
huggingface-cli download aether-raid/astra-atc-models --include "ASR/*" --local-dir ./models

# LLM only (legacy)
huggingface-cli download aether-raid/astra-atc-models --include "LLM/*" --local-dir ./models
```