ASTRA ATC Models

Fine-tuned models for Singapore military air traffic control, built for the ASTRA training simulator.

Pipeline

Audio  -->  VAD (Silero)  -->  ASR (Whisper)  -->  Rule Formatter  -->  Display Text
                               "camel climb flight level zero nine zero"
                                                                        "CAMEL climb FL090"

The production pipeline uses a rule-based formatter (23 deterministic rules, <1ms, 0 VRAM) instead of the LLM. The LLM is retained for reference.
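The production rule set is not published here; as a minimal sketch, two illustrative rules (an assumed subset, not the real 23) that reproduce the example above might look like:

```python
# Spoken-digit vocabulary used in ATC phraseology (assumed subset).
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8",
          "nine": "9", "niner": "9"}

KNOWN_CALLSIGNS = {"camel", "ninja", "beetle"}  # illustrative subset

def format_atc(text: str) -> str:
    """Apply two illustrative deterministic rules to a normalized transcript."""
    words = text.lower().split()
    out, i = [], 0
    while i < len(words):
        w = words[i]
        # Rule 1: uppercase known callsigns for display.
        if w in KNOWN_CALLSIGNS:
            out.append(w.upper())
            i += 1
        # Rule 2: "flight level <digit words>" -> "FL<digits>".
        elif w == "flight" and i + 1 < len(words) and words[i + 1] == "level":
            digits, j = [], i + 2
            while j < len(words) and words[j] in DIGITS:
                digits.append(DIGITS[words[j]])
                j += 1
            if digits:
                out.append("FL" + "".join(digits))
                i = j
            else:
                out.append(w)
                i += 1
        else:
            out.append(w)
            i += 1
    return " ".join(out)

# format_atc("camel climb flight level zero nine zero") -> "CAMEL climb FL090"
```

Because each rule is a deterministic string transform, the formatter needs no model weights, which is where the <1ms latency and 0 VRAM figures come from.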

Models

ASR/

Fine-tuned for Singapore military ATC speech. Uses CTranslate2 float16 format for fast inference with faster-whisper.

Metric      Value
WER         0.66%
Base model  openai/whisper-large-v3
Size        2.9 GB
Training    Full fine-tune with enhanced VHF radio augmentation
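WER (word error rate) is the word-level edit distance between reference and hypothesis (substitutions + insertions + deletions) divided by the reference length. A minimal sketch of the metric, for reference:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

So a 0.66% WER means roughly one word error per 150 reference words on the evaluation set.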

LLM/

Legacy. Superseded by a deterministic rule-based formatter. Retained for reference.

Converts normalized ASR output into structured ATC display text.

Metric       Value
Exact match  100% (161/161)
Base model   unsloth/Qwen3-1.7B
Size         3.3 GB

Architecture

Audio --> VAD (Silero) --> ASR (Whisper ct2) --> Post-processing --> Rule Formatter --> Display Text
Component  Technology                      Latency     VRAM
VAD        Silero VAD (ONNX)               ~50ms       <100 MB
ASR        Whisper Large v3 (CTranslate2)  ~500ms-2s   ~2 GB
Formatter  23 deterministic rules          <1ms        0 MB

Total VRAM: ~2 GB (ASR only).
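The three stages compose as plain callables, each consuming the previous stage's output. A wiring sketch with stub stages (the real components are Silero VAD, faster-whisper, and the rule formatter; the stubs below return canned values purely to show the data flow):

```python
from typing import List

def vad(audio: bytes) -> List[bytes]:
    """Stub for Silero VAD: split raw audio into speech segments."""
    return [audio]  # one segment, for illustration

def asr(segment: bytes) -> str:
    """Stub for Whisper ASR: transcribe one speech segment."""
    return "camel climb flight level zero nine zero"

def formatter(text: str) -> str:
    """Stub for the rule formatter: normalized transcript -> display text."""
    return "CAMEL climb FL090"

def run_pipeline(audio: bytes) -> List[str]:
    """VAD -> ASR -> formatter; one display line per detected speech segment."""
    return [formatter(asr(seg)) for seg in vad(audio)]
```

Keeping the stages as independent functions is what let the LLM formatter be swapped out for the rule-based one without touching VAD or ASR.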

Domain

Singapore military ATC covering:

  • Airbases: Tengah (WSAT, runway 18/36), Paya Lebar (WSAP, runway 02/20)
  • Aircraft: F-16C/D, F-15SG, C-130 Hercules
  • Approaches: ILS, GCA, PAR, TACAN, DVOR/DME, VOR/DME, Visual Straight-in
  • 100+ callsigns: CAMEL, NINJA, BEETLE, TAIPAN, MAVERICK, JAGUAR, LANCER, etc.
  • Categories: departure, approach, handoff, maneuver, landing, emergency, ground, recovery, pilot reports, military-specific ops

Training History

ASR

Run       WER    Base                                        Key Change
ct2_run5  0.48%  jacktol/whisper-large-v3-finetuned-for-ATC  Initial fine-tune
ct2_run6  0.40%  jacktol/whisper-large-v3-finetuned-for-ATC  +augmentation, weight decay
ct2_run7  0.24%  jacktol/whisper-large-v3-finetuned-for-ATC  Frozen encoder, +50 real recordings
ct2_run8  0.66%  openai/whisper-large-v3                     Full retrain from base, enhanced augmentation

ct2_run8 trades a higher benchmark WER for training from the original Whisper base, which generalises better to real-world ATC audio.

LLM (Legacy)

Run       Accuracy  Base        Key Change
llm_run3  98.1%     Qwen3-8B    QLoRA 4-bit, 871 examples
llm_run4  100%      Qwen3-1.7B  bf16 LoRA, 1,915 examples with ASR noise augmentation

Quick Start

ASR

from faster_whisper import WhisperModel

model = WhisperModel("./ASR", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)
text = " ".join(seg.text.strip() for seg in segments)

Download

# Full repo (ASR + LLM)
huggingface-cli download aether-raid/astra-atc-models --local-dir ./models

# ASR only (recommended)
huggingface-cli download aether-raid/astra-atc-models --include "ASR/*" --local-dir ./models

# LLM only (legacy)
huggingface-cli download aether-raid/astra-atc-models --include "LLM/*" --local-dir ./models