ASTRA ATC Models

Fine-tuned models for Singapore military air traffic control, built for the ASTRA training simulator.

Pipeline

Audio  -->  VAD (Silero)  -->  ASR (Whisper)  -->  Rule Formatter  -->  Display Text
                               "camel climb flight level zero nine zero"
                                                                        "CAMEL climb FL090"

The production pipeline uses a rule-based formatter (23 deterministic rules, <1ms, 0 VRAM) instead of the LLM. The LLM is retained for reference.
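The production rule set is not published here; as a minimal sketch, two illustrative rules (an assumed subset, not the real 23) that reproduce the example above might look like:

```python
# Spoken-digit vocabulary used in ATC phraseology (assumed subset).
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8",
          "nine": "9", "niner": "9"}

KNOWN_CALLSIGNS = {"camel", "ninja", "beetle"}  # illustrative subset

def format_atc(text: str) -> str:
    """Apply two illustrative deterministic rules to a normalized transcript."""
    words = text.lower().split()
    out, i = [], 0
    while i < len(words):
        w = words[i]
        # Rule 1: uppercase known callsigns for display.
        if w in KNOWN_CALLSIGNS:
            out.append(w.upper())
            i += 1
        # Rule 2: "flight level <digit words>" -> "FL<digits>".
        elif w == "flight" and i + 1 < len(words) and words[i + 1] == "level":
            digits, j = [], i + 2
            while j < len(words) and words[j] in DIGITS:
                digits.append(DIGITS[words[j]])
                j += 1
            if digits:
                out.append("FL" + "".join(digits))
                i = j
            else:
                out.append(w)
                i += 1
        else:
            out.append(w)
            i += 1
    return " ".join(out)

# format_atc("camel climb flight level zero nine zero") -> "CAMEL climb FL090"
```

Because each rule is a deterministic string transform, the formatter needs no model weights, which is where the <1ms latency and 0 VRAM figures come from.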

Models

ASR/

Fine-tuned for Singapore military ATC speech. Uses CTranslate2 float16 format for fast inference with faster-whisper.

Metric      Value
WER         0.66%
Base model  openai/whisper-large-v3
Size        2.9 GB
Training    Full fine-tune with enhanced VHF radio augmentation
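WER (word error rate) is the word-level edit distance between reference and hypothesis (substitutions + insertions + deletions) divided by the reference length. A minimal sketch of the metric, for reference:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

So a 0.66% WER means roughly one word error per 150 reference words on the evaluation set.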

LLM/

Legacy. Superseded by a deterministic rule-based formatter. Retained for reference.

Converts normalized ASR output into structured ATC display text.

Metric       Value
Exact match  100% (161/161)
Base model   unsloth/Qwen3-1.7B
Size         3.3 GB

Architecture

Audio --> VAD (Silero) --> ASR (Whisper ct2) --> Post-processing --> Rule Formatter --> Display Text
Component  Technology                      Latency     VRAM
VAD        Silero VAD (ONNX)               ~50ms       <100 MB
ASR        Whisper Large v3 (CTranslate2)  ~500ms-2s   ~2 GB
Formatter  23 deterministic rules          <1ms        0 MB

Total VRAM: ~2 GB (ASR only).
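The three stages compose as plain callables, each consuming the previous stage's output. A wiring sketch with stub stages (the real components are Silero VAD, faster-whisper, and the rule formatter; the stubs below return canned values purely to show the data flow):

```python
from typing import List

def vad(audio: bytes) -> List[bytes]:
    """Stub for Silero VAD: split raw audio into speech segments."""
    return [audio]  # one segment, for illustration

def asr(segment: bytes) -> str:
    """Stub for Whisper ASR: transcribe one speech segment."""
    return "camel climb flight level zero nine zero"

def formatter(text: str) -> str:
    """Stub for the rule formatter: normalized transcript -> display text."""
    return "CAMEL climb FL090"

def run_pipeline(audio: bytes) -> List[str]:
    """VAD -> ASR -> formatter; one display line per detected speech segment."""
    return [formatter(asr(seg)) for seg in vad(audio)]
```

Keeping the stages as independent functions is what let the LLM formatter be swapped out for the rule-based one without touching VAD or ASR.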

Domain

Singapore military ATC covering:

  • Airbases: Tengah (WSAT, runway 18/36), Paya Lebar (WSAP, runway 02/20)
  • Aircraft: F-16C/D, F-15SG, C-130 Hercules
  • Approaches: ILS, GCA, PAR, TACAN, DVOR/DME, VOR/DME, Visual Straight-in
  • 100+ callsigns: CAMEL, NINJA, BEETLE, TAIPAN, MAVERICK, JAGUAR, LANCER, etc.
  • Categories: departure, approach, handoff, maneuver, landing, emergency, ground, recovery, pilot reports, military-specific ops

Training History

ASR

Run       WER    Base                                        Key Change
ct2_run5  0.48%  jacktol/whisper-large-v3-finetuned-for-ATC  Initial fine-tune
ct2_run6  0.40%  jacktol/whisper-large-v3-finetuned-for-ATC  +augmentation, weight decay
ct2_run7  0.24%  jacktol/whisper-large-v3-finetuned-for-ATC  Frozen encoder, +50 real recordings
ct2_run8  0.66%  openai/whisper-large-v3                     Full retrain from base, enhanced augmentation

ct2_run8 trades a higher benchmark WER for training from the original Whisper base, which generalises better to real-world ATC audio.

LLM (Legacy)

Run       Accuracy  Base        Key Change
llm_run3  98.1%     Qwen3-8B    QLoRA 4-bit, 871 examples
llm_run4  100%      Qwen3-1.7B  bf16 LoRA, 1,915 examples with ASR noise augmentation

Quick Start

ASR

from faster_whisper import WhisperModel

model = WhisperModel("./ASR", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)
text = " ".join(seg.text.strip() for seg in segments)

Download

# Full repo (ASR + LLM)
huggingface-cli download aether-raid/astra-atc-models --local-dir ./models

# ASR only (recommended)
huggingface-cli download aether-raid/astra-atc-models --include "ASR/*" --local-dir ./models

# LLM only (legacy)
huggingface-cli download aether-raid/astra-atc-models --include "LLM/*" --local-dir ./models