ComVo
Collection: Official Hugging Face collection for ComVo from "Toward Complex-Valued Neural Networks for Waveform Generation" (ICLR 2026).
ComVo is a complex-valued neural vocoder for waveform generation based on iSTFT.
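As a rough illustration of what "based on iSTFT" means in general: the waveform is obtained by applying an inverse short-time Fourier transform to complex spectrogram frames. The sketch below is a minimal, standard-library-only round trip (STFT, then iSTFT with overlap-add). It is not ComVo's implementation; the function names `stft` and `istft`, the Hann window, and the frame and hop sizes are all assumptions chosen for illustration.

```python
import cmath
import math


def hann(n_fft):
    # Periodic Hann window (illustrative choice of window).
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def dft(frame):
    # Naive O(N^2) discrete Fourier transform of one frame.
    n_fft = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(n_fft))
        for k in range(n_fft)
    ]


def idft(spec):
    # Naive inverse DFT; the input frames were real, so keep the real part.
    n_fft = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * n / n_fft) for k in range(n_fft)) / n_fft).real
        for n in range(n_fft)
    ]


def stft(signal, n_fft, hop):
    # Windowed frames -> list of complex spectra (one per frame).
    win = hann(n_fft)
    return [
        dft([signal[start + n] * win[n] for n in range(n_fft)])
        for start in range(0, len(signal) - n_fft + 1, hop)
    ]


def istft(frames, n_fft, hop):
    # Inverse DFT per frame, then windowed overlap-add with window-energy normalization.
    win = hann(n_fft)
    length = hop * (len(frames) - 1) + n_fft
    out = [0.0] * length
    norm = [0.0] * length
    for i, spec in enumerate(frames):
        frame = idft(spec)
        for n in range(n_fft):
            out[i * hop + n] += frame[n] * win[n]
            norm[i * hop + n] += win[n] ** 2
    # Samples where the summed window energy is zero cannot be recovered.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


# Toy round trip: a short sine wave survives STFT -> iSTFT.
signal = [math.sin(0.3 * t) for t in range(32)]
recon = istft(stft(signal, n_fft=8, hop=4), n_fft=8, hop=4)
```

In an iSTFT-based vocoder, the complex frames passed to the inverse transform would be predicted by the network rather than computed from a reference signal as in this toy example.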
Unlike conventional real-valued vocoders that process real and imaginary parts separately, ComVo operates directly in the complex domain using native complex arithmetic.
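To make the phrase "native complex arithmetic" concrete, here is a toy sketch of a single complex multiplication next to the same product written out with real numbers only. It is not taken from the ComVo code; the function names `complex_scale` and `split_scale` are hypothetical and chosen for illustration.

```python
def complex_scale(weight: complex, x: complex) -> complex:
    # Native complex arithmetic: one multiply couples the real and imaginary parts.
    return weight * x


def split_scale(w_re: float, w_im: float, x_re: float, x_im: float):
    # The same product expressed with real numbers only.
    # Both outputs depend on both inputs through the cross terms.
    y_re = w_re * x_re - w_im * x_im
    y_im = w_re * x_im + w_im * x_re
    return y_re, y_im


w = complex(0.5, 2.0)
x = complex(1.0, -1.0)
y = complex_scale(w, x)
y_re, y_im = split_scale(w.real, w.imag, x.real, x.imag)
```

Both forms give the same result. The difference is that a complex-valued layer has this coupling built into its arithmetic, whereas a real-valued layer that treats the two parts as separate channels is not constrained to this structure.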
This enables:
The model also introduces:
Toward Complex-Valued Neural Networks for Waveform Generation
Hyung-Seok Oh, Deok-Hyeon Cho, Seung-Bin Kim, Seong-Whan Lee
ICLR 2026
https://openreview.net/forum?id=U4GXPqm3Va
This model is designed for:
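from hf_model import ComVoHF

# Load the pretrained base model and switch to inference mode.
model = ComVoHF.from_pretrained("hsoh/ComVo-base")
model.eval()

# Option 1: resynthesize audio directly from an input waveform `wav`.
audio = model.from_waveform(wav)

# Option 2: extract features from `wav` explicitly, then synthesize from them.
features = model.build_feature_extractor()(wav)
audio = model(features)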
| Model | Parameters | Sampling rate |
|---|---|---|
| Base | 13.28M | 24 kHz |
| Large | 114.56M | 24 kHz |
| Model | UTMOS ↑ | PESQ (wb) ↑ | PESQ (nb) ↑ | MRSTFT ↓ |
|---|---|---|---|---|
| Base | 3.6744 | 3.8219 | 4.0727 | 0.8580 |
| Large | 3.7618 | 3.9993 | 4.1639 | 0.8227 |
Paper: https://openreview.net/forum?id=U4GXPqm3Va
Demo: https://hs-oh-prml.github.io/ComVo/
Code: https://github.com/hs-oh-prml/ComVo
@inproceedings{
oh2026toward,
title={Toward Complex-Valued Neural Networks for Waveform Generation},
author={Hyung-Seok Oh and Deok-Hyeon Cho and Seung-Bin Kim and Seong-Whan Lee},
booktitle={ICLR},
year={2026}
}