DenseOn & LateOn Collection A collection of open state-of-the-art single and multi-vector models • 7 items • Updated 22 days ago • 9
Article DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models lightonai • 22 days ago • 38
Article Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers tomaarsen • 28 days ago • 70
Gemma 4 Collection Gemma 4 is Google's new model family, including E2B, E4B, 26B-A4B, and 31B. • 28 items • Updated 21 days ago • 179
ByT5: Towards a token-free future with pre-trained byte-to-byte models Paper • 2105.13626 • Published May 28, 2021 • 5
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated Feb 26 • 96
LateOn-Code 💻 Collection State-of-the-art late interaction code retrieval models • 6 items • Updated Apr 7 • 19
Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling lightonai • Feb 12 • 56
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family lightonai • Jan 19 • 93
Article Sentence Transformers is joining Hugging Face! tomaarsen • Oct 22, 2025 • 88
PyLate Collection State-of-the-art late interaction models trained using PyLate • 7 items • Updated 1 day ago • 5