LDARNet-2M

Pretrained LDARNet (~2M params) with learnable DNA tokenization (dynamic chunking + BiMamba-2).

Files

  • model_ckpt_2m.pt — MLM checkpoint with embedded LDarConfig

Download

Clone the code repo and install dependencies, then download the weights:

huggingface-cli download darlednik/LDARNet-2M model_ckpt_2m.pt --local-dir models_ckpts
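
The same download can also be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed; `download_checkpoint` is a hypothetical helper name, not part of the LDARNet code base. The repo id, filename, and target directory are taken from the command above.

```python
REPO_ID = "darlednik/LDARNet-2M"
FILENAME = "model_ckpt_2m.pt"


def download_checkpoint(local_dir: str = "models_ckpts") -> str:
    """Fetch the checkpoint into local_dir and return its local path."""
    # Imported lazily so this module can be loaded without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)


if __name__ == "__main__":
    # Requires network access to the Hugging Face Hub.
    print(download_checkpoint())
```

The returned path can then be passed to `load_ldar_from_ckpt` as shown in the Load section.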

Load

import torch
from ldar.utils.ckpt import load_ldar_from_ckpt

model, cfg = load_ldar_from_ckpt(
    "models_ckpts/model_ckpt_2m.pt",
    device="cuda",
    dtype=torch.bfloat16,
)

Architecture

| Component | Layout | d_model |
|-----------|--------|---------|
| Encoder | m3t1: 3× BiMamba-2 + 1 local-attention layer | 64 |
| Backbone | M6: 6× BiMamba-2 (+ SwiGLU) | 128 |
| Decoder | m3: 3× BiMamba-2 | 64 |

  • Compression ratio N = 4
  • Byte vocabulary: {A, C, G, T, N, [MASK], <pad>}
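
To make the two bullets above concrete, here is a minimal sketch of byte-level encoding over the seven-symbol vocabulary, plus a rough estimate of how many chunks reach the backbone. The id assignment (list order) is hypothetical, since the real mapping is defined by the code repo and the checkpoint's config. The estimate also assumes that the compression ratio N is the average number of bases per chunk; dynamic chunking is learned, so actual chunk counts will vary by sequence. `encode` and `expected_chunks` are illustrative names only.

```python
from typing import List, Optional

# Hypothetical id order; check the checkpoint's config for the real mapping.
VOCAB = ["A", "C", "G", "T", "N", "[MASK]", "<pad>"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def encode(seq: str, length: Optional[int] = None) -> List[int]:
    """Map each base to an id; unknown characters fall back to N.

    If length is given, truncate or right-pad with <pad> to that length.
    """
    ids = [TOKEN_TO_ID.get(base, TOKEN_TO_ID["N"]) for base in seq.upper()]
    if length is not None:
        pad = TOKEN_TO_ID["<pad>"]
        ids = ids[:length] + [pad] * max(0, length - len(ids))
    return ids


def expected_chunks(num_bases: int, ratio: int = 4) -> int:
    """Approximate backbone length, assuming ratio = mean bases per chunk."""
    return -(-num_bases // ratio)  # ceiling division


print(encode("ACGTN"))        # [0, 1, 2, 3, 4]
print(encode("acgx", 6))      # [0, 1, 2, 4, 6, 6]
print(expected_chunks(1024))  # 256
```

Under these assumptions, a 1,024-base input would be processed by the backbone as roughly 256 chunks.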

Citation

@misc{ledneva2026ldarnetdnaadaptiverepresentation,
      title={LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling},
      author={Daria Ledneva and Denis Kuznetsov},
      year={2026},
      eprint={2606.04552},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2606.04552},
}