code2lora/code2lora-direct

	---
	license: mit
	tags: [code, lora, hypernetwork, peft]
	---

	# Code2LoRA — direct-projection hypernetwork

	Final checkpoint of the direct-projection Code2LoRA hypernetwork used in
	the paper. Maps a repository-level embedding into a rank-16 LoRA adapter for
	`Qwen/Qwen2.5-Coder-1.5B` in a single forward pass.
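To give a sense of how many values a single forward pass has to produce, the sketch below counts the parameters of a rank-16 LoRA adapter. The target modules (`q_proj`, `v_proj`), their shapes, and the layer count are illustrative assumptions, not taken from this card; check the base model's `config.json` and the trainer code for the real layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoRATarget:
    """Shape of one linear layer that receives a LoRA adapter."""
    name: str
    d_in: int
    d_out: int


def lora_param_count(target: LoRATarget, rank: int) -> int:
    # LoRA factorises the weight update as B @ A,
    # with A of shape (rank, d_in) and B of shape (d_out, rank).
    return rank * (target.d_in + target.d_out)


def adapter_param_count(targets, num_layers: int, rank: int) -> int:
    # Total number of values the hypernetwork must emit for one adapter.
    return num_layers * sum(lora_param_count(t, rank) for t in targets)


# Hypothetical layout, for illustration only.
HIDDEN = 1536
targets = [
    LoRATarget("q_proj", HIDDEN, HIDDEN),
    LoRATarget("v_proj", HIDDEN, 256),
]
total = adapter_param_count(targets, num_layers=28, rank=16)
print(total)  # 2179072 under the assumed shapes
```

Under these assumed shapes the adapter has roughly 2.2M values, which is the size of the output a direct projection would need to cover.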

	## Files

	| File | Description |
	|---|---|
	| `code2lora_direct.pt` | Trained `Code2LoRAHead` weights (~2.7 GB, fp32). Load with `torch.load("code2lora_direct.pt", map_location="cpu")`. |
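A minimal loading sketch follows. It assumes the file holds a plain state dict of tensors, which this card does not state; the helper names `load_head_state` and `approx_param_count` are hypothetical. The second helper turns the listed file size into a rough parameter count, using the fact that fp32 stores each value in 4 bytes.

```python
def load_head_state(path: str = "code2lora_direct.pt"):
    """Load the checkpoint onto the CPU.

    Assumption: the file is a plain state dict of tensors. If torch.load
    rejects it under weights_only=True, the file stores other Python
    objects and the loading call needs to be adjusted.
    """
    import torch  # imported lazily so the size helper works without PyTorch

    return torch.load(path, map_location="cpu", weights_only=True)


def approx_param_count(size_bytes: float, bytes_per_param: int = 4) -> int:
    # fp32 uses 4 bytes per parameter, so size / 4 approximates the count
    # (ignoring the small serialization overhead).
    return int(size_bytes // bytes_per_param)


print(approx_param_count(2.7e9))  # 675000000
```

The state dict would then be passed to `load_state_dict` on a `Code2LoRAHead` instance built from the trainer code.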

	## Training recipe

	* 3 epochs on the `code2lora/code2lora-data-snapshots` dataset.
	* AdamW + cosine schedule, max-seq-len 8192, bf16, single H100 80 GB.
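For reference, a cosine learning-rate schedule can be sketched in plain Python as below. The warmup option, the floor `min_lr`, and every numeric value are assumptions for illustration; the card only states that AdamW was paired with a cosine schedule.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float,
              warmup_steps: int = 0, min_lr: float = 0.0) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical values: 1000 steps, peak learning rate 1e-4, no warmup.
for s in (0, 500, 1000):
    print(s, cosine_lr(s, total_steps=1000, base_lr=1e-4))
```

In PyTorch the same shape is commonly obtained with `torch.optim.AdamW` and `torch.optim.lr_scheduler.CosineAnnealingLR`; whether the trainer uses those exact classes is not stated here.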
	* See [`code2lora/code2lora`](https://github.com/) for the trainer code.

	## Companion model

	`code2lora/code2lora-gru` -- the streaming-recurrent variant trained on
	commit deltas.