smcleish/Qwen3-Embedding-0.6B-Qwen3-4B-Instruct-2507-cs16-summary_mean-bst1024-attn-mlp-ov256 Updated about 6 hours ago
smcleish/Qwen3-Embedding-0.6B-Qwen3-4B-Instruct-2507-cs16-summary_mean-bst1024-attn-mlp-ov256-chunksize-8 Updated about 17 hours ago
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-300-chkpt-step-100 Text Generation • 2B • Updated 2 days ago • 13
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-300-chkpt-step-200 Text Generation • 2B • Updated 2 days ago • 14
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-300-chkpt-step-300 Text Generation • 2B • Updated 2 days ago • 11
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-300-chkpt-step-400 Text Generation • 2B • Updated 2 days ago • 9
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-400-chkpt-step-100 Text Generation • 2B • Updated 2 days ago • 14
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-400-chkpt-step-200 Text Generation • 2B • Updated 2 days ago • 10
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-8k-400-chkpt-step-300 Text Generation • 2B • Updated 2 days ago • 10
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-8k-400-chkpt-16k-400-chkpt-step-200 Text Generation • 2B • Updated 3 days ago • 16
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-400-chkpt-16k-200-chkpt-step-200 Text Generation • 2B • Updated 3 days ago • 14
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-400-chkpt-16k-400-chkpt-step-400 Text Generation • 2B • Updated 3 days ago • 17
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-400-chkpt-16k-400-chkpt-step-200 Text Generation • 2B • Updated 3 days ago • 11
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-step-400 Text Generation • 2B • Updated 4 days ago • 17
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-step-500 Text Generation • 2B • Updated 4 days ago • 14
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-step-300 Text Generation • 2B • Updated 4 days ago • 14
smcleish/deepscaler-1.5b-8k-reproduce-first-run-with-shuffle-step-200 Text Generation • 2B • Updated 4 days ago • 11
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-8k-500-chkpt-step-400 Text Generation • 2B • Updated 5 days ago • 16
smcleish/Qwen3-Embedding-0.6B-Qwen3-4B-Instruct-2507-cs16-summary_mean-bst1024-attn Updated 5 days ago
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-8k-400-chkpt-step-400 Text Generation • 2B • Updated 5 days ago • 15
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-400-chkpt-step-400 Text Generation • 2B • Updated 5 days ago • 12
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-500-chkpt-step-400 Text Generation • 2B • Updated 5 days ago • 13
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-8k-400-chkpt-step-200 Text Generation • 2B • Updated 7 days ago • 9
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-8k-500-chkpt-step-200 Text Generation • 2B • Updated 7 days ago • 10
smcleish/deepscaler-1.5b-8k-hard-first-run-with-shuffle-step500 Text Generation • 2B • Updated 7 days ago • 10
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-step500 Text Generation • 2B • Updated 7 days ago • 11
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-400-chkpt-step-200 Text Generation • 2B • Updated 10 days ago • 13
smcleish/deepscaler-1.5b-8k-easy-first-run-with-shuffle-8k-500-chkpt-step-200 Text Generation • 2B • Updated 10 days ago • 16
smcleish/Qwen3-Embedding-0.6B-Qwen3-4B-Instruct-2507-cs16-summary_mean-bst1024-lr-1e5 Updated 11 days ago
smcleish/Qwen3-Embedding-0.6B-Qwen3-4B-Instruct-2507-cs16-summary_mean-bst1024-lr-3e6 Updated 12 days ago