Turbo Pascal

TurboPascal

AI & ML interests

None yet

Recent Activity

liked a dataset 6 days ago

MaziyarPanahi/Nemotron-Cascade-2-SFT-Data-Small

liked a dataset 6 days ago

nvidia/Nemotron-Cascade-2-SFT-Data

upvoted a paper about 23 hours ago

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Paper • 2512.12967 • Published Dec 15, 2025 • 111

liked 4 datasets 6 days ago

upvoted a paper 8 days ago

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 119

upvoted a paper 9 days ago

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 110

liked a model 14 days ago

google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21, 2025 • 1.05M • 1.94k

New activity in Alibaba-NLP/new-impl about 1 month ago

torch.AcceleratorError: CUDA error: device-side assert triggered

#14 opened about 1 month ago by TurboPascal

liked a model 6 months ago

HuggingFaceTB/SmolVLM-256M-Instruct

Image-Text-to-Text • 0.3B • Updated Apr 8, 2025 • 317k • 345

upvoted an article 6 months ago

Training and Finetuning Reranker Models with Sentence Transformers v4

Article • Mar 26, 2025 • 188

liked a model 7 months ago

ByteDance-Seed/Seed-OSS-36B-Instruct

Text Generation • Updated Aug 26, 2025 • 22.9k • 493

upvoted a collection 7 months ago

BGE

Collection • 31 items • Updated Feb 4 • 152

liked a dataset 7 months ago

HuggingFaceTB/smoltalk2

Viewer • Updated Oct 31, 2025 • 8.61M • 9.69k • 147

liked 2 models 8 months ago

Alibaba-NLP/WebDancer-32B

Text Generation • Updated Jun 26, 2025 • 18 • 57

zai-org/GLM-4.5V

Image-Text-to-Text • 108B • Updated Oct 25, 2025 • 46.7k • 711

upvoted a paper 9 months ago

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Paper • 2507.01352 • Published Jul 2, 2025 • 60

liked a Space 9 months ago

The Ultra-Scale Playbook

🌌 • 3.76k • The ultimate guide to training LLM on large GPU Clusters

liked 2 models 10 months ago

Qwen/Qwen3-Reranker-0.6B

Text Ranking • 0.6B • Updated Jun 9, 2025 • 909k • 326

Qwen/Qwen3-Embedding-0.6B

Feature Extraction • 0.6B • Updated Jun 20, 2025 • 5.65M • 953