Dominick Wirzba (Chronuid)
Recent Activity
Reacted to sergiopaniego's post with 🔥 4 days ago
Qwen3.5 dense (smol 🤏) models just dropped
- natively multimodal
- 0.8B · 2B · 4B · 9B (+ base variants)
- 262K context extensible to 1M
- built-in thinking
fine-tune them with TRL out of the box → SFT, GRPO, DPO and more!
examples: https://huggingface.co/docs/trl/example_overview
collection: https://huggingface.co/collections/Qwen/qwen35
Reacted to sergiopaniego's post with 🔥 4 days ago
did you know you can train agentic models with RL deploying the environments on HF Spaces? 🤗
with TRL + OpenEnv, your training script connects to remote environments hosted as Spaces
want to train faster? → just add more Spaces (TRL handles the parallelization natively)
we used this to train a model to solve the trolley problem in CARLA. 2 HF Spaces running a full driving simulator, each on a T4 GPU
full write-up with code and results → https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl
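The "add more Spaces to train faster" idea in the post above can be illustrated with a small, library-free sketch. This is not TRL's or OpenEnv's actual code: `FakeRemoteEnv`, `collect_rollouts`, the placeholder URLs, and the toy reward are all hypothetical names invented here, and the assumption is simply that rollouts are spread round-robin over several remote environments and run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


class FakeRemoteEnv:
    """Hypothetical stand-in for an environment hosted at a remote URL."""

    def __init__(self, url):
        self.url = url

    def rollout(self, prompt):
        # A real client would send the prompt over the network; here the
        # "reward" is just the prompt length so the sketch runs offline.
        return {"env": self.url, "prompt": prompt, "reward": float(len(prompt))}


def collect_rollouts(envs, prompts):
    """Pair prompts with environments round-robin and run them in threads."""
    assignments = list(zip(cycle(envs), prompts))
    # One worker thread per environment; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=len(envs)) as pool:
        return list(pool.map(lambda pair: pair[0].rollout(pair[1]), assignments))


# Placeholder URLs (assumption): two environments, as in the post's CARLA setup.
envs = [
    FakeRemoteEnv("https://example.invalid/space-a"),
    FakeRemoteEnv("https://example.invalid/space-b"),
]
results = collect_rollouts(envs, ["p1", "p22", "p333", "p4444"])
```

Under these assumptions, adding a third entry to `envs` raises the number of rollouts that can be in flight at once without changing the calling code, which is the scaling behaviour the post describes.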