DeepSeek R1 dropped one year ago 🐳 and a lot has changed.
With @irenesolaiman, we’re launching a blog series about how that moment reshaped AI and open source in 2025, starting with strategic shifts and the explosion of new open models in China!
FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly in the browser, using TRL behind the scenes.
TeleChat3-36B-Thinking: ✨ Native support for the Ascend + MindSpore ecosystem ✨ Inspired by DeepSeek’s architecture design, bringing training stability and efficiency gains.
It includes GDPO, the latest variant of GRPO for multi-reward RL ✨ GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence — developed by @sliuau, @SimonX et al.
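The decoupling idea can be illustrated with a toy sketch. This is only a hedged illustration of "decoupled reward normalization", not GDPO's actual implementation: the function names, the group-standardization formula, and the simple summing of per-reward advantages are all assumptions. The contrast with GRPO-style normalization of a single summed reward is the point.

```python
from statistics import mean, pstdev

def grpo_advantages(reward_groups):
    # GRPO-style: sum the reward components per sample, then
    # normalize the *summed* reward across the group. A reward
    # with a large scale can dominate the others ("reward collapse").
    totals = [sum(r) for r in reward_groups]
    mu, sigma = mean(totals), pstdev(totals) or 1.0
    return [(t - mu) / sigma for t in totals]

def gdpo_advantages(reward_groups):
    # Decoupled normalization (the idea described above): normalize
    # each reward component separately across the group, then combine
    # the per-reward advantages, so no single reward's scale can
    # drown out the others.
    n_rewards = len(reward_groups[0])
    per_reward_adv = []
    for k in range(n_rewards):
        vals = [r[k] for r in reward_groups]
        mu, sigma = mean(vals), pstdev(vals) or 1.0
        per_reward_adv.append([(v - mu) / sigma for v in vals])
    return [sum(per_reward_adv[k][i] for k in range(n_rewards))
            for i in range(len(reward_groups))]

# Two reward signals with very different scales: correctness (0/1)
# and a length penalty in the hundreds.
group = [(1.0, -200.0), (0.0, -100.0), (1.0, -150.0)]
print(grpo_advantages(group))  # length penalty dominates the ranking
print(gdpo_advantages(group))  # both signals contribute
```

With joint normalization the incorrect-but-short sample gets the highest advantage; with decoupled normalization the correct, moderately long sample does.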
StepFun has been focused on multimodal AI from the very beginning. Their latest release is a new foundation model: STEP3-VL🔥 https://huggingface.co/collections/stepfun-ai/step3-vl-10b ✨ 10B - Apache 2.0 ✨ Leads in the 10B class and competes with models 10–20× larger
✨ Hybrid architecture: a combined autoregressive + diffusion design delivers strong semantic alignment with high-fidelity details ✨ Strong performance in long, dense, and multilingual text rendering ✨ MIT licensed (VQ tokenizer & ViT weights under Apache 2.0) ✨ Now live on Hugging Face Inference Providers 🤗
Recursive Language Models (RLMs) are a new inference interface for LLMs, with cool ideas by Alex Zhang!
⚠️ LLMs struggle with long prompts → attention overload & lost info 🔄 RLMs inspect, split & call themselves on chunks, then aggregate results ✅ Handles millions of tokens, reduces noise, improves reasoning 💡 System prompt guides recursion 🎯 RLM trajectories can be used for RL training or distillation (OpenEnv+TRL!!)
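The split-recurse-aggregate loop above can be sketched in a few lines. This is a hedged toy sketch, not the actual RLM implementation: the character-based halving, the function names, and the aggregation prompt are all assumptions made for illustration.

```python
def rlm_answer(prompt, llm, max_len=2000):
    # If the prompt fits the context budget, answer it directly.
    if len(prompt) <= max_len:
        return llm(prompt)
    # Otherwise split the prompt and recurse on each half
    # (a real RLM would split on semantic boundaries, not characters).
    mid = len(prompt) // 2
    left = rlm_answer(prompt[:mid], llm, max_len)
    right = rlm_answer(prompt[mid:], llm, max_len)
    # Aggregation is itself an LLM call over the partial answers.
    return llm(f"Combine these partial answers:\n{left}\n{right}")

# Demo with a stub "LLM" that just echoes a prefix of its input.
calls = []
def stub_llm(p):
    calls.append(p)
    return p[:10]

result = rlm_answer("x" * 5000, stub_llm, max_len=2000)
print(len(calls))  # number of LLM calls made by the recursion
```

With a 5000-character prompt and a 2000-character budget, the recursion bottoms out at four leaf chunks and makes three aggregation calls, so the stub is invoked seven times; each call only ever sees a bounded slice of the input, which is the property that lets RLMs scale to millions of tokens.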
AgentCPM-Explore🔥 an on-device agent foundation model released by OpenBMB openbmb/AgentCPM-Explore ✨ 4B - Apache 2.0 ✨ Supports 100+ multi-turn environment interactions with search + verification ✨ Full training/inference stack is openly shared as well