⚡ FlashHead benchmarks for Llama 3.2, Gemma 3, and Qwen3 are now on embedl/Edge-Inference-Benchmarks! These are some of the models used in the FlashHead paper - now easier to explore and compare interactively.
⚡ FlashHead: Fast LM Head Inference - Now a Simple vLLM Plugin. flash-head replaces the dense LM head with a two-stage retrieval pipeline - up to 2x inference speedup, training-free. Previously this required custom Docker images; now it's just:
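A sketch of what the plugin-based workflow might look like. The package name on PyPI and the auto-registration behavior are assumptions based on vLLM's general plugin mechanism (plugins registered via entry points load automatically); check the flash-head repository for the exact commands:

```shell
# Hypothetical: install the plugin into the same environment as vLLM
pip install flash-head

# Hypothetical: vLLM discovers the plugin via its entry point, so serving
# a supported model (e.g. one from the benchmarks) needs no extra flags
vllm serve meta-llama/Llama-3.2-1B-Instruct
```

Compared with the previous custom-Docker-image setup, a plugin keeps your existing vLLM install and deployment scripts unchanged.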