khtsly

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 8 days ago

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Paper • 2605.23901 • Published 12 days ago • 13

upvoted 2 papers 10 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 22 days ago • 195

HRM-Text: Efficient Pretraining Beyond Scaling

Paper • 2605.20613 • Published 14 days ago • 90

upvoted a paper 11 days ago

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Paper • 2605.22791 • Published 13 days ago • 30

upvoted a paper 12 days ago

Generative Recursive Reasoning

Paper • 2605.19376 • Published 14 days ago • 29

liked a model about 1 month ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated Apr 26 • 174 • 2

published a model about 1 month ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated Apr 26 • 174 • 2

updated a model about 1 month ago

khtsly/luau-coder-preview-28B-A3B-noft

Text Generation • 28B • Updated Apr 26 • 174 • 2

updated a dataset about 1 month ago

khtsly/roblox_docs_corpus_text

Viewer • Updated Apr 23 • 1.55k • 27 • 1

New activity in Jackrong/Qwopus-GLM-18B-Merged-GGUF about 1 month ago

merging problem

👀 1

#5 opened about 1 month ago by khtsly

New activity in google/gemma-4-31B-it about 1 month ago

Can anyone improve the model using the Rys methodology—by duplicating a block of layers?

#60 opened about 2 months ago by Regrin

updated a dataset about 2 months ago

khtsly/luau-repo-docs-text

Viewer • Updated Apr 16 • 1.64k • 19 • 1

New activity in Kassadin88/GLM-5.1-1000000x about 2 months ago

â character

#3 opened about 2 months ago by khtsly

upvoted a paper about 2 months ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

updated 2 datasets about 2 months ago

khtsly/devforum-roblox-text

Viewer • Updated Apr 13 • 171k • 109 • 1

khtsly/luau-stack-hq

Viewer • Updated Apr 13 • 25.2k • 97 • 2

updated a model about 2 months ago

khtsly/luau-coder-1.5-preview-tokenizer

Updated Apr 9

published a model about 2 months ago

khtsly/luau-coder-1.5-preview-tokenizer

Updated Apr 9

New activity in omarkamali/wikipedia-monthly about 2 months ago

Hashtag (category)

#6 opened about 2 months ago by khtsly

liked a dataset 2 months ago

khtsly/devforum-roblox-text

Viewer • Updated Apr 13 • 171k • 109 • 1
