Jeffrey Quesnelle
PRO
emozilla
4,521 followers · 12 following
https://jeffq.com
theemozilla
jquesnelle
AI & ML interests
None yet
Recent Activity
authored a paper 9 days ago: Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
upvoted a paper 10 days ago: Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
submitted a paper 13 days ago: Targeted Neuron Modulation via Contrastive Pair Search
emozilla's datasets (53), sorted by recently updated
emozilla/Hermes-3-Preprocessed-Llama3-2samples • Viewer • Updated Jul 23, 2025 • 2 • 11
emozilla/Hermes-3-Preprocessed-Llama3-100samples • Viewer • Updated Jul 23, 2025 • 100 • 7 • 1
emozilla/Hermes-3-Preprocessed-Llama3 • Viewer • Updated Jul 23, 2025 • 91.1k • 13 • 1
emozilla/dolma-v1_7-30B-tokenized-llama2-nanoset • Updated Jul 9, 2024 • 133 • 1
emozilla/fineweb-10bt-tokenized-datatrove-llama2 • Updated Jul 8, 2024 • 255 • 3
emozilla/fineweb-350bt-tokenized-datatrove-llama2 • Updated Jul 7, 2024 • 331
emozilla/dolma-v1_7-305B-tokenized-llama2-nanoset • Updated Jun 5, 2024 • 296
emozilla/proofpile-test-tokenized-llama3 • Viewer • Updated Jun 5, 2024 • 46.3k • 76
emozilla/PaulGrahamEssays • Viewer • Updated Jun 1, 2024 • 49 • 35
emozilla/dolma-v1_7-cc_en_head • Viewer • Updated May 30, 2024 • 475M • 1.47k • 1
emozilla/dolma-v1_7-c4 • Viewer • Updated May 29, 2024 • 250M • 189 • 2
emozilla/dolma-v1_7-305B-tokenized-llama3-nanoset • Updated May 29, 2024 • 828 • 1
emozilla/dolma-v1_7-books • Viewer • Updated May 29, 2024 • 56k • 69 • 2
emozilla/dolma-v1_7-arxiv • Viewer • Updated May 29, 2024 • 1.55M • 229 • 3
emozilla/dolma-v1_7-algebraic-stack-train • Viewer • Updated May 29, 2024 • 2.83M • 125 • 1
emozilla/dolma-v1_7-30B • Viewer • Updated May 23, 2024 • 34.5M • 372 • 1
emozilla/dolma-v1_7-3B • Viewer • Updated May 23, 2024 • 3.4M • 899 • 1
emozilla/dolma-v1_7-3B-tokenized-llama3-nanoset • Updated May 23, 2024 • 29 • 1
emozilla/dolma-v1_7-30B-tokenized-llama3-nanoset • Updated May 20, 2024 • 118 • 1
emozilla/dolma-v1_7-305B • Viewer • Updated May 13, 2024 • 343M • 724 • 11
emozilla/c4-validation.00000-of-00008 • Viewer • Updated Apr 11, 2024 • 45.6k • 5
emozilla/hermes2-tokenized-llama-alpaca • Viewer • Updated Mar 13, 2024 • 1M • 5
emozilla/yarn-train-tokenized-8k-mistral • Viewer • Updated Jan 6, 2024 • 417k • 15 • 2
emozilla/story-summary-training-mistral-9k-1_4_24 • Viewer • Updated Jan 4, 2024 • 751 • 6 • 4
emozilla/yarn-train-tokenized-8k-llama • Viewer • Updated Nov 16, 2023 • 213k • 1.25k • 1
emozilla/yarn-train-tokenized-32k-mistral • Viewer • Updated Oct 21, 2023 • 104k • 143 • 3
emozilla/yarn-train-tokenized-16k-mistral • Viewer • Updated Oct 11, 2023 • 208k • 1.1k • 14
emozilla/pg19 • Viewer • Updated Oct 9, 2023 • 13.8k • 20.1k • 18
emozilla/Long-Data-Collections-Fine-Tune • Viewer • Updated Oct 9, 2023 • 98.6k • 1.06k • 4
emozilla/Long-Data-Collections-Pretrain-Without-Books • Viewer • Updated Oct 9, 2023 • 9.38M • 593 • 2