🤗 Currently Training

Banaxi Inc. PRO

Banaxi-Tech

AI & ML interests

SLMs, training from scratch, LoRA, TTS, Ternary models

Recent Activity

New activity in AxiomicLabs/Open_SLM_Leaderboard about 3 hours ago

Add https://huggingface.co/BananaMind/MiniBananaMind-v3-9M

#27 opened about 3 hours ago by

updated a model about 4 hours ago

BananaMind/MiniBananaMind-v3-9M

Text Generation • 8.88M • Updated about 4 hours ago • 2

published a model about 4 hours ago

BananaMind/MiniBananaMind-v3-9M

Text Generation • 8.88M • Updated about 4 hours ago • 2

replied to their post about 6 hours ago

It's the Llama architecture.

replied to their post about 6 hours ago

So the model you are training is "LlamaForCausalLM", or your own custom architecture?

Yes

replied to their post about 7 hours ago

Yeah, definitely not. Excited to see how it does when it's fully trained!

Would be great to add winogrande to the leaderboard!

replied to their post about 7 hours ago

Right now it would be #12 on the <100M leaderboard.

replied to their post about 7 hours ago

OK, then that's not terrible for my 75M model.

replied to their post about 7 hours ago

Yes, please

replied to their post about 7 hours ago

Benchmarks have improved at 210K steps:

|    Tasks    |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.2073|±  |0.0118|
|             |       |none  |     0|acc_norm|↑  |0.2398|±  |0.0125|
|arc_easy     |      1|none  |     0|acc     |↑  |0.4928|±  |0.0103|
|             |       |none  |     0|acc_norm|↑  |0.4259|±  |0.0101|
|hellaswag    |      1|none  |     0|acc     |↑  |0.2892|±  |0.0045|
|             |       |none  |     0|acc_norm|↑  |0.3077|±  |0.0046|
|piqa         |      1|none  |     0|acc     |↑  |0.6219|±  |0.0113|
|             |       |none  |     0|acc_norm|↑  |0.6028|±  |0.0114|
|winogrande   |      1|none  |     0|acc     |↑  |0.4949|±  |0.0141|

=== ArithMark-2.0 checkpoint-210000 ===
Average: 0.2588 (647/2500)
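
One way to read the change from checkpoint-150000 to checkpoint-210000 is to express each accuracy delta in units of the reported standard error. The sketch below copies the `acc` values and stderrs from the two replies in this thread. It treats the two evaluations as independent when combining their errors, which is a simplifying assumption (both checkpoints are scored on the same items), and the helper name `delta_in_stderr` is made up for illustration.

```python
import math

# task: (acc @150K, acc @210K, stderr @150K, stderr @210K), copied from the replies
results = {
    "arc_challenge": (0.2073, 0.2073, 0.0118, 0.0118),
    "arc_easy":      (0.4882, 0.4928, 0.0103, 0.0103),
    "hellaswag":     (0.2880, 0.2892, 0.0045, 0.0045),
    "piqa":          (0.6153, 0.6219, 0.0114, 0.0113),
    "winogrande":    (0.5107, 0.4949, 0.0140, 0.0141),
}

def delta_in_stderr(old, new, se_old, se_new):
    """Accuracy change divided by the combined standard error.

    Assumption: the two runs are treated as independent, so the
    errors are combined in quadrature.
    """
    combined = math.sqrt(se_old ** 2 + se_new ** 2)
    return (new - old) / combined

z_scores = {task: delta_in_stderr(*vals) for task, vals in results.items()}

for task, (old, new, _, _) in results.items():
    print(f"{task:14s} delta={new - old:+.4f}  z={z_scores[task]:+.2f}")
```

Under that assumption, every task moves by less than one combined standard error, so the per-task changes are hard to separate from evaluation noise on their own.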

replied to their post about 7 hours ago

With a MIN_LR of 3e-5 and LR=3e-4
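
The post gives only the two endpoints, LR=3e-4 and MIN_LR=3e-5, and does not say how the rate moves between them. As a minimal sketch, assuming a cosine decay with optional linear warmup (a common choice, not confirmed by the post), the schedule could look like this. `lr_at`, `total_steps`, and `warmup_steps` are hypothetical names introduced for illustration.

```python
import math

LR = 3e-4      # peak learning rate, from the post
MIN_LR = 3e-5  # learning-rate floor, from the post

def lr_at(step, total_steps, warmup_steps=0):
    """Learning rate at `step` under an ASSUMED warmup + cosine decay schedule."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear warmup up to the peak rate.
        return LR * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, progress))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (LR - MIN_LR) * cosine

# Example with a hypothetical run length of 1000 steps.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{lr_at(step, total_steps=1000):.2e}")
```

With these endpoints the floor is one tenth of the peak, so the rate never drops below 10% of its starting value, however long training runs.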

New activity in LH-Tech-AI/Apex-1.6-Instruct-350M about 7 hours ago

Model idea

#2 opened about 7 hours ago by

updated a model about 10 hours ago

BananaMind/BananaMind-TTS-Neon

Updated about 10 hours ago

New activity in Banaxi-Larp/Free-space about 13 hours ago

🚩 Report: Illegal or restricted content

#1 opened about 13 hours ago by

posted an update about 13 hours ago

Give us a follow at BananaMind. We are currently training two new models, so do not miss them when they release.

Here is what we are training:

BananaMind 1.5 Base

Our flagship 75M-parameter model.
4096-token context window.

MiniBananaMind V3

Our 9M-parameter model, targeting SOTA results for its size.

These will be released in the next few days.

New activity in Banaxi-Larp/supralarps-50m-base about 14 hours ago

🚩 Report: Illegal or restricted content

#6 opened about 14 hours ago by

replied to their post about 16 hours ago

|    Tasks    |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.2073|±  |0.0118|
|             |       |none  |     0|acc_norm|↑  |0.2381|±  |0.0124|
|arc_easy     |      1|none  |     0|acc     |↑  |0.4882|±  |0.0103|
|             |       |none  |     0|acc_norm|↑  |0.4234|±  |0.0101|
|hellaswag    |      1|none  |     0|acc     |↑  |0.2880|±  |0.0045|
|             |       |none  |     0|acc_norm|↑  |0.3038|±  |0.0046|
|piqa         |      1|none  |     0|acc     |↑  |0.6153|±  |0.0114|
|             |       |none  |     0|acc_norm|↑  |0.5968|±  |0.0114|
|winogrande   |      1|none  |     0|acc     |↑  |0.5107|±  |0.0140|

=== ArithMark-2.0 checkpoint-150000 ===
Average: 0.2540  (635/2500)
Random chance: 0.2500
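
To judge how far these ArithMark averages sit from the stated 0.25 random-chance level, a rough check is a z-score against the binomial standard error at chance. The sketch below uses the counts reported in this thread (635/2500 at checkpoint-150000 and 647/2500 at checkpoint-210000). It relies on a normal approximation, and the function name `z_vs_chance` is introduced here for illustration only.

```python
import math

def z_vs_chance(correct, total, chance=0.25):
    """Return (accuracy, z-score) relative to a random-guess baseline.

    Uses the binomial standard error under the chance hypothesis and a
    normal approximation, which is a simplification.
    """
    acc = correct / total
    se = math.sqrt(chance * (1.0 - chance) / total)
    return acc, (acc - chance) / se

# Counts as reported in the thread.
reported = {"checkpoint-150000": 635, "checkpoint-210000": 647}

for name, correct in reported.items():
    acc, z = z_vs_chance(correct, 2500)
    print(f"{name}: acc={acc:.4f}  z={z:+.2f}")
```

Both z-scores come out below the conventional 1.96 threshold, so on this approximation neither average is clearly distinguishable from guessing.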

By difficulty:
easy: 0.2528  (316/1250)
hard: 0.2320  (116/500)
medium: 0.2707  (203/750)

By operator_count:
1: 0.2528  (316/1250)
2: 0.2707  (203/750)
3: 0.2320  (116/500)

By topic:
addition: 0.1543  (83/538)
division: 0.5000  (65/130)
mixed_three_ops: 0.2479  (60/242)
mixed_two_ops: 0.2456  (97/395)
multiplication: 0.4375  (63/144)
parentheses_three_ops: 0.2171  (56/258)
parentheses_two_ops: 0.2986  (106/355)
subtraction: 0.2397  (105/438)
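
As a sanity check on the breakdown above, the per-topic counts can be summed and compared with the reported overall average. This sketch copies the (correct, total) pairs from the "By topic" list. The 0.25 chance level is the one stated in the same reply, and the variable names are illustrative.

```python
# (correct, total) per topic, copied from the checkpoint-150000 breakdown
by_topic = {
    "addition": (83, 538),
    "division": (65, 130),
    "mixed_three_ops": (60, 242),
    "mixed_two_ops": (97, 395),
    "multiplication": (63, 144),
    "parentheses_three_ops": (56, 258),
    "parentheses_two_ops": (106, 355),
    "subtraction": (105, 438),
}

CHANCE = 0.25  # random-chance accuracy stated in the reply

total_correct = sum(c for c, _ in by_topic.values())
total_items = sum(n for _, n in by_topic.values())
print(f"overall: {total_correct}/{total_items} = {total_correct / total_items:.4f}")

# Topics sorted by how far their accuracy sits from chance.
gaps = {topic: c / n - CHANCE for topic, (c, n) in by_topic.items()}
for topic, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{topic:24s} {gap:+.4f}")
```

The topic counts add up to the reported 635/2500. Division and multiplication are the only topics well above chance, while addition, the largest topic, sits below it.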

replied to their post about 16 hours ago

I'm at 150K steps and the benchmarks aren't that good. Did this also happen to you with Supra 50M Base?

New activity in BananaMind/BananaMind-Content-Safety-Mini-1.5 1 day ago

Just asking...

#1 opened 1 day ago by

New activity in AxiomicLabs/Open_SLM_Leaderboard 1 day ago

Which measure?

#26 opened 1 day ago by