SpiceeChat

FirstName Gender Classifier (30M)

Lightweight, fast, and accurate, because guessing isn't a strategy.


Overview

This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture, originally built by PhysiQuanty. It was trained on a combination of:

  • 150,000 samples from the SpiceeChat/Genre-Classifier dataset
  • 922 hand-curated examples to improve coverage and diversity

The result is a compact, production-ready classifier that predicts gender from a first name with about 85% validation accuracy and little runtime overhead.


Quick Start

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True   # custom architecture, audited and safe
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")
pred, probs = model.predict_gender(inputs.input_ids)
gender = "M" if pred.item() == 1 else "F"
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")

Expected output:

Arjun → M (confidence: 0.98)
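
Since about one prediction in seven is wrong at this accuracy level, it can help to abstain when the model is unsure. The sketch below is one way to do that. It assumes the probabilities arrive as a plain list ordered [P(F), P(M)], which matches the `pred == 1` means "M" convention above but is not confirmed by the model code. The helper name `label_with_threshold` and the 0.75 cutoff are hypothetical choices for illustration.

```python
# Assumed class order: index 0 = "F", index 1 = "M" (inferred from the
# Quick Start snippet, where pred == 1 is printed as "M").
LABELS = ("F", "M")


def label_with_threshold(probs, threshold=0.75):
    """Return (label, confidence); label is "unknown" below the threshold.

    `probs` is a list of two floats. `threshold` is an illustrative
    value, not one recommended by the model authors.
    """
    if len(probs) != len(LABELS):
        raise ValueError("expected one probability per label")
    best = max(range(len(probs)), key=lambda i: probs[i])
    confidence = probs[best]
    label = LABELS[best] if confidence >= threshold else "unknown"
    return label, confidence


print(label_with_threshold([0.02, 0.98]))  # confident case
print(label_with_threshold([0.45, 0.55]))  # ambiguous case, abstains
```

If `probs` is a tensor of shape (1, 2), something like `probs.squeeze(0).tolist()` should produce a suitable input list.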

Performance

Metric               Value
Validation accuracy  84.74%
Macro F1             81.06%
Parameters           ~20M
Model size           129 MB
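
Macro F1 sits below accuracy here, which is typical when classes are imbalanced: accuracy is dominated by the majority class, while macro F1 averages the per-class F1 scores with equal weight. The following sketch shows how the two metrics are computed. The labels are made-up toy data, not the model's validation set, and the function name is hypothetical.

```python
def accuracy_and_macro_f1(y_true, y_pred, labels=("F", "M")):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Toy data: six "M" names, two "F" names, one "F" misclassified as "M".
y_true = ["M", "M", "M", "M", "M", "M", "F", "F"]
y_pred = ["M", "M", "M", "M", "M", "M", "M", "F"]
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={macro_f1:.3f}")
# prints: accuracy=0.875 macro_f1=0.795
```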

Trained for 3 epochs with class weighting (F : M = 3:1) to handle the natural imbalance in the training data. Training loss fell steadily from 0.41 to 0.34, with stable convergence and no sign of overfitting.
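
The training script is not shown here, so the following is only a sketch of what a 3:1 class-weighted cross-entropy could look like, written in plain Python. It assumes that "F : M = 3:1" means errors on F examples count three times as much as errors on M examples, and that class index 0 is F and index 1 is M. The loss is normalized by the total weight of the batch, which is how PyTorch's `CrossEntropyLoss(weight=...)` averages with its default reduction.

```python
import math

# Assumption: index 0 = F (weight 3), index 1 = M (weight 1).
CLASS_WEIGHTS = [3.0, 1.0]


def weighted_cross_entropy(logits_batch, targets, class_weights=CLASS_WEIGHTS):
    """Class-weighted mean cross-entropy over a batch of logit lists."""
    total_loss = 0.0
    total_weight = 0.0
    for logits, target in zip(logits_batch, targets):
        # Numerically stable log-sum-exp for the softmax normalizer
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        nll = log_z - logits[target]  # negative log-likelihood of the target
        weight = class_weights[target]
        total_loss += weight * nll
        total_weight += weight
    return total_loss / total_weight


# Two examples: an uncertain F prediction and a wrong M prediction.
loss = weighted_cross_entropy([[0.0, 0.0], [2.0, 0.0]], [0, 1])
print(f"weighted loss: {loss:.4f}")
```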


What Makes This Model Different

  • Handles global names: from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
  • Generalizes beyond dictionaries: learns naming patterns rather than relying on lookup tables
  • Custom lightweight architecture: small enough to run comfortably on CPU
  • Fully compatible with Hugging Face Transformers: loads like any standard model

Training Details

Detail         Value
Base model     SpiceeChat/Genre-Classifier-1-20M-BASE-BF16
Training data  150,000 + 922 custom examples
Optimizer      AdamW (LR = 2e-5)
Batch size     64 (train) / 256 (eval)
Hardware       Tesla T4 (FP16)
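
As a rough sense of scale, the figures above imply a few thousand optimizer steps in total. The arithmetic below assumes that all 150,922 examples were used for training (the size of any held-out validation split is not stated), that there was no gradient accumulation, and that the final partial batch was kept. The 3 epochs come from the Performance section.

```python
import math

train_examples = 150_000 + 922  # dataset samples + hand-curated examples
batch_size = 64                 # training batch size from the table
epochs = 3                      # from the Performance section

# Assumption: every example is seen once per epoch, partial batch included
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)
# prints: 2359 7077
```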

Notes

  • The model uses weight tying between head.weight and tok_emb.weight. A harmless head.weight | MISSING warning may appear on load. This is expected behavior.
  • trust_remote_code=True is required because the architecture is custom. The modeling code is included in this repository and fully auditable.
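
To illustrate why a tied head can be reported as missing without anything being wrong, here is a minimal PyTorch sketch. `TinyTiedLM` is a hypothetical toy module, not the repository's modeling code. It only borrows the parameter names `tok_emb` and `head` from the note above, and it assumes the checkpoint stores the embedding matrix but not a separate copy for the head.

```python
import torch
import torch.nn as nn


class TinyTiedLM(nn.Module):
    """Toy model whose output head shares its weight with the embedding."""

    def __init__(self, vocab_size=100, d_model=16):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)
        # Weight tying: both modules now point at the same Parameter
        self.head.weight = self.tok_emb.weight

    def forward(self, input_ids):
        return self.head(self.tok_emb(input_ids))


model = TinyTiedLM()

# Simulate a checkpoint that omits the redundant head.weight entry
state = model.state_dict()
state.pop("head.weight")

fresh = TinyTiedLM()
result = fresh.load_state_dict(state, strict=False)

print(result.missing_keys)                        # head.weight is reported
print(fresh.head.weight is fresh.tok_emb.weight)  # still tied after loading
```

In this toy setup the key is reported as missing, yet the head ends up with the loaded values because it shares storage with the embedding.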

Try It Yourself

python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"

License

Released under the Apache 2.0 license. Use it, modify it, and ship it under the terms of that license.


Built with a lot of caffeine ☕ by SpiceeChat

Built by PhysiQuanty (who did most of the work) and QuantaSparkLabs.
