SykoLLM-V6.9

The most powerful model in the SykoLLM family, trained on roughly 8 billion tokens.

SykoLLM-V6.9 is a 391M-parameter causal language model, trained from scratch on a curated mixture of high-quality English datasets. It is the latest and most capable model in the SykoLLM series, surpassing all previous versions in both token count and training quality.


Model Details

| Property | Value |
|---|---|
| Architecture | Causal Language Model (Phi-3 based) |
| Parameters | 391,857,152 |
| Context Length | 1,024 tokens |
| Vocabulary Size | 50,000 |
| Hidden Size | 1,024 |
| Intermediate Size | 2,304 |
| Layers | 24 |
| Attention Heads | 8 (GQA: 2 KV heads) |
| Precision | bfloat16 |
| Language | English only |
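
The attention figures above imply a few derived quantities that are useful when sizing inference memory. The sketch below works them out. It assumes the head dimension equals the hidden size divided by the number of attention heads (a common convention, not stated in this card) and 2 bytes per bfloat16 value; the constant and function names are illustrative.

```python
# Values taken from the Model Details table.
HIDDEN_SIZE = 1024
NUM_LAYERS = 24
NUM_HEADS = 8
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 1024
BYTES_PER_VALUE = 2  # bfloat16

# Assumption: head_dim = hidden_size / num_heads (not stated in the card).
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS
# With grouped-query attention, several query heads share one KV head.
QUERY_HEADS_PER_KV_HEAD = NUM_HEADS // NUM_KV_HEADS


def kv_cache_bytes(num_tokens, kv_heads=NUM_KV_HEADS):
    """Bytes needed to cache keys and values for one sequence of num_tokens tokens."""
    per_token = 2 * NUM_LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE  # 2 = K and V
    return per_token * num_tokens


print("head_dim:", HEAD_DIM)
print("query heads per KV head:", QUERY_HEADS_PER_KV_HEAD)
print("KV cache at full context (MiB):", kv_cache_bytes(CONTEXT_LENGTH) / 2**20)
print("same without GQA (MiB):", kv_cache_bytes(CONTEXT_LENGTH, kv_heads=NUM_HEADS) / 2**20)
```

Under these assumptions, sharing 2 KV heads across 8 query heads cuts the per-sequence KV cache to a quarter of what 8 KV heads would need.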

Training Details

| Property | Value |
|---|---|
| Total Tokens | ~8 Billion |
| Training Steps | 30,000 |
| Effective Batch Size | 256 (16 × 2 × 8 cores) |
| Learning Rate | 4e-4 (cosine decay) |
| Optimizer | Adafactor |
| Hardware | Google TPU v5e-8 |
| Precision | bfloat16 (XLA native) |
| Weight Decay | 0.05 |
| Warmup Steps | 200 |
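
The "~8 Billion" figure can be cross-checked against the other rows. The sketch below does the arithmetic and adds an illustrative learning-rate schedule. Several things here are assumptions rather than facts from this card: that every sequence is packed to the full 1,024 tokens, that the unlabeled factor of 2 is gradient accumulation, that warmup is linear, and that the cosine decay ends at zero.

```python
import math

# Values taken from the Training Details and Model Details tables.
PER_CORE_BATCH = 16
MIDDLE_FACTOR = 2        # unlabeled in the table; assumed to be gradient accumulation
NUM_CORES = 8
TRAINING_STEPS = 30_000
CONTEXT_LENGTH = 1_024   # assumes every sequence is packed to full length
PEAK_LR = 4e-4
WARMUP_STEPS = 200

effective_batch = PER_CORE_BATCH * MIDDLE_FACTOR * NUM_CORES
tokens_per_step = effective_batch * CONTEXT_LENGTH
total_tokens = tokens_per_step * TRAINING_STEPS


def learning_rate(step, floor=0.0):
    """Linear warmup followed by cosine decay (warmup shape and floor are assumptions)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return floor + 0.5 * (PEAK_LR - floor) * (1.0 + math.cos(math.pi * progress))


print("effective batch:", effective_batch)
print("tokens per step:", tokens_per_step)
print("total tokens:", total_tokens)
print("lr at step 100:", learning_rate(100))
```

With full-length sequences the total comes to about 7.86 billion tokens, which is consistent with the rounded figure in the table.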

Training Data

SykoLLM-V6.9 was trained on a curated mixture of four high-quality datasets, interleaved with the tuned sampling probabilities listed below:

| Dataset | Sampling | Description |
|---|---|---|
| openbmb/Ultra-FineWeb | 25% | High-quality web text, scored and filtered |
| openbmb/Ultra-FineWeb-L3 | 40% | Multi-style synthetic English pretraining data |
| openbmb/UltraData-Math | 20% | High-quality mathematical reasoning data |
| openbmb/UltraChat | 15% | Multi-turn conversational data |

Samples from every dataset were kept only if their quality score was ≥ 0.85. Additional heuristic filters removed low-quality, noisy, or excessively long samples.
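
To make the mixture and the score threshold concrete, here is a minimal standard-library sketch of filtering each source and then interleaving by sampling probability. It is not the authors' pipeline: the `score` and `text` field names, the `max_chars` cutoff, and the toy streams are all hypothetical stand-ins.

```python
import random
from collections import Counter
from itertools import islice

# Sampling probabilities from the table above.
MIXTURE = {
    "openbmb/Ultra-FineWeb": 0.25,
    "openbmb/Ultra-FineWeb-L3": 0.40,
    "openbmb/UltraData-Math": 0.20,
    "openbmb/UltraChat": 0.15,
}
MIN_SCORE = 0.85  # quality threshold stated above


def keep(sample, max_chars=8000):
    """Hypothetical filter: the field names and max_chars cutoff are assumptions."""
    return sample["score"] >= MIN_SCORE and len(sample["text"]) <= max_chars


def interleave(streams, weights, seed=0):
    """Filter each stream, then yield (source, sample) pairs chosen by weight."""
    rng = random.Random(seed)
    names = list(streams)
    probs = [weights[name] for name in names]
    filtered = {name: (s for s in streams[name] if keep(s)) for name in names}
    while True:
        name = rng.choices(names, weights=probs, k=1)[0]
        try:
            yield name, next(filtered[name])
        except StopIteration:
            return  # stop as soon as any source runs out


def toy_stream(name):
    """Endless fake source; every fourth sample falls below the threshold."""
    scores = [0.5, 0.85, 0.9, 0.95]
    i = 0
    while True:
        yield {"text": f"{name} sample {i}", "score": scores[i % 4]}
        i += 1


streams = {name: toy_stream(name) for name in MIXTURE}
batch = list(islice(interleave(streams, MIXTURE, seed=0), 10_000))
counts = Counter(name for name, _ in batch)
for name in MIXTURE:
    print(f"{name}: {counts[name] / len(batch):.3f}")
```

Filtering before interleaving keeps the sampling probabilities meaningful for the samples that actually reach training; filtering afterwards would skew the mix toward sources with higher pass rates.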


Chat Format

SykoLLM-V6.9 uses the following chat template:

<|user|>
Your message here<|end|>
<|assistant|>
Model response here<|end|>

For multi-turn conversations:

<|user|>
Hello, how are you?<|end|>
<|assistant|>
I'm doing great, thank you for asking!<|end|>
<|user|>
Can you help me with a math problem?<|end|>
<|assistant|>
Of course! What's the problem?<|end|>
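
Building these strings by hand is error-prone, so a small helper can assemble them from a list of messages. The sketch below follows the layout shown above; the `format_chat` name and the role/content dictionary shape are illustrative choices, not part of the model's tokenizer API.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"
END_TAG = "<|end|>"


def format_chat(messages, add_generation_prompt=True):
    """Render [{'role': ..., 'content': ...}, ...] in the chat format shown above."""
    parts = []
    for message in messages:
        role = message["role"]
        if role == "user":
            tag = USER_TAG
        elif role == "assistant":
            tag = ASSISTANT_TAG
        else:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{tag}\n{message['content']}{END_TAG}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{ASSISTANT_TAG}\n")
    return "".join(parts)


history = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great, thank you for asking!"},
    {"role": "user", "content": "Can you help me with a math problem?"},
]
print(format_chat(history))
```

For a single user message, this produces the same prompt string that the Usage section passes to the tokenizer.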

Usage

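from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "SykoSLM/SykoLLM-V6.9"

# trust_remote_code=True allows transformers to run custom code shipped in the model repo.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # same precision the model was trained in
    device_map="auto",           # place weights on the available device(s); requires accelerate
    trust_remote_code=True,
)

# Single-turn prompt in the chat format described above, ending with an open assistant turn.
prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens; prompt plus output must fit in the 1,024-token context.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# The decoded text includes the prompt; skip_special_tokens=False keeps any special tokens visible.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))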
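
The decoded string printed above contains the prompt and the chat markers along with the model's answer. A minimal sketch for pulling out just the reply is shown here; `extract_reply` is a hypothetical helper, and it assumes the decoded text reproduces the prompt verbatim, falling back to the last assistant marker when it does not.

```python
ASSISTANT_TAG = "<|assistant|>"
END_TAG = "<|end|>"


def extract_reply(decoded, prompt):
    """Return the first assistant reply that follows the prompt in the decoded text."""
    start = decoded.find(prompt)
    if start != -1:
        # Normal case: everything after the prompt is the model's continuation.
        completion = decoded[start + len(prompt):]
    else:
        # Fallback: take whatever follows the last assistant marker.
        completion = decoded.rpartition(ASSISTANT_TAG)[2]
    # Cut at the first end marker; if it is missing, the reply was truncated.
    reply = completion.partition(END_TAG)[0]
    return reply.strip()


prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
decoded = prompt + "The capital of France is Paris.<|end|>"
print(extract_reply(decoded, prompt))
```

An alternative that avoids string matching is to decode only the newly generated token IDs, that is, the part of `outputs[0]` past the prompt length.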

SykoLLM Family

| Model | Tokens | Notes |
|---|---|---|
| SykoLLM-V6.9 | ~8B | Most powerful, current |
| SykoLLM-V6.8 | <8B | Previous version |
| SykoLLM-V6.6 | <8B | Earlier version |

Limitations

  • English only — the model was trained exclusively on English data and does not support other languages.
  • Context length is limited to 1,024 tokens.
  • As a base pretrained model, it may produce outputs that are inaccurate, biased, or inappropriate. Use with appropriate safety measures.
  • Not instruction-tuned — for best results, use the chat format described above.
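
Because the prompt and the generated tokens share the 1,024-token window, long conversations need trimming before generation. The sketch below drops the oldest messages until the remainder fits. `fit_to_context` and the `count_tokens` callable are hypothetical; in practice `count_tokens` would wrap the model's tokenizer, and the count here ignores the few extra tokens used by the chat markers.

```python
CONTEXT_LENGTH = 1024  # from the Model Details table


def fit_to_context(messages, count_tokens, max_new_tokens=256,
                   context_length=CONTEXT_LENGTH):
    """Drop the oldest messages until the prompt leaves room for max_new_tokens."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    kept = list(messages)
    while kept and sum(count_tokens(m["content"]) for m in kept) > budget:
        kept.pop(0)
    # Keep the history starting on a user turn so the chat format stays valid.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return kept


def count_words(text):
    """Stand-in token counter for the demo: whitespace-separated words."""
    return len(text.split())


history = [
    {"role": "user", "content": "a " * 500},
    {"role": "assistant", "content": "b " * 300},
    {"role": "user", "content": "c " * 200},
]
trimmed = fit_to_context(history, count_words)
print(len(history), "->", len(trimmed), "messages")
```

An empty result means even the most recent message exceeds the budget, in which case that message itself has to be shortened.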

License

This model is released under the Apache 2.0 License.


Trained with ❤️ by SykoSLM
