SykoLLM-V6.9

The most powerful model in the SykoLLM family, trained on approximately 8 billion tokens.

SykoLLM-V6.9 is a 391M-parameter causal language model, trained from scratch on a curated mixture of high-quality English datasets. It is the latest and most capable model in the SykoLLM series, trained on more tokens than any previous version.

Model Details

Property	Value
Architecture	Causal Language Model (Phi-3 based)
Parameters	391,857,152
Context Length	1,024 tokens
Vocabulary Size	50,000
Hidden Size	1,024
Intermediate Size	2,304
Layers	24
Attention Heads	8 (GQA: 2 KV heads)
Precision	bfloat16
Language	English only
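
The attention and precision figures above imply some concrete sizes. The following is a rough sketch, assuming the per-head dimension equals the hidden size divided by the number of attention heads (a common convention, but not stated in this card) and that the KV cache is held in bfloat16. The variable names are illustrative.

```python
# Back-of-the-envelope sizes derived from the Model Details table.
# Assumption: head_dim = hidden_size / attention_heads (not stated in the card).
HIDDEN_SIZE = 1024
NUM_LAYERS = 24
NUM_HEADS = 8
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 1024
NUM_PARAMS = 391_857_152
BYTES_PER_BF16 = 2

head_dim = HIDDEN_SIZE // NUM_HEADS               # 128
queries_per_kv_head = NUM_HEADS // NUM_KV_HEADS   # 4 query heads share each KV head

# Weights stored in bfloat16: two bytes per parameter.
weight_bytes = NUM_PARAMS * BYTES_PER_BF16

# KV cache per token: keys and values (x2) for every layer and KV head.
kv_bytes_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim * BYTES_PER_BF16
kv_bytes_full_context = kv_bytes_per_token * CONTEXT_LENGTH

# The same cache if every query head had its own KV head (no GQA).
mha_kv_bytes_full_context = kv_bytes_full_context * queries_per_kv_head

print(f"head_dim: {head_dim}")
print(f"weights: {weight_bytes / 2**20:.0f} MiB")
print(f"KV cache at full context (GQA): {kv_bytes_full_context / 2**20:.0f} MiB")
print(f"KV cache at full context (no GQA): {mha_kv_bytes_full_context / 2**20:.0f} MiB")
```

Under these assumptions, grouped-query attention with 2 KV heads cuts the per-sequence KV cache to a quarter of what 8 independent KV heads would need.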

Training Details

Property	Value
Total Tokens	~8 Billion
Training Steps	30,000
Effective Batch Size	256 (16 × 2 × 8 cores)
Learning Rate	4e-4 (cosine decay)
Optimizer	Adafactor
Hardware	Google TPU v5e-8
Precision	bfloat16 (XLA native)
Weight Decay	0.05
Warmup Steps	200
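
The token total can be cross-checked against the other rows. The sketch below assumes every training sequence fills the full 1,024-token context (the card does not say whether sequences were packed or padded). The schedule function is also only an illustration: it assumes linear warmup followed by cosine decay to zero, neither of which is specified here.

```python
import math

# Values taken from the Training Details and Model Details tables.
BATCH_FACTORS = (16, 2, 8)   # "16 × 2 × 8 cores"; the meaning of the middle factor is not stated
TRAINING_STEPS = 30_000
CONTEXT_LENGTH = 1_024
PEAK_LR = 4e-4
WARMUP_STEPS = 200

effective_batch = math.prod(BATCH_FACTORS)
# Assumption: each of the sequences in a batch contributes CONTEXT_LENGTH tokens.
total_tokens = TRAINING_STEPS * effective_batch * CONTEXT_LENGTH


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero (assumed shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * min(progress, 1.0)))


print(f"effective batch: {effective_batch} sequences")
print(f"total tokens: {total_tokens:,}")
for step in (0, 199, 200, 15_100, 30_000):
    print(step, f"{learning_rate(step):.2e}")
```

With full-length sequences, the run processes about 7.86 billion tokens, which is consistent with the stated total of roughly 8 billion.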

Training Data

SykoLLM-V6.9 was trained on a curated mixture of four datasets, interleaved with the sampling probabilities listed below:

Dataset	Sampling	Description
openbmb/Ultra-FineWeb	25%	High-quality web text, scored and filtered
openbmb/Ultra-FineWeb-L3	40%	Multi-style synthetic English pretraining data
openbmb/UltraData-Math	20%	High-quality mathematical reasoning data
openbmb/UltraChat	15%	Multi-turn conversational data

All datasets were filtered with a quality score threshold of ≥ 0.85 and additional heuristic filters to remove low-quality, noisy, or excessively long samples.
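
To make the mixture concrete, the sketch below draws a source dataset for each training sample according to the sampling column and applies a score threshold. It uses only the standard library. The `quality_score` field name and the `passes_filter` and `sample_sources` helpers are hypothetical, and the actual pipeline (for example, whether it relied on `datasets.interleave_datasets` with a `probabilities` argument) is not described in this card.

```python
import random
from collections import Counter

# Sampling probabilities from the Training Data table.
MIXTURE = {
    "openbmb/Ultra-FineWeb": 0.25,
    "openbmb/Ultra-FineWeb-L3": 0.40,
    "openbmb/UltraData-Math": 0.20,
    "openbmb/UltraChat": 0.15,
}
QUALITY_THRESHOLD = 0.85


def passes_filter(sample: dict) -> bool:
    """Keep samples whose (hypothetical) quality_score field meets the threshold."""
    return sample.get("quality_score", 0.0) >= QUALITY_THRESHOLD


def sample_sources(n: int, seed: int = 0) -> list[str]:
    """Draw n dataset names according to the mixture probabilities."""
    rng = random.Random(seed)
    names = list(MIXTURE)
    weights = list(MIXTURE.values())
    return rng.choices(names, weights=weights, k=n)


counts = Counter(sample_sources(100_000))
for name, prob in MIXTURE.items():
    print(f"{name}: target {prob:.0%}, drawn {counts[name] / 100_000:.1%}")
```

Over many draws the observed shares approach the target probabilities, which is the behaviour probabilistic interleaving is meant to give.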

Chat Format

SykoLLM-V6.9 uses the following chat template:

<|user|>
Your message here<|end|>
<|assistant|>
Model response here<|end|>

For multi-turn conversations:

<|user|>
Hello, how are you?<|end|>
<|assistant|>
I'm doing great, thank you for asking!<|end|>
<|user|>
Can you help me with a math problem?<|end|>
<|assistant|>
Of course! What's the problem?<|end|>
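
A small helper can assemble this format from a list of turns. The function below is a sketch and not part of the model's released code: `build_prompt` and the message dictionary layout are illustrative, and the newline placement follows the examples above.

```python
def build_prompt(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} turns into the SykoLLM-V6.9 chat format."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|{role}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|assistant|>\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great, thank you for asking!"},
    {"role": "user", "content": "Can you help me with a math problem?"},
]
print(build_prompt(conversation))
```

Passing `add_generation_prompt=False` renders completed conversations only, which may be useful when formatting text for evaluation rather than generation.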

Usage

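from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "SykoSLM/SykoLLM-V6.9"

# trust_remote_code=True allows custom code from the model repository to run locally.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the precision the model was trained in
    device_map="auto",           # requires the accelerate package
    trust_remote_code=True,
)

# Single-turn prompt in the chat format, ending with an open assistant turn.
prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens; gradients are not needed for inference.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Decode the full sequence (prompt plus completion), keeping special tokens visible.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))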
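
Because the decoded string contains the prompt and the special tokens, it may be convenient to pull out only the assistant's reply. The helper below is a plain string-processing sketch that assumes the decoded text follows the chat format and contains at least one `<|assistant|>` marker. `extract_reply` is an illustrative name, not part of any library.

```python
def extract_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded generation."""
    # Everything after the final assistant marker belongs to the newest reply.
    _, _, tail = decoded.rpartition("<|assistant|>")
    # The reply ends at the first end-of-turn marker, if the model emitted one.
    reply, _, _ = tail.partition("<|end|>")
    return reply.strip()


# Illustrative string for demonstration only, not real model output.
decoded = (
    "<|user|>\nWhat is the capital of France?<|end|>\n"
    "<|assistant|>\nThe capital of France is Paris.<|end|>"
)
print(extract_reply(decoded))
```

If generation stops at `max_new_tokens` before an `<|end|>` marker appears, the function returns whatever text was produced up to that point.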

SykoLLM Family

Model	Tokens	Notes
SykoLLM-V6.9	~8B	Most powerful — current
SykoLLM-V6.8	<8B	Previous version
SykoLLM-V6.6	<8B	Earlier version

Limitations

English only — the model was trained exclusively on English data and does not support other languages.
Context length is limited to 1,024 tokens.
As a base pretrained model, it may produce outputs that are inaccurate, biased, or inappropriate. Use with appropriate safety measures.
Not instruction-tuned — for best results, use the chat format described above.
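
The 1,024-token window has to hold both the prompt and the newly generated tokens, so long conversations need trimming before generation. A minimal sketch, assuming that dropping the oldest tokens is acceptable for the application; `fit_to_context` is an illustrative helper that operates on a plain list of token IDs.

```python
CONTEXT_LENGTH = 1024  # from the Model Details table


def fit_to_context(token_ids: list[int], max_new_tokens: int,
                   context_length: int = CONTEXT_LENGTH) -> list[int]:
    """Keep the most recent tokens so that prompt plus generation fits the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Slicing from the end drops the oldest tokens first.
    return token_ids[-budget:]


prompt_ids = list(range(1000))   # stand-in for real token IDs
trimmed = fit_to_context(prompt_ids, max_new_tokens=256)
print(len(trimmed))              # 768
```

Trimming at the token level can cut a turn in half; an application that cares about well-formed turns would likely drop whole messages instead.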

License

This model is released under the Apache 2.0 License.

Trained with ❤️ by SykoSLM
