dataset: NickyNicky/ngxson_MiniThinky_v1_deduplicated_11_percent
** full train
** 360 rows
** 11 epochs
** max tokens: 512
** training time: 2:38:24

no template

<reasoning>
...
</reasoning>
<answer>
...
</answer>
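
The tagged layout above lends itself to simple post-processing. The following is a minimal sketch of how a completion in this format could be split into its two parts; `split_reasoning_answer` is a hypothetical helper and not part of this repository, and it assumes the completion contains exactly one `<reasoning>` block followed by one `<answer>` block.

```python
import re

# Hypothetical helper (not shipped with the model): matches one
# <reasoning>...</reasoning> block followed by one <answer>...</answer> block.
_TAG_PATTERN = re.compile(
    r"<reasoning>\s*(?P<reasoning>.*?)\s*</reasoning>\s*"
    r"<answer>\s*(?P<answer>.*?)\s*</answer>",
    re.DOTALL,  # let '.' span newlines, since both blocks are multi-line
)


def split_reasoning_answer(text):
    """Return (reasoning, answer) with surrounding whitespace removed,
    or None when the text does not follow the tagged layout."""
    match = _TAG_PATTERN.search(text)
    if match is None:
        return None
    return match.group("reasoning"), match.group("answer")


# Made-up completion used only to illustrate the call.
sample = "<reasoning>\n2 + 2 is 4.\n</reasoning>\n<answer>\n4\n</answer>"
parts = split_reasoning_answer(sample)
```

Returning `None` for malformed text keeps the caller in charge of what to do with completions that drift from the format, for example when generation is cut off at the 512-token limit before `</answer>` is produced.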
** format: Safetensors
** model size: 1B params
** tensor type: BF16