GLM-4.7-Flash-18B-A3B is a pruned version of GLM-4.7-Flash. The pruned model focuses on math, physics, control engineering, and academic writing.
Quantization details
The embedding, attention, dense FFN layer, shared expert, and output tensors are guaranteed to stay at full precision (BF16) in all quantized variants.
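As a rough illustration of the rule above, the sketch below classifies tensors by name into "kept at BF16" and "eligible for quantization". The name patterns follow the llama.cpp-style GGUF naming convention (`token_embd`, `blk.N.attn_*`, `*_shexp`, `*_exps`) and are an assumption, not something this model card confirms; `stays_bf16` is a hypothetical helper, not part of any library.

```python
import re

# ASSUMPTION: llama.cpp-style GGUF tensor names. The model card does not
# list tensor names, so these patterns are illustrative only.
FULL_PRECISION_PATTERNS = [
    r"^token_embd\.",                          # embedding
    r"^output\.",                              # output head
    r"^blk\.\d+\.attn_",                       # attention tensors
    r"^blk\.\d+\.ffn_(gate|up|down)\.",        # dense FFN layers
    r"^blk\.\d+\.ffn_(gate|up|down)_shexp\.",  # shared experts
]


def stays_bf16(tensor_name: str) -> bool:
    """Return True if the tensor name matches a group kept at BF16."""
    return any(re.search(p, tensor_name) for p in FULL_PRECISION_PATTERNS)


if __name__ == "__main__":
    # Hypothetical tensor names used only to exercise the helper.
    for name in [
        "token_embd.weight",
        "blk.0.ffn_down.weight",
        "blk.5.attn_q.weight",
        "blk.5.ffn_gate_shexp.weight",
        "blk.5.ffn_gate_exps.weight",
        "output.weight",
    ]:
        label = "BF16" if stays_bf16(name) else "quantized"
        print(f"{name}: {label}")
```

Under these assumed names, only the routed expert tensors (`*_exps`) fall outside the full-precision groups, which matches the description: the quantized variants differ in how the routed experts are stored.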
Model tree for lovedheart/GLM-4.7-Flash-18B-A3B-GGUF
Base model
zai-org/GLM-4.7-Flash