BiseNet: Optimized for Qualcomm Devices
BiSeNet (Bilateral Segmentation Network) is an architecture designed for real-time semantic segmentation. It balances spatial resolution against receptive field by pairing a Spatial Path, which preserves high-resolution features, with a Context Path, which provides a sufficiently large receptive field.
This model is based on the implementation of BiseNet found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| QNN_DLC | float | Universal | QAIRT 2.43 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.43 | Download |
| TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
For more device-specific assets and performance metrics, visit BiseNet on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See our repository for BiseNet on GitHub for usage instructions.
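As a rough illustration of what a custom export can look like, the sketch below submits a compile job through the `qai_hub` Python client. It is not the repository's own export script. The input name `image`, the NCHW layout `(1, 3, height, width)`, the helper names, and the example file and device names are assumptions made for illustration; check the usage instructions on GitHub for the supported workflow.

```python
from typing import Dict, Tuple

# Runtimes from the download table, mapped to values of AI Hub's
# --target_runtime compile option.
TARGET_RUNTIMES = {"ONNX": "onnx", "QNN_DLC": "qnn_dlc", "TFLITE": "tflite"}


def compile_options(runtime: str) -> str:
    """Build the compile option string for one of the listed runtimes."""
    return f"--target_runtime {TARGET_RUNTIMES[runtime.upper()]}"


def input_specs(
    height: int = 720, width: int = 960, name: str = "image"
) -> Dict[str, Tuple[int, int, int, int]]:
    """Describe a single image input. The name and NCHW layout are assumptions."""
    return {name: (1, 3, height, width)}


def submit_export(model_path: str, device_name: str, runtime: str = "TFLITE",
                  height: int = 720, width: int = 960):
    """Submit a compile job for a custom checkpoint (hypothetical helper)."""
    # Imported lazily so the helpers above work without the client installed.
    import qai_hub as hub

    return hub.submit_compile_job(
        model=model_path,                      # e.g. an ONNX file you exported
        device=hub.Device(device_name),        # a device name available to your account
        input_specs=input_specs(height, width),
        options=compile_options(runtime),
    )
```

With an AI Hub account configured, a call such as `submit_export("bisenet_finetuned.onnx", "Samsung Galaxy S24 (Family)")` would return a job object whose compiled asset can be fetched with `download_target_model`. Both the file name and the device name here are placeholders.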
Model Details
Model Type: Semantic segmentation
Model Stats:
- Model checkpoint: best_dice_loss_miou_0.655.pth
- Inference latency: Real-time
- Input resolution: 720x960
- Number of parameters: 12.0M
- Model size (float): 45.7 MB
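The parameter count and the float model size can be related with a quick estimate. The sketch below assumes that "float" means 32-bit weights, that w8a8 stores weights in 8 bits, that sizes are measured in units of 2^20 bytes, and that file-format overhead is negligible. None of this is stated in the card.

```python
# Assumed storage cost per weight for each precision listed in this card.
BYTES_PER_PARAM = {"float": 4, "w8a8": 1}


def estimated_size_mib(num_params: float, precision: str) -> float:
    """Estimate weight storage in MiB from a parameter count."""
    return num_params * BYTES_PER_PARAM[precision] / 2**20


float_size = estimated_size_mib(12.0e6, "float")
quant_size = estimated_size_mib(12.0e6, "w8a8")
print(f"float: {float_size:.1f} MiB, w8a8: {quant_size:.1f} MiB")
```

Under these assumptions the float estimate lands close to the 45.7 MB listed above. The small gap is expected because the 12.0M parameter count is itself rounded.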
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| BiseNet | ONNX | float | Snapdragon® X Elite | 30.848 ms | 66 - 66 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 25.504 ms | 73 - 318 MB | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS8550 (Proxy) | 32.063 ms | 71 - 79 MB | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS9075 | 49.312 ms | 8 - 11 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.153 ms | 65 - 251 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 15.113 ms | 56 - 255 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® X2 Elite | 15.48 ms | 64 - 64 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® X Elite | 8.661 ms | 19 - 19 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 5.979 ms | 18 - 257 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS6490 | 239.497 ms | 224 - 236 MB | CPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 8.315 ms | 16 - 22 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS9075 | 10.171 ms | 18 - 21 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCM6690 | 229.436 ms | 230 - 238 MB | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 4.752 ms | 18 - 218 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 216.926 ms | 250 - 258 MB | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.811 ms | 18 - 219 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® X2 Elite | 3.761 ms | 17 - 17 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® X Elite | 28.586 ms | 8 - 8 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 20.622 ms | 8 - 285 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 107.61 ms | 2 - 190 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 28.399 ms | 8 - 10 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8775P | 38.899 ms | 1 - 189 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS9075 | 55.932 ms | 8 - 49 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 59.788 ms | 8 - 277 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA7255P | 107.61 ms | 2 - 190 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8295P | 44.235 ms | 0 - 212 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.769 ms | 0 - 222 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 12.716 ms | 8 - 286 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® X2 Elite | 14.51 ms | 8 - 8 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.129 ms | 2 - 2 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.686 ms | 2 - 231 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 40.354 ms | 1 - 13 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.012 ms | 2 - 182 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.516 ms | 2 - 4 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8775P | 10.243 ms | 2 - 183 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 13.131 ms | 1 - 12 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 89.126 ms | 2 - 206 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.144 ms | 2 - 231 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA7255P | 20.012 ms | 2 - 182 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8295P | 12.698 ms | 2 - 185 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.175 ms | 2 - 191 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 13.349 ms | 2 - 201 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.297 ms | 2 - 193 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 4.918 ms | 2 - 2 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 20.464 ms | 31 - 288 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 105.803 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 28.692 ms | 32 - 34 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8775P | 37.637 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS9075 | 55.884 ms | 0 - 66 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 61.614 ms | 32 - 306 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA7255P | 105.803 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8295P | 44.076 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.878 ms | 30 - 253 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 12.63 ms | 30 - 309 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.825 ms | 6 - 237 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS6490 | 47.441 ms | 7 - 31 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.873 ms | 8 - 190 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 11.978 ms | 8 - 205 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8775P | 12.861 ms | 8 - 191 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.153 ms | 4 - 28 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCM6690 | 92.741 ms | 7 - 211 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.253 ms | 8 - 238 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA7255P | 20.873 ms | 8 - 190 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8295P | 15.329 ms | 8 - 193 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 6.742 ms | 6 - 196 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 15.908 ms | 0 - 199 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 5.519 ms | 6 - 200 MB | NPU |
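To read the latencies above as frame rates, or to compare float against w8a8 on the same device, a small conversion helps. The sketch below uses the two TFLITE rows for Snapdragon® 8 Gen 3 Mobile. The resulting frame rate is only an upper bound, since it assumes back-to-back inference and ignores pre- and post-processing.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to an upper-bound frame rate."""
    return 1000.0 / latency_ms


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """How many times faster the optimized variant is than the baseline."""
    return baseline_ms / optimized_ms


# TFLITE rows for Snapdragon 8 Gen 3 Mobile, taken from the table above.
float_ms, w8a8_ms = 20.464, 8.825

print(f"float: {frames_per_second(float_ms):.1f} FPS")
print(f"w8a8:  {frames_per_second(w8a8_ms):.1f} FPS")
print(f"w8a8 speedup over float: {speedup(float_ms, w8a8_ms):.2f}x")
```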
License
- The license for the original implementation of BiseNet can be found here.
References
- BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
