BiseNet: Optimized for Qualcomm Devices

BiSeNet (Bilateral Segmentation Network) is an architecture designed for real-time semantic segmentation. It balances spatial resolution against receptive field by pairing a Spatial Path, which preserves high-resolution features, with a Context Path, which captures a sufficiently large receptive field.

This model is based on the implementation of BiseNet found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| QNN_DLC | float | Universal | QAIRT 2.43 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.43 | Download |
| TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |

For more device-specific assets and performance metrics, visit BiseNet on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for BiseNet on GitHub for usage instructions.

Model Details

Model Type: Semantic segmentation

Model Stats:

  • Model checkpoint: best_dice_loss_miou_0.655.pth
  • Inference latency: Real-time
  • Input resolution: 720x960
  • Number of parameters: 12.0M
  • Model size (float): 45.7 MB
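As a rough sanity check on the stats above, the listed float model size is close to what 12.0M parameters would occupy as 32-bit floats. The sketch below makes several assumptions that the model card does not state: 4 bytes per float weight, 1 byte per w8a8 weight, "MB" read as MiB (2**20 bytes), and no file metadata overhead. The helper name `estimated_size_mib` is illustrative only, and the w8a8 figure is an estimate, not a published number.

```python
def estimated_size_mib(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in MiB (2**20 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 2**20


NUM_PARAMS = 12.0e6  # "Number of parameters: 12.0M" from the stats above

float_mib = estimated_size_mib(NUM_PARAMS, 4)  # assumption: float32 weights
w8a8_mib = estimated_size_mib(NUM_PARAMS, 1)   # assumption: 8-bit weights

print(f"float: ~{float_mib:.1f} MiB, w8a8 (estimate): ~{w8a8_mib:.1f} MiB")
```

The float estimate lands within about 0.1 of the listed 45.7 MB; the small gap is expected because the parameter count is rounded to one decimal place.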

Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| BiseNet | ONNX | float | Snapdragon® X Elite | 30.848 | 66 - 66 | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 25.504 | 73 - 318 | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS8550 (Proxy) | 32.063 | 71 - 79 | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS9075 | 49.312 | 8 - 11 | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.153 | 65 - 251 | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 15.113 | 56 - 255 | NPU |
| BiseNet | ONNX | float | Snapdragon® X2 Elite | 15.48 | 64 - 64 | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® X Elite | 8.661 | 19 - 19 | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 5.979 | 18 - 257 | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS6490 | 239.497 | 224 - 236 | CPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 8.315 | 16 - 22 | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS9075 | 10.171 | 18 - 21 | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCM6690 | 229.436 | 230 - 238 | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 4.752 | 18 - 218 | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 216.926 | 250 - 258 | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.811 | 18 - 219 | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® X2 Elite | 3.761 | 17 - 17 | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® X Elite | 28.586 | 8 - 8 | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 20.622 | 8 - 285 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 107.61 | 2 - 190 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 28.399 | 8 - 10 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8775P | 38.899 | 1 - 189 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS9075 | 55.932 | 8 - 49 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 59.788 | 8 - 277 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA7255P | 107.61 | 2 - 190 | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8295P | 44.235 | 0 - 212 | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.769 | 0 - 222 | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 12.716 | 8 - 286 | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® X2 Elite | 14.51 | 8 - 8 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.129 | 2 - 2 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.686 | 2 - 231 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 40.354 | 1 - 13 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.012 | 2 - 182 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.516 | 2 - 4 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8775P | 10.243 | 2 - 183 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 13.131 | 1 - 12 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 89.126 | 2 - 206 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.144 | 2 - 231 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA7255P | 20.012 | 2 - 182 | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8295P | 12.698 | 2 - 185 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.175 | 2 - 191 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 13.349 | 2 - 201 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.297 | 2 - 193 | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 4.918 | 2 - 2 | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 20.464 | 31 - 288 | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 105.803 | 32 - 247 | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 28.692 | 32 - 34 | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8775P | 37.637 | 32 - 247 | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS9075 | 55.884 | 0 - 66 | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 61.614 | 32 - 306 | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA7255P | 105.803 | 32 - 247 | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8295P | 44.076 | 32 - 247 | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.878 | 30 - 253 | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 12.63 | 30 - 309 | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.825 | 6 - 237 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS6490 | 47.441 | 7 - 31 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.873 | 8 - 190 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 11.978 | 8 - 205 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8775P | 12.861 | 8 - 191 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.153 | 4 - 28 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCM6690 | 92.741 | 7 - 211 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.253 | 8 - 238 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA7255P | 20.873 | 8 - 190 | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8295P | 15.329 | 8 - 193 | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 6.742 | 6 - 196 | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 15.908 | 0 - 199 | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 5.519 | 6 - 200 | NPU |
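
To make the latency figures easier to compare, the sketch below converts a few of them into approximate frames per second and computes the float-to-w8a8 speedup per runtime. The values are copied from the Snapdragon® 8 Elite Gen 5 Mobile rows above; the dictionary layout and the `fps` helper are illustrative. The FPS figure assumes back-to-back single-image inference and ignores pre-processing, post-processing, and data transfer, so it is an upper bound rather than a measured throughput.

```python
# Latencies in ms for Snapdragon 8 Elite Gen 5 Mobile, copied from the table above.
LATENCY_MS = {
    ("ONNX", "float"): 15.113,
    ("ONNX", "w8a8"): 3.811,
    ("QNN_DLC", "float"): 12.716,
    ("QNN_DLC", "w8a8"): 4.297,
    ("TFLITE", "float"): 12.63,
    ("TFLITE", "w8a8"): 5.519,
}


def fps(latency_ms: float) -> float:
    """Upper-bound frames per second implied by a per-inference latency."""
    return 1000.0 / latency_ms


speedup = {}
for runtime in ("ONNX", "QNN_DLC", "TFLITE"):
    float_ms = LATENCY_MS[(runtime, "float")]
    w8a8_ms = LATENCY_MS[(runtime, "w8a8")]
    # Ratio > 1 means the quantized model is faster than the float model.
    speedup[runtime] = float_ms / w8a8_ms
    print(
        f"{runtime}: float ~{fps(float_ms):.0f} FPS, "
        f"w8a8 ~{fps(w8a8_ms):.0f} FPS, speedup {speedup[runtime]:.2f}x"
    )
```

On this chipset the w8a8 variants come out roughly 2x to 4x faster than float, depending on the runtime. Whether the quantized model is accurate enough for a given application is a separate question that these latency numbers do not answer.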

License

  • The license for the original implementation of BiseNet can be found here.
