TrOCR: Optimized for Qualcomm Devices
TrOCR is an end-to-end text recognition approach that uses pre-trained image transformer and text transformer models for image understanding and wordpiece-level text generation.
This is based on the implementation of TrOCR found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| QNN_DLC | float | Universal | QAIRT 2.43 | Download |
| TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
For more device-specific assets and performance metrics, visit TrOCR on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See our repository for TrOCR on GitHub for usage instructions.
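As a rough illustration of Option 2, the sketch below builds the command line that would launch an export script and prints its help text if the package is present. The module path `qai_hub_models.models.trocr.export` is an assumption based on the per-model layout of the Qualcomm® AI Hub Models package, and `build_export_command` and `package_installed` are hypothetical helper names; the GitHub repository remains the authoritative source for the actual entry point and its options.

```python
import importlib.util
import subprocess
import sys

# Assumption: the export entry point lives at this module path.
# Confirm against the usage instructions in the GitHub repository.
EXPORT_MODULE = "qai_hub_models.models.trocr.export"


def build_export_command(extra_args=()):
    """Return the argv list that would launch the export module."""
    return [sys.executable, "-m", EXPORT_MODULE, *extra_args]


def package_installed():
    """Return True if the top-level qai_hub_models package is importable."""
    return importlib.util.find_spec("qai_hub_models") is not None


if __name__ == "__main__":
    # --help is used here so that no export options have to be guessed.
    command = build_export_command(["--help"])
    if package_installed():
        subprocess.run(command, check=False)
    else:
        print("qai_hub_models is not installed; would run:", " ".join(command))
```

Listing the options with `--help` first avoids hard-coding flag names that may differ between library versions.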
Model Details
Model Type: Image to text
Model Stats:
- Model checkpoint: trocr-small-stage1
- Input resolution: 320x320
- Number of parameters (TrOCRDecoder): 38.3M
- Model size (TrOCRDecoder) (float): 146 MB
- Number of parameters (TrOCREncoder): 23.0M
- Model size (TrOCREncoder) (float): 87.8 MB
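The float model sizes above can be sanity-checked against the parameter counts. The sketch below assumes 4 bytes per parameter (float32) and assumes that "MB" here means 2^20 bytes; neither is stated in the source, and `float32_size_mib` is a hypothetical helper name.

```python
BYTES_PER_FLOAT32 = 4   # assumption: weights are stored as float32
BYTES_PER_MIB = 1024 ** 2  # assumption: "MB" in the stats means 2**20 bytes


def float32_size_mib(num_params):
    """Approximate weight storage in MiB for a float32 model."""
    return num_params * BYTES_PER_FLOAT32 / BYTES_PER_MIB


if __name__ == "__main__":
    # Parameter counts taken from the Model Stats list.
    for name, params in [("TrOCRDecoder", 38.3e6), ("TrOCREncoder", 23.0e6)]:
        print(f"{name}: about {float32_size_mib(params):.1f} MiB")
```

Under these assumptions the decoder comes out at about 146.1 and the encoder at about 87.7, which is close to the listed 146 MB and 87.8 MB; the small gap is consistent with the parameter counts being rounded.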
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| TrOCRDecoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.154 ms | 0 - 229 MB | NPU |
| TrOCRDecoder | ONNX | float | Snapdragon® X2 Elite | 1.158 ms | 68 - 68 MB | NPU |
| TrOCRDecoder | ONNX | float | Snapdragon® X Elite | 2.323 ms | 67 - 67 MB | NPU |
| TrOCRDecoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 1.488 ms | 0 - 256 MB | NPU |
| TrOCRDecoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 2.146 ms | 0 - 108 MB | NPU |
| TrOCRDecoder | ONNX | float | Qualcomm® QCS9075 | 2.679 ms | 7 - 16 MB | NPU |
| TrOCRDecoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.227 ms | 0 - 250 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.126 ms | 1 - 212 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Snapdragon® X2 Elite | 1.566 ms | 7 - 7 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Snapdragon® X Elite | 2.145 ms | 7 - 7 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 1.385 ms | 0 - 234 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 4.235 ms | 7 - 144 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 2.183 ms | 3 - 5 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® SA8775P | 2.852 ms | 7 - 144 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® QCS9075 | 2.544 ms | 7 - 16 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 2.8 ms | 1 - 215 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® SA7255P | 4.235 ms | 7 - 144 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Qualcomm® SA8295P | 2.948 ms | 7 - 128 MB | NPU |
| TrOCRDecoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.202 ms | 0 - 226 MB | NPU |
| TrOCRDecoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.105 ms | 0 - 223 MB | NPU |
| TrOCRDecoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 1.378 ms | 0 - 236 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 4.26 ms | 0 - 155 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 2.229 ms | 0 - 2 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® SA8775P | 2.886 ms | 0 - 152 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® QCS9075 | 2.571 ms | 0 - 83 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 2.809 ms | 0 - 222 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® SA7255P | 4.26 ms | 0 - 155 MB | NPU |
| TrOCRDecoder | TFLITE | float | Qualcomm® SA8295P | 2.977 ms | 0 - 136 MB | NPU |
| TrOCRDecoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.178 ms | 0 - 237 MB | NPU |
| TrOCREncoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 7.283 ms | 16 - 325 MB | NPU |
| TrOCREncoder | ONNX | float | Snapdragon® X2 Elite | 7.481 ms | 48 - 48 MB | NPU |
| TrOCREncoder | ONNX | float | Snapdragon® X Elite | 18.683 ms | 48 - 48 MB | NPU |
| TrOCREncoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 12.6 ms | 16 - 404 MB | NPU |
| TrOCREncoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 18.208 ms | 0 - 85 MB | NPU |
| TrOCREncoder | ONNX | float | Qualcomm® QCS9075 | 22.038 ms | 15 - 19 MB | NPU |
| TrOCREncoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 8.961 ms | 16 - 331 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 7.261 ms | 2 - 304 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Snapdragon® X2 Elite | 8.001 ms | 2 - 2 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Snapdragon® X Elite | 18.754 ms | 2 - 2 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 12.7 ms | 2 - 382 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 47.585 ms | 2 - 291 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 18.099 ms | 2 - 3 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® SA8775P | 20.841 ms | 2 - 290 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® QCS9075 | 22.164 ms | 2 - 12 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 28.494 ms | 0 - 357 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® SA7255P | 47.585 ms | 2 - 291 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Qualcomm® SA8295P | 25.762 ms | 2 - 290 MB | NPU |
| TrOCREncoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 8.942 ms | 2 - 311 MB | NPU |
| TrOCREncoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.615 ms | 6 - 151 MB | NPU |
| TrOCREncoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 8.5 ms | 3 - 231 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 38.752 ms | 7 - 150 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 12.117 ms | 7 - 9 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® SA8775P | 14.627 ms | 7 - 145 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® QCS9075 | 15.802 ms | 6 - 66 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 22.081 ms | 7 - 345 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® SA7255P | 38.752 ms | 7 - 150 MB | NPU |
| TrOCREncoder | TFLITE | float | Qualcomm® SA8295P | 20.762 ms | 7 - 271 MB | NPU |
| TrOCREncoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 5.848 ms | 7 - 151 MB | NPU |
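The encoder and decoder are profiled separately, so a full recognition pass has to combine the two figures. A minimal sketch, assuming the encoder runs once per image, the decoder runs once per generated token, and each table entry is the time of a single call; pre-processing, post-processing, and data transfer are ignored. `estimate_latency_ms` is a hypothetical helper, and the token count of 20 is an arbitrary example.

```python
def estimate_latency_ms(encoder_ms, decoder_ms, num_tokens):
    """Rough latency: one encoder pass plus one decoder pass per token."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return encoder_ms + decoder_ms * num_tokens


if __name__ == "__main__":
    # TFLITE rows for Snapdragon 8 Elite Gen 5 Mobile from the table above.
    encoder_ms = 4.615
    decoder_ms = 1.105
    total = estimate_latency_ms(encoder_ms, decoder_ms, num_tokens=20)
    print(f"Estimated latency for 20 tokens: {total:.3f} ms")
```

Because the decoder cost scales with output length, the decoder rows tend to dominate for longer text lines even though a single encoder pass is slower than a single decoder pass.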
License
- The license for the original implementation of TrOCR can be found here.
References
- TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
