BiseNet: Optimized for Mobile Deployment

Segment images or video by class in real-time on device

BiSeNet (Bilateral Segmentation Network) is an architecture designed for real-time semantic segmentation. It balances spatial resolution against receptive field by using two branches: a Spatial Path that preserves high-resolution features and a Context Path that provides a sufficiently large receptive field.
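To make the trade-off concrete, the sketch below computes feature-map sizes for the 720x960 input listed on this card. It assumes the downsampling factors described in the BiSeNet paper (1/8 for the Spatial Path, 1/16 and 1/32 for the Context Path), that each stride-2 stage rounds odd sizes up, and that 720 is the height and 960 the width. The helper names are hypothetical and the packaged checkpoint may differ in detail.

```python
def downsample(size: int, stages: int) -> int:
    """Apply `stages` stride-2 reductions, rounding odd sizes up (ceil division)."""
    for _ in range(stages):
        size = (size + 1) // 2
    return size


def feature_map_size(height: int, width: int, factor: int) -> tuple:
    """Feature-map size after downsampling by a power-of-two factor."""
    stages = factor.bit_length() - 1  # 8 -> 3 stages, 16 -> 4, 32 -> 5
    return downsample(height, stages), downsample(width, stages)


# Input resolution from the model card; height/width order is an assumption.
HEIGHT, WIDTH = 720, 960

# Downsampling factors are assumptions taken from the BiSeNet paper.
spatial_path = feature_map_size(HEIGHT, WIDTH, 8)
context_16 = feature_map_size(HEIGHT, WIDTH, 16)
context_32 = feature_map_size(HEIGHT, WIDTH, 32)

print("Spatial Path (1/8): ", spatial_path)
print("Context Path (1/16):", context_16)
print("Context Path (1/32):", context_32)
```

Under these assumptions the Spatial Path output keeps far more spatial positions than the deepest Context Path stage, which is what lets the network recover fine boundaries while the Context Path supplies the wide receptive field.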

This model is an implementation of BiseNet found here.

This repository provides scripts to run BiseNet on Qualcomm® devices. More details on model performance across various devices can be found here.

Model Details

Model Type: Semantic segmentation
Model Stats:
- Model checkpoint: best_dice_loss_miou_0.655.pth
- Inference latency: Real-time
- Input resolution: 720x960
- Number of parameters: 12.0M
- Model size (float): 45.7 MB
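As a sanity check on these stats, the float model size can be approximated from the parameter count. The sketch below assumes 4 bytes per float32 parameter, treats "MB" as MiB, and ignores file-format overhead. The int8 figure is only an estimate for 8-bit weights, not a number reported on this card.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_INT8 = 1
MIB = 1024 * 1024


def weight_size_mib(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in MiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / MIB


num_params = 12.0e6  # "Number of parameters: 12.0M" from the stats above

float_size = weight_size_mib(num_params, BYTES_PER_FLOAT32)
int8_size = weight_size_mib(num_params, BYTES_PER_INT8)  # estimate only

print(f"float32 weights: ~{float_size:.1f} MiB")
print(f"int8 weights:    ~{int8_size:.1f} MiB")
```

The float estimate lands close to the 45.7 MB listed above; the small gap is consistent with the parameter count being rounded to 12.0M.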

Model	Precision	Device	Chipset	Target Runtime	Inference Time (ms)	Peak Memory Range (MB)	Primary Compute Unit	Target Model
BiseNet	float	QCS8275 (Proxy)	Qualcomm® QCS8275 (Proxy)	TFLITE	105.14 ms	32 - 195 MB	NPU	BiseNet.tflite
BiseNet	float	QCS8275 (Proxy)	Qualcomm® QCS8275 (Proxy)	QNN_DLC	105.598 ms	2 - 163 MB	NPU	BiseNet.dlc
BiseNet	float	QCS8450 (Proxy)	Qualcomm® QCS8450 (Proxy)	TFLITE	55.562 ms	32 - 278 MB	NPU	BiseNet.tflite
BiseNet	float	QCS8450 (Proxy)	Qualcomm® QCS8450 (Proxy)	QNN_DLC	56.142 ms	8 - 252 MB	NPU	BiseNet.dlc
BiseNet	float	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	TFLITE	27.367 ms	32 - 35 MB	NPU	BiseNet.tflite
BiseNet	float	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	QNN_DLC	27.633 ms	8 - 10 MB	NPU	BiseNet.dlc
BiseNet	float	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	ONNX	32.615 ms	63 - 86 MB	NPU	BiseNet.onnx.zip
BiseNet	float	QCS9075 (Proxy)	Qualcomm® QCS9075 (Proxy)	TFLITE	36.987 ms	32 - 194 MB	NPU	BiseNet.tflite
BiseNet	float	QCS9075 (Proxy)	Qualcomm® QCS9075 (Proxy)	QNN_DLC	37.039 ms	2 - 163 MB	NPU	BiseNet.dlc
BiseNet	float	SA7255P ADP	Qualcomm® SA7255P	TFLITE	105.14 ms	32 - 195 MB	NPU	BiseNet.tflite
BiseNet	float	SA7255P ADP	Qualcomm® SA7255P	QNN_DLC	105.598 ms	2 - 163 MB	NPU	BiseNet.dlc
BiseNet	float	SA8255 (Proxy)	Qualcomm® SA8255P (Proxy)	TFLITE	27.974 ms	27 - 30 MB	NPU	BiseNet.tflite
BiseNet	float	SA8255 (Proxy)	Qualcomm® SA8255P (Proxy)	QNN_DLC	27.665 ms	8 - 10 MB	NPU	BiseNet.dlc
BiseNet	float	SA8295P ADP	Qualcomm® SA8295P	TFLITE	42.839 ms	32 - 220 MB	NPU	BiseNet.tflite
BiseNet	float	SA8295P ADP	Qualcomm® SA8295P	QNN_DLC	42.808 ms	0 - 187 MB	NPU	BiseNet.dlc
BiseNet	float	SA8650 (Proxy)	Qualcomm® SA8650P (Proxy)	TFLITE	28.254 ms	32 - 38 MB	NPU	BiseNet.tflite
BiseNet	float	SA8650 (Proxy)	Qualcomm® SA8650P (Proxy)	QNN_DLC	27.615 ms	8 - 10 MB	NPU	BiseNet.dlc
BiseNet	float	SA8775P ADP	Qualcomm® SA8775P	TFLITE	36.987 ms	32 - 194 MB	NPU	BiseNet.tflite
BiseNet	float	SA8775P ADP	Qualcomm® SA8775P	QNN_DLC	37.039 ms	2 - 163 MB	NPU	BiseNet.dlc
BiseNet	float	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	TFLITE	19.493 ms	31 - 264 MB	NPU	BiseNet.tflite
BiseNet	float	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	QNN_DLC	19.385 ms	8 - 237 MB	NPU	BiseNet.dlc
BiseNet	float	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	ONNX	26.19 ms	73 - 269 MB	NPU	BiseNet.onnx.zip
BiseNet	float	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	TFLITE	18.526 ms	31 - 261 MB	NPU	BiseNet.tflite
BiseNet	float	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	QNN_DLC	15.886 ms	8 - 208 MB	NPU	BiseNet.dlc
BiseNet	float	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	ONNX	19.385 ms	65 - 205 MB	NPU	BiseNet.onnx.zip
BiseNet	float	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	TFLITE	11.724 ms	30 - 214 MB	NPU	BiseNet.tflite
BiseNet	float	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	QNN_DLC	11.784 ms	8 - 190 MB	NPU	BiseNet.dlc
BiseNet	float	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	ONNX	15.167 ms	73 - 220 MB	NPU	BiseNet.onnx.zip
BiseNet	float	Snapdragon X Elite CRD	Snapdragon® X Elite	QNN_DLC	27.498 ms	8 - 8 MB	NPU	BiseNet.dlc
BiseNet	float	Snapdragon X Elite CRD	Snapdragon® X Elite	ONNX	31.473 ms	66 - 66 MB	NPU	BiseNet.onnx.zip
BiseNet	w8a8	Dragonwing Q-6690 MTP	Qualcomm® Qcm6690	TFLITE	69.16 ms	6 - 183 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Dragonwing Q-6690 MTP	Qualcomm® Qcm6690	QNN_DLC	83.202 ms	2 - 180 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Dragonwing Q-6690 MTP	Qualcomm® Qcm6690	ONNX	232.951 ms	225 - 239 MB	CPU	BiseNet.onnx.zip
BiseNet	w8a8	Dragonwing RB3 Gen 2 Vision Kit	Qualcomm® QCS6490	TFLITE	40.076 ms	7 - 31 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Dragonwing RB3 Gen 2 Vision Kit	Qualcomm® QCS6490	QNN_DLC	34.191 ms	2 - 13 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Dragonwing RB3 Gen 2 Vision Kit	Qualcomm® QCS6490	ONNX	235.308 ms	221 - 234 MB	CPU	BiseNet.onnx.zip
BiseNet	w8a8	QCS8275 (Proxy)	Qualcomm® QCS8275 (Proxy)	TFLITE	20.517 ms	8 - 164 MB	NPU	BiseNet.tflite
BiseNet	w8a8	QCS8275 (Proxy)	Qualcomm® QCS8275 (Proxy)	QNN_DLC	20.047 ms	2 - 157 MB	NPU	BiseNet.dlc
BiseNet	w8a8	QCS8450 (Proxy)	Qualcomm® QCS8450 (Proxy)	TFLITE	15.74 ms	8 - 215 MB	NPU	BiseNet.tflite
BiseNet	w8a8	QCS8450 (Proxy)	Qualcomm® QCS8450 (Proxy)	QNN_DLC	16.14 ms	2 - 204 MB	NPU	BiseNet.dlc
BiseNet	w8a8	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	TFLITE	11.826 ms	8 - 10 MB	NPU	BiseNet.tflite
BiseNet	w8a8	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	QNN_DLC	9.512 ms	2 - 5 MB	NPU	BiseNet.dlc
BiseNet	w8a8	QCS8550 (Proxy)	Qualcomm® QCS8550 (Proxy)	ONNX	8.616 ms	16 - 30 MB	NPU	BiseNet.onnx.zip
BiseNet	w8a8	QCS9075 (Proxy)	Qualcomm® QCS9075 (Proxy)	TFLITE	12.591 ms	8 - 164 MB	NPU	BiseNet.tflite
BiseNet	w8a8	QCS9075 (Proxy)	Qualcomm® QCS9075 (Proxy)	QNN_DLC	10.219 ms	2 - 158 MB	NPU	BiseNet.dlc
BiseNet	w8a8	RB5 (Proxy)	Qualcomm® QCS8250 (Proxy)	TFLITE	165.644 ms	37 - 94 MB	GPU	BiseNet.tflite
BiseNet	w8a8	RB5 (Proxy)	Qualcomm® QCS8250 (Proxy)	ONNX	201.932 ms	211 - 234 MB	CPU	BiseNet.onnx.zip
BiseNet	w8a8	SA7255P ADP	Qualcomm® SA7255P	TFLITE	20.517 ms	8 - 164 MB	NPU	BiseNet.tflite
BiseNet	w8a8	SA7255P ADP	Qualcomm® SA7255P	QNN_DLC	20.047 ms	2 - 157 MB	NPU	BiseNet.dlc
BiseNet	w8a8	SA8255 (Proxy)	Qualcomm® SA8255P (Proxy)	TFLITE	11.844 ms	8 - 10 MB	NPU	BiseNet.tflite
BiseNet	w8a8	SA8255 (Proxy)	Qualcomm® SA8255P (Proxy)	QNN_DLC	9.497 ms	1 - 3 MB	NPU	BiseNet.dlc
BiseNet	w8a8	SA8295P ADP	Qualcomm® SA8295P	TFLITE	14.927 ms	8 - 168 MB	NPU	BiseNet.tflite
BiseNet	w8a8	SA8295P ADP	Qualcomm® SA8295P	QNN_DLC	12.613 ms	2 - 161 MB	NPU	BiseNet.dlc
BiseNet	w8a8	SA8650 (Proxy)	Qualcomm® SA8650P (Proxy)	TFLITE	11.839 ms	8 - 10 MB	NPU	BiseNet.tflite
BiseNet	w8a8	SA8650 (Proxy)	Qualcomm® SA8650P (Proxy)	QNN_DLC	9.527 ms	2 - 4 MB	NPU	BiseNet.dlc
BiseNet	w8a8	SA8775P ADP	Qualcomm® SA8775P	TFLITE	12.591 ms	8 - 164 MB	NPU	BiseNet.tflite
BiseNet	w8a8	SA8775P ADP	Qualcomm® SA8775P	QNN_DLC	10.219 ms	2 - 158 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	TFLITE	8.515 ms	8 - 213 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	QNN_DLC	6.605 ms	2 - 206 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Samsung Galaxy S24	Snapdragon® 8 Gen 3 Mobile	ONNX	5.989 ms	18 - 209 MB	NPU	BiseNet.onnx.zip
BiseNet	w8a8	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	TFLITE	6.555 ms	6 - 171 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	QNN_DLC	5.119 ms	2 - 163 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Samsung Galaxy S25	Snapdragon® 8 Elite For Galaxy Mobile	ONNX	4.805 ms	18 - 163 MB	NPU	BiseNet.onnx.zip
BiseNet	w8a8	Snapdragon 7 Gen 4 QRD	Snapdragon® 7 Gen 4 Mobile	TFLITE	14.954 ms	6 - 177 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Snapdragon 7 Gen 4 QRD	Snapdragon® 7 Gen 4 Mobile	QNN_DLC	12.695 ms	2 - 173 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Snapdragon 7 Gen 4 QRD	Snapdragon® 7 Gen 4 Mobile	ONNX	221.072 ms	214 - 231 MB	CPU	BiseNet.onnx.zip
BiseNet	w8a8	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	TFLITE	5.468 ms	6 - 173 MB	NPU	BiseNet.tflite
BiseNet	w8a8	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	QNN_DLC	4.199 ms	2 - 166 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Snapdragon 8 Elite Gen 5 QRD	Snapdragon® 8 Elite Gen5 Mobile	ONNX	3.775 ms	0 - 149 MB	NPU	BiseNet.onnx.zip
BiseNet	w8a8	Snapdragon X Elite CRD	Snapdragon® X Elite	QNN_DLC	10.144 ms	2 - 2 MB	NPU	BiseNet.dlc
BiseNet	w8a8	Snapdragon X Elite CRD	Snapdragon® X Elite	ONNX	8.664 ms	19 - 19 MB	NPU	BiseNet.onnx.zip
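To read the table in terms of throughput, a latency can be converted to frames per second, and the float and w8a8 rows can be compared as a speedup. The sketch below does this for the Samsung Galaxy S25 rows, with values copied from the table. The helper names are hypothetical, and the frame rates cover model inference only, not pre- or post-processing.

```python
def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a single-inference latency (model only)."""
    return 1000.0 / latency_ms


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """How many times faster the optimized latency is than the baseline."""
    return baseline_ms / optimized_ms


# (precision, runtime) -> inference time in ms, copied from the
# Samsung Galaxy S25 rows of the table above.
galaxy_s25_ms = {
    ("float", "TFLITE"): 18.526,
    ("float", "QNN_DLC"): 15.886,
    ("float", "ONNX"): 19.385,
    ("w8a8", "TFLITE"): 6.555,
    ("w8a8", "QNN_DLC"): 5.119,
    ("w8a8", "ONNX"): 4.805,
}

for runtime in ("TFLITE", "QNN_DLC", "ONNX"):
    float_ms = galaxy_s25_ms[("float", runtime)]
    w8a8_ms = galaxy_s25_ms[("w8a8", runtime)]
    print(
        f"{runtime:8s} float: {frames_per_second(float_ms):6.1f} FPS, "
        f"w8a8: {frames_per_second(w8a8_ms):6.1f} FPS, "
        f"speedup: {speedup(float_ms, w8a8_ms):.2f}x"
    )
```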

Installation

Install the package via pip:

pip install qai-hub-models

Configure Qualcomm® AI Hub Workbench to run this model on a cloud-hosted device

Sign in to Qualcomm® AI Hub Workbench with your Qualcomm® ID. Once signed in, navigate to Account -> Settings -> API Token.

With this API token, you can configure your client to run models on cloud-hosted devices.

qai-hub configure --api_token API_TOKEN

Navigate to docs for more information.

Demo off target

The package contains a simple end-to-end demo that downloads pre-trained weights and runs this model on a sample input.

python -m qai_hub_models.models.bisenet.demo

The above demo runs a reference implementation of pre-processing, model inference, and post-processing.

NOTE: If you are running in a Jupyter Notebook or a Google Colab-like environment, add the following to your cell instead of the command above.

%run -m qai_hub_models.models.bisenet.demo
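For semantic segmentation, post-processing typically reduces per-class scores to one class label per pixel. The sketch below shows that argmax step in plain Python on a tiny hand-made example. The `logits` layout (class, row, column) and the helper name `class_map` are assumptions for illustration, not the demo's actual implementation.

```python
from typing import List

Logits = List[List[List[float]]]  # indexed as [class][row][col]


def class_map(logits: Logits) -> List[List[int]]:
    """Return, for each pixel, the index of the class with the highest score."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [
            max(range(num_classes), key=lambda c: logits[c][row][col])
            for col in range(width)
        ]
        for row in range(height)
    ]


# Toy scores for 3 classes over a 2x2 image.
toy_logits = [
    [[0.9, 0.1], [0.2, 0.3]],   # class 0
    [[0.05, 0.8], [0.1, 0.3]],  # class 1
    [[0.05, 0.1], [0.7, 0.4]],  # class 2
]

segmentation = class_map(toy_logits)
print(segmentation)
```

In practice this step would normally be a single vectorized argmax over the class axis of the output tensor; the nested loops here only spell out what that operation does.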

Run model on a cloud-hosted device

In addition to the demo, you can also run the model on a cloud-hosted Qualcomm® device. This script does the following:

Runs a performance check on a cloud-hosted device.
Downloads compiled assets that can be deployed on-device for Android.
Runs an accuracy check between PyTorch and on-device outputs.

python -m qai_hub_models.models.bisenet.export

How does this work?

This export script leverages Qualcomm® AI Hub to optimize, validate, and deploy this model on-device. Let's go through each step in detail:

Step 1: Compile model for on-device deployment

To compile a PyTorch model for on-device deployment, we first trace the model in memory using torch.jit.trace and then call the submit_compile_job API.

import torch

import qai_hub as hub
from qai_hub_models.models.bisenet import Model

# Load the model
torch_model = Model.from_pretrained()

# Device
device = hub.Device("Samsung Galaxy S25")

# Trace the model with one example tensor per input
input_spec = torch_model.get_input_spec()
sample_inputs = torch_model.sample_inputs()

pt_model = torch.jit.trace(
    torch_model, [torch.tensor(data[0]) for data in sample_inputs.values()]
)

# Compile model on a specific device
compile_job = hub.submit_compile_job(
    model=pt_model,
    device=device,
    input_specs=input_spec,
)

# Get target model to run on-device
target_model = compile_job.get_target_model()
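The performance table lists three target runtimes (TFLITE, QNN_DLC, ONNX). One way to request a specific runtime is through the `options` string of `submit_compile_job`. The sketch below only builds that string. The `--target_runtime` flag and its values are assumptions based on the AI Hub documentation and should be checked against the version you have installed; `compile_options` is a hypothetical helper.

```python
# Table label -> value for the AI Hub `--target_runtime` compile option.
# The option name and values are assumptions; verify them in the AI Hub docs.
TARGET_RUNTIME_OPTION = {
    "TFLITE": "tflite",
    "QNN_DLC": "qnn_dlc",
    "ONNX": "onnx",
}


def compile_options(runtime_label: str) -> str:
    """Build an `options` string for a compile job from a table label."""
    try:
        runtime = TARGET_RUNTIME_OPTION[runtime_label.upper()]
    except KeyError:
        raise ValueError(f"Unknown target runtime: {runtime_label!r}") from None
    return f"--target_runtime {runtime}"


print(compile_options("TFLITE"))
```

The resulting string would be passed as `options=compile_options("TFLITE")` alongside the `model`, `device`, and `input_specs` arguments shown in the compile call.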

Step 2: Performance profiling on cloud-hosted device

After compiling the model in Step 1, you can profile it on-device using the target_model. Note that this script runs the model on a device automatically provisioned in the cloud. Once the job is submitted, you can navigate to the provided job URL to view a variety of on-device performance metrics.

profile_job = hub.submit_profile_job(
    model=target_model,
    device=device,
)
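The profile results can also be inspected programmatically. The sketch below converts a profile summary into the units used in the performance table. It assumes the profile is a dictionary with an `execution_summary` entry holding a time in microseconds and a peak memory in bytes; those key names and units are assumptions to verify against a real job, and the input here is a hypothetical stand-in rather than real output.

```python
from typing import Any, Dict


def summarize_profile(profile: Dict[str, Any]) -> Dict[str, float]:
    """Convert a profile summary to milliseconds and MB.

    Assumes profile["execution_summary"] holds a time in microseconds
    and a peak memory in bytes; verify against a real profile job.
    """
    summary = profile["execution_summary"]
    return {
        "inference_time_ms": summary["estimated_inference_time"] / 1000.0,
        "peak_memory_mb": summary["estimated_inference_peak_memory"] / (1024 * 1024),
    }


# Hypothetical values standing in for the downloaded profile of a job.
fake_profile = {
    "execution_summary": {
        "estimated_inference_time": 15886,  # microseconds (assumed unit)
        "estimated_inference_peak_memory": 200 * 1024 * 1024,  # bytes (assumed unit)
    }
}

result = summarize_profile(fake_profile)
print(result)
```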

Step 3: Verify on-device accuracy

To verify the accuracy of the model on-device, you can run on-device inference on sample input data on the same cloud-hosted device.

input_data = torch_model.sample_inputs()
inference_job = hub.submit_inference_job(
    model=target_model,
    device=device,
    inputs=input_data,
)
on_device_output = inference_job.download_output_data()

With the output of the model, you can compute metrics such as PSNR and relative error, or spot-check the output against the expected output.
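As an illustration of such a comparison, the sketch below computes PSNR and the maximum relative error between two flat sequences of values. The function names, the `data_range` parameter, and the toy values are assumptions for illustration; real outputs would first need to be flattened to matching shapes.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float], data_range: float) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length flat sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs
    return 10.0 * math.log10(data_range ** 2 / mse)


def max_relative_error(
    reference: Sequence[float], candidate: Sequence[float], eps: float = 1e-12
) -> float:
    """Largest element-wise |r - c| / (|r| + eps)."""
    return max(abs(r - c) / (abs(r) + eps) for r, c in zip(reference, candidate))


# Toy values standing in for flattened PyTorch and on-device outputs.
torch_out = [0.0, 0.5, 1.0, 0.25]
device_out = [0.0, 0.49, 1.0, 0.26]

score = psnr(torch_out, device_out, data_range=1.0)
print(f"PSNR: {score:.2f} dB")
print(f"Max relative error: {max_relative_error(torch_out, device_out):.3f}")
```

Higher PSNR means the on-device output is closer to the reference; what counts as acceptable depends on the model and the precision being compared.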

Note: This on-device profiling and inference requires access to Qualcomm® AI Hub Workbench. Sign up for access.

Run demo on a cloud-hosted device

You can also run the demo on-device.

python -m qai_hub_models.models.bisenet.demo --eval-mode on-device

NOTE: If you are running in a Jupyter Notebook or a Google Colab-like environment, add the following to your cell instead of the command above.

%run -m qai_hub_models.models.bisenet.demo -- --eval-mode on-device

Deploying compiled model to Android

The models can be deployed using multiple runtimes:

TensorFlow Lite (.tflite export): This tutorial provides a guide to deploy the .tflite model in an Android application.
QNN (.so export): This sample app provides instructions on how to use the .so shared library in an Android application.

View on Qualcomm® AI Hub

Get more details on BiseNet's performance across various devices here. Explore all available models on Qualcomm® AI Hub.

License

The license for the original implementation of BiseNet can be found here.

References

Community

Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
For questions or feedback please reach out to us.

Paper

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation (arXiv:1808.00897, published Aug 2, 2018)