Tengine is a lightweight, high-performance, modular inference engine for embedded devices, developed by OPEN AI LAB.

Quantization Scheme

  • Full 8-bit integer representation of model weights and computation.

  • Per-tensor quantization of all weights and activations.

  • Asymmetric quantization of all weights and activations in Tengine_u8 mode.

  • Quantization of the model input and de-quantization of the output must be done manually (see the sketch below).
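
For reference, below is a minimal NumPy sketch of the manual input quantization and output de-quantization under the asymmetric uint8 scheme above. The scale and zero-point values are hypothetical placeholders standing in for the parameters dumped during export; they are not part of any Tengine API.

    import numpy as np

    def quantize_uint8(x, scale, zero_point):
        # Asymmetric affine quantization: q = clamp(round(x / scale) + zp, 0, 255)
        q = np.round(x / scale) + zero_point
        return np.clip(q, 0, 255).astype(np.uint8)

    def dequantize_uint8(q, scale, zero_point):
        # Inverse mapping back to float: x = (q - zp) * scale
        return (q.astype(np.float32) - zero_point) * scale

    # Hypothetical parameters; in practice read them from the dumped scale file
    in_scale, in_zp = 0.018, 114
    out_scale, out_zp = 0.11, 121

    x = np.random.randn(1, 3, 224, 224).astype(np.float32)
    q_in = quantize_uint8(x, in_scale, in_zp)         # feed this to the uint8 graph
    # q_out = <uint8 output of the graph>
    # y = dequantize_uint8(q_out, out_scale, out_zp)  # recover float predictions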

More details can be found at https://github.com/OAID/Tengine

Deploy on Tengine


  • Compile and install the Tengine toolkit from Tengine (https://github.com/OAID/Tengine)

  • Install the Tengine Python API from Pytengine


We provide an example of deploying the quantized model to Tengine with asymmetric quantization.

  • First, export the quantized model to ONNX [mqbench_qmodel_for_tengine.onnx] and dump the activation quantization parameters [mqbench_qmodel_for_tengine.scale].

    python main.py -a [model_name] --resume [model_save_path] --deploy --backend tengine_u8
  • Second, convert the .onnx file into the .tmfile format supported by Tengine (https://tengine-docs.readthedocs.io/en/latest/user_guides/convert_tool.html).

    tm_convert_tool -f onnx -m [mqbench_qmodel_for_tengine.onnx] -o [xxxx.tmfile]
  • Third, quantize the .tmfile with mqbench_qmodel_for_tengine.scale (ref: https://tengine-docs.readthedocs.io/en/latest/user_guides/quant_tool_uint8.html).

    quant_tool_uint8 -m [xxx.tmfile] -o [xxxx_u8.tmfile] -i ./ -f [mqbench_qmodel_for_tengine.scale]
  • Finally, validate with pytengine (optional); see the sketch after this list.

    python eval_tengine.py --dataset [path to dataset] -m [xxxx_u8.tmfile]
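
For reference, below is a minimal sketch of uint8 inference through pytengine, assuming the Graph/Tensor interface used in the Pytengine examples; the model path, input shape, and quantization parameters are illustrative assumptions, and the manual quantize/de-quantize steps mirror the sketch in the Quantization Scheme section.

    import numpy as np
    from tengine import tg

    graph = tg.Graph(None, 'tengine', 'xxxx_u8.tmfile')  # quantized model file
    input_tensor = graph.getInputTensor(0, 0)
    input_tensor.shape = [1, 3, 224, 224]                # assumed NCHW input shape
    graph.preRun()

    # Manually quantize the float input (hypothetical scale/zero-point values)
    x = np.random.randn(1, 3, 224, 224).astype(np.float32)
    in_scale, in_zp = 0.018, 114
    q_in = np.clip(np.round(x / in_scale) + in_zp, 0, 255).astype(np.uint8)

    input_tensor.buf = q_in
    graph.run(1)                                         # 1 = blocking run

    # Manually de-quantize the uint8 output (hypothetical scale/zero-point values)
    q_out = graph.getOutputTensor(0, 0).getNumpyData()
    out_scale, out_zp = 0.11, 121
    y = (q_out.astype(np.float32) - out_zp) * out_scale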