DeePhi's Quantization Tool
Product Overview
This is a model quantization tool for convolutional neural networks (CNNs). It can quantize both weights/biases and activations from 32-bit floating-point (FP32) format to 8-bit integer (INT8) format, or to other bit depths. With this tool, you can significantly boost inference performance and efficiency while maintaining accuracy. The tool supports common layer types, including convolution, pooling, fully-connected, batch normalization, and others. Quantization requires neither retraining of the network nor a labeled dataset; only one batch of input images is needed. Processing time ranges from a few seconds to several minutes depending on the size of the network, which makes rapid model updates possible. The tool is co-optimized for the DeePhi DPU and generates the INT8-format model files required by DNNC. This version of DECENT supports only the Caffe model format (.caffemodel), so model conversion is usually necessary for other frameworks.
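To illustrate the idea of FP32-to-INT8 conversion, here is a minimal sketch of symmetric per-tensor linear quantization. This is a generic textbook scheme, not DECENT's actual algorithm (which is not documented here and may differ, e.g. in how it chooses scales or calibrates activations); the function name and interface are hypothetical.

```python
import numpy as np

def quantize_int8(weights, bit_width=8):
    """Symmetric linear quantization of an FP32 tensor to signed integers.

    Generic illustration only -- DECENT's real quantization scheme may differ.
    """
    qmax = 2 ** (bit_width - 1) - 1            # 127 for INT8
    scale = np.max(np.abs(weights)) / qmax     # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Example: quantize a small weight array and reconstruct the FP32 values.
w = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
recovered = q.astype(np.float32) * s           # dequantized approximation of w
```

At inference time, the integer tensor and its scale stand in for the original FP32 weights, which is what enables the speed and memory savings described above.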
Version
By
DeePhi Tech
Categories
Operating System
Linux/Unix, Ubuntu 4.4.0-1062-aws
Delivery Methods