Adlik, an LF AI & Data Foundation Incubation-Stage Project, has released version 0.4.0, called Deer. Adlik is a toolkit for accelerating deep learning inference: it provides end-to-end support for bringing trained models into production and eases the learning curve across different inference frameworks. In Adlik, the Model Optimizer and Model Compiler deliver optimized, compiled models for a given hardware environment, and the Serving Engine provides deployment solutions for cloud, edge, and device.
This release includes a number of technical explorations, such as multi-teacher distillation and Zen-NAS optimization, and it is now much easier to use Adlik with more runtimes on more hardware. In Adlik's inference optimization of the BERT model, Ansor is used to search globally for the optimal tensor scheduling, a dedicated scheduler is provided to optimize dynamic-shape inference, and the result is higher throughput on x86 CPUs than OpenVINO. A few highlights of the new release include the following:
Compiler
- Adlik compiler supports OpenVINO INT8 quantization.
- Adlik compiler supports TensorRT INT8 quantization, including an extended quantization calibrator for TensorRT that reduces the accuracy drop caused by quantization (see the sketch below).
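For readers unfamiliar with TensorRT calibration, the sketch below shows the standard INT8 entropy-calibrator interface (TensorRT 7.x Python API) that an extended calibrator builds on. This is a minimal illustration under assumed inputs, not Adlik's implementation; the calibration batches are assumed to be supplied by the caller as equally shaped float32 arrays.

```python
# Minimal sketch of a TensorRT INT8 entropy calibrator.
# Illustrative only; not Adlik's extended calibrator.
import os

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, batches, cache_file="calibration.cache"):
        super().__init__()
        self.batches = iter(batches)   # iterable of equally shaped float32 arrays
        self.cache_file = cache_file
        first = next(self.batches)
        self.batch_size = first.shape[0]
        self.device_input = cuda.mem_alloc(first.nbytes)
        self._pending = first          # feed the first batch on the first call

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        batch = self._pending if self._pending is not None else next(self.batches, None)
        self._pending = None
        if batch is None:
            return None                # no more data: calibration is finished
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
        return [int(self.device_input)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

The calibrator is attached to the builder configuration (`config.set_flag(trt.BuilderFlag.INT8)` and `config.int8_calibrator = EntropyCalibrator(batches)`); TensorRT then runs the calibration batches to derive per-tensor dynamic ranges and caches the result for reuse.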
Optimizer
- Support a multi-teacher distillation method, which uses multiple teacher networks to guide the distillation optimization (see the sketch after this list).
- Support Zen-NAS search enhancements, including parallel training, search-acceleration optimizations, and fixes for bugs in the original implementation. Search time is reduced by about 15% while the search score is slightly improved, and the training accuracy of the searched models is increased by 0.2% to 1%.
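The release notes do not spell out the distillation recipe, but the general multi-teacher idea can be sketched as follows: the student is trained against the hard labels plus soft targets averaged over several teachers. The equal teacher weighting, temperature, and mixing factor below are illustrative assumptions, not Adlik's actual configuration.

```python
# Minimal sketch of a multi-teacher distillation loss in PyTorch.
# Equal weighting, temperature, and alpha are illustrative assumptions.
import torch
import torch.nn.functional as F


def multi_teacher_distillation_loss(student_logits, teacher_logits_list,
                                    labels, temperature=4.0, alpha=0.7):
    """Combine hard-label cross-entropy with soft-target KL terms
    averaged over several teacher networks (logits on the same batch)."""
    hard_loss = F.cross_entropy(student_logits, labels)

    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_loss = 0.0
    for teacher_logits in teacher_logits_list:
        p_teacher = F.softmax(teacher_logits.detach() / temperature, dim=-1)
        # KL(teacher || student), scaled by T^2 as in Hinton et al.
        soft_loss = soft_loss + F.kl_div(
            log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
    soft_loss = soft_loss / len(teacher_logits_list)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```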
Serving Engine
- Support the Paddle Inference runtime. Paddle-format models no longer need to be converted through ONNX components; users can run inference directly in the Adlik environment (see the sketch after this list).
- Support inference on Intel TGL-U i5 devices, with completed benchmark tests across several models.
- Docker images for cloud-native environments support the newest versions of inference components, including:
(1) OpenVINO: 2021.4.582
(2) TensorFlow: 2.6.2
(3) TensorRT: 7.2.1.6
(4) TF Lite: 2.4.0
(5) TVM: 0.7
(6) Paddle Inference: 2.1.2
- Introduce a C++ version of the Client API, which supports CMake and Bazel compilation and makes deployment convenient in C/C++ scenarios.
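As a point of reference, running a Paddle-format model directly against the Paddle Inference API (Paddle 2.x) looks like the sketch below. The file paths and input shape are placeholders; Adlik's Serving Engine wraps this runtime, so this shows the underlying API rather than Adlik's own client interface.

```python
# Sketch of direct inference on a Paddle-format model with the
# Paddle Inference API. Paths and input shape are placeholders.
import numpy as np
from paddle.inference import Config, create_predictor

config = Config("model.pdmodel", "model.pdiparams")
config.disable_gpu()                   # CPU inference for this sketch
predictor = create_predictor(config)

# Feed one dummy batch through the model.
input_name = predictor.get_input_names()[0]
input_handle = predictor.get_input_handle(input_name)
data = np.random.rand(1, 3, 224, 224).astype("float32")
input_handle.reshape(list(data.shape))
input_handle.copy_from_cpu(data)

predictor.run()

output_name = predictor.get_output_names()[0]
output = predictor.get_output_handle(output_name).copy_to_cpu()
print(output.shape)
```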
Benchmark Test
- Complete benchmark tests of ResNet-50, YOLO v3/v4, Fast R-CNN, Mask R-CNN, and other models on an Intel TGL-U i5 device, covering latency, throughput, and other performance indicators under GPU video decoding.
The Adlik Project invites you to adopt or upgrade to Deer, version 0.4.0, and welcomes feedback. To learn more about the Adlik 0.4.0 release, check out the full release notes. Want to get involved with Adlik? Join the Adlik-Announce and Adlik Technical-Discuss mailing lists to connect with the community and stay up to date on the latest news.
LF AI & Data Resources
- Learn about membership opportunities
- Explore the interactive landscape
- Check out our technical projects
- Join us at upcoming events
- Read the latest announcements on the blog
- Subscribe to the mailing lists
- Follow us on Twitter or LinkedIn
- Access other resources on LF AI & Data’s GitHub or Wiki