Intel’s New Tool for Programming FPGAs…for Artificial Intelligence

July 3rd, 2018

By Lynnette Reese, Editor-in-Chief, Embedded Intel Solutions

Accelerate AI inference. The OpenVINO tool optimizes a deep neural network (DNN) model, trained in TensorFlow or Caffe, for running on an Intel CPU, GPU, Movidius VPU, or FPGA. The single-shot command-line utility can also convert a model targeted at one Intel AI platform for use on another, enabling quick experimentation among platforms. And it’s free.

Intel® has a toolkit called OpenVINO™ that can take a deep neural network (DNN) model trained in TensorFlow™, Caffe, or MXNet and prepare it for any Intel AI hardware platform in a few seconds. The command-line tool optimizes the neural network model and converts it for use on any other Intel AI hardware platform, including Field Programmable Gate Arrays (FPGAs).

The OpenVINO tool, in a nutshell, provides a near-instant conversion of the AI training tool’s output (the model) so it is compatible with any Intel AI HW engine.

Various AI accelerators are available on the market today: GPUs, high-performance processors, FPGAs, and custom chips (i.e., ASICs). Accelerators help in AI by rapidly performing numerous calculations, offloading the main processor. FPGAs hold several advantages, however, given that AI is changing rapidly and constant hardware optimization is a necessity in many areas.

Tony Kau, Marketing Director, Artificial Intelligence, Software and IP Solutions at Intel, states, “FPGAs bring a lot of value at the hardware performance level. FPGAs accommodate many new topologies and primitives in AI that are coming into the space today. FPGAs also bring the ability at a high level to do a lot of optimization around precision bit width and then to really achieve the power performance balance that a customer wants. FPGAs bring a low latency advantage combined with extraordinary flexibility.”

Figure 1: Intel’s free OpenVINO tool optimizes a model for running on an Intel CPU, GPU, VPU (Movidius), or FPGA. OpenVINO, with a single API that works across all Intel accelerators, allows flexibility with customizations in C++ and OpenCL languages. (Source: Yury Gorbachev, Intel, courtesy Embedded Vision Summit).

Historically, FPGAs have been complicated to program, with a steeper learning curve than traditional programming. However, the programming scene for FPGAs is improving. Recently, Intel® released a development tool that takes a neural network model from any of several deep learning training frameworks and prepares it to run on any Intel AI hardware engine, including FPGAs. The Open Visual Inference & Neural Network Optimization (OpenVINO™) toolkit performs a one-time, offline, seconds-long conversion. OpenVINO has a single API across all types of Intel standard HW targets and accelerators. You can use OpenVINO to convert a TensorFlow™, MXNet, or Caffe model to a format that works on Intel standard HW targets, including CPUs and CPUs with integrated graphics, as well as accelerators such as Vision Processing Units (VPUs) featuring Movidius, and FPGAs.
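
To give a sense of how simple the conversion step is, a trained TensorFlow model can be converted with a single invocation of the toolkit’s Model Optimizer. The command below is a rough sketch; the script name and flags follow Intel’s documentation at the time of writing, and the input file name is only a placeholder:

    python mo_tf.py --input_model frozen_model.pb --output_dir ./ir

The output is an intermediate representation (IR): a pair of .xml (topology) and .bin (weights) files that the toolkit’s inference engine consumes, regardless of which framework produced the original model.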

Figure 2: Tony Kau is the Software and AI Marketing Director in charge of Product Line Management and Marketing of PSG’s Artificial Intelligence and Workload Acceleration software products at Intel. Kau earned a bachelor’s and master’s degree in electrical engineering from the University of Southern California and an MBA from INSEAD.

Adam Burns, Director of Computer Vision and Digital Surveillance at Intel Corporation, describes how OpenVINO works. “You can train a neural network using TensorFlow or one of the other frameworks, which produces a model. OpenVINO optimizes the model. The result allows you to target various Intel hardware engines that you can choose from to run the model on. OpenVINO optimizes the output for running on an Intel CPU, GPU, Movidius, or FPGA.” OpenVINO’s single API works across all Intel accelerators, so developers do not need to redesign applications for deployment on different targets and can quickly experiment to find the best fit on actual hardware. One of the most important features of the OpenVINO toolkit is the “model zoo,” which contains public, free, optimized models. Developers can use these models for rapid prototyping and to expedite development and production of applications without having to search for, or train, models of their own.
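
To illustrate the single API Burns describes, here is a minimal Python sketch using the inference engine bindings that shipped with the 2018 releases. Class names such as IENetwork and IEPlugin have changed in later versions, and the model file names are placeholders, so treat this as illustrative rather than canonical:

    import numpy as np
    from openvino.inference_engine import IENetwork, IEPlugin

    # Load the intermediate representation (IR) produced by the Model Optimizer;
    # the file names are placeholders for whatever the conversion step emitted.
    net = IENetwork(model="my_model.xml", weights="my_model.bin")

    # Target the host CPU for this example; other plugins cover GPU, VPU, and FPGA.
    plugin = IEPlugin(device="CPU")
    exec_net = plugin.load(network=net)

    # Feed a dummy tensor sized to match the hypothetical model's single input.
    input_name = next(iter(net.inputs))
    output_name = next(iter(net.outputs))
    dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
    result = exec_net.infer({input_name: dummy})
    print(result[output_name].shape)

The same few lines apply whether the IR came from your own training run or from one of the pre-optimized model zoo entries.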

The time required to pre-process a large model with OpenVINO is measured in seconds, not minutes, and no model re-training is required. OpenVINO is continuously benchmarked on a wide array of deep learning models (150+), and its outputs are checked for both accuracy and functionality across all targets and accelerators. OpenVINO allows flexibility with customizations in the C++ and OpenCL languages and provides a comprehensive validation suite as well. (See Figure 1.)

OpenVINO is a one-step, command-line-driven process, and changes to a model fit naturally into the workflow. If you need to change your model, you go back into your deep learning training framework, change the model, and then convert the new model with OpenVINO again. Importantly, OpenVINO does not require the original training framework in order to execute.

Figure 3: The Intel Movidius™ Myriad™ X Vision Processing Unit (VPU) is the world’s first system-on-chip shipping with a dedicated Neural Compute Engine (NCE) for accelerating deep learning inference at the edge. The NCE is an on-chip hardware block specifically designed to run deep neural networks at high speed and low power without compromising accuracy, enabling devices to see, understand, and respond to their environments in real time. (Source: Intel Corp.)

Neural networks come in different sizes and require different amounts of memory, so the OpenVINO tool makes it convenient to choose only as much hardware as you need, be it a CPU, an FPGA, or a Movidius processor. The Intel Movidius, for instance, is ideal for ultra-low-power applications that use smaller models and execute in a smaller footprint. Developers might find that their existing model, once it is optimized using OpenVINO, fits in a smaller accelerator, saving considerably on hardware cost.
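
Because the target is just a parameter passed to the inference engine, right-sizing the hardware does not mean rewriting the application. Reusing the sketch above, with device names taken from Intel’s documentation (exact strings can vary by release), only the device argument changes:

    # Same application code, different target: only the device string changes.
    plugin = IEPlugin(device="MYRIAD")            # Movidius VPU for ultra-low-power designs
    # plugin = IEPlugin(device="HETERO:FPGA,CPU") # FPGA with CPU fallback for unsupported layers
    exec_net = plugin.load(network=net)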

Optimizing for Power
Optimizing models using OpenVINO is a snap. For example, by applying quantization and converting a model from 32-bit floating point (FP32) to 16-bit floating point (FP16), one can make the model smaller, lower the required compute, and thus save a lot of power. In some cases, such optimization costs less than 1% in accuracy. In future releases, the optimization capabilities of OpenVINO will expand significantly, with new functionality around quantization, ternarization, binarization, pruning, and sparsity.
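
Precision is chosen at conversion time. Assuming the same hypothetical Model Optimizer invocation shown earlier, an FP16 version of the model is produced by adding one flag:

    python mo_tf.py --input_model frozen_model.pb --data_type FP16 --output_dir ./ir_fp16

The resulting IR stores half-precision weights, roughly halving the model’s memory footprint, which is where much of the power savings comes from.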

As Burns points out, “Developers can fit the compute they need into a relatively inexpensive and low-power device. That kind of conversion power in a free tool has generated high interest, so a lot of developers are looking at OpenVINO as a way to easily add capability to an existing system. OpenVINO opens up a whole new world of options to design around because you can just plug in that little low-power, low-cost device and add the benefit of an AI engine to your system.”

Burns is on to something, because according to Tractica, a market research firm, “The edge computing market, where AI computation is done on the device, is expected to represent more than three-quarters of the total market opportunity, with the balance being in cloud/data center environments. Mobile phones will be a major driver of the edge market, and other prominent edge categories include automotive, smart cameras, robots, and drones.”

OpenVINO is a free tool offered by Intel under a mixture of licenses, including the Apache license, depending upon the model. The OpenVINO toolkit is available for 64-bit development platforms running Windows 10 or Linux (Ubuntu 16.04.3 LTS and CentOS 7.4). Supported 64-bit target operating systems include Windows 10 and the Yocto Project’s Poky Jethro v2.0.3, Ubuntu 16.04.3 LTS, and CentOS 7.4 Linux distributions.


Lynnette Reese is Editor-in-Chief, Embedded Intel Solutions and Embedded Systems Engineering, and has been working in various roles as an electrical engineer for over two decades. She is interested in open source software and hardware, the maker movement, and in increasing the number of women working in STEM so she has a greater chance of talking about something other than football at the water cooler.