Automotive Industry Eyes Low-Cost, Low-Latency PCIe for Sharing Resources

September 18th, 2018

By Lynnette Reese, Editor-in-Chief, Embedded Systems Engineering

The reliable, low-latency, and cost-effective PCI Express interconnect is looking attractive to the automotive industry. Advanced features enable connecting processors, NVMe drives, and GPUs together to usher in next-generation multimedia and autonomous vehicles.

The automotive industry is eyeing Peripheral Component Interconnect Express (PCIe) as a reliable, high-performance, low-latency, low-power, and low-cost solution to rapidly transfer data directly between PCIe-connected devices such as CPUs, GPUs, FPGAs, I/O, and NVMe solid-state storage. Several factors are driving this move to PCIe, including latency, throughput, and reliability.

Figure 1: The automotive industry is eyeing PCIe as a means to lend resources (e.g., NVMe storage) on a PCIe network.

Latency in applications like autonomous vehicles is critical; the response to the identification of a hazard must be immediate. Fast processors like GPUs are a vital resource in Artificial Intelligence (AI), including computer vision, but reliably transferring data with as little delay and overhead as possible is critical. PCIe delivers sub-microsecond latencies between automotive components.

Made to efficiently move data within systems, PCIe is a high-throughput interconnect with high-speed lanes. Gen 3 PCIe runs at 8 GT/s, or roughly 985 MB/s of effective throughput per lane. Up to 16 lanes can be combined into a single link, delivering 15.75 GB/s. Gen 4 PCIe is coming soon to double this per-lane transfer rate, and Gen 5 PCIe will double it again within a few years.
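The per-lane and per-link figures above follow directly from the signaling rate and the line encoding. The short sketch below works through that arithmetic for Gen 3 and later generations (which use 128b/130b encoding, per the published PCIe specifications); it reproduces the article's 985 MB/s and 15.75 GB/s numbers.

```python
# Effective PCIe throughput from raw signaling rate and line-code overhead.
# Gen 3 and later use 128b/130b encoding; rates are GT/s per lane.
GEN_RATES_GT = {3: 8.0, 4: 16.0, 5: 32.0}

def lane_throughput_mb(gen):
    """Effective payload rate per lane in MB/s after encoding overhead."""
    gt = GEN_RATES_GT[gen]
    efficiency = 128 / 130          # 128b/130b encoding (Gen 3+)
    return gt * 1e9 * efficiency / 8 / 1e6   # bits/s -> bytes/s -> MB/s

def link_throughput_gb(gen, lanes=16):
    """Aggregate link rate in GB/s for a given lane count."""
    return lane_throughput_mb(gen) * lanes / 1000

# Gen 3: ~985 MB/s per lane, ~15.75 GB/s for an x16 link;
# each later generation doubles the per-lane rate.
```

As the doubling of the signaling rate shows, a Gen 4 x16 link reaches roughly 31.5 GB/s and Gen 5 roughly 63 GB/s, before protocol (TLP header) overhead, which the sketch does not model.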

PCIe is also a reliable interconnect, with built-in error checking and guaranteed delivery of data. This is all done at very low power levels. The combination of these features makes PCIe a viable solution for automotive applications.

A major component of autonomous driving is AI models that can correctly identify obstacles, signals, signs, and lane markings. The AI makes inferences about what its cameras see and whether the gathered images fit a given model, then decides on actions to take—a chain of steps that must take place in microseconds. Processors are fast, but connections between devices, such as processors and storage, cannot be the weak link in the real-time identification and action chain. Ethernet has long been a communications backbone in the automotive world, but Ethernet requires a protocol stack that adds both CPU overhead and communication latency, whereas PCIe devices can communicate directly peer to peer without using memory or CPU resources. PCIe’s advanced features can connect high-performance processors to computing resources with the low latency needed for high-level ADAS and autonomous vehicles. The additional benefits of flexibility and low cost make it worth a second look, especially for the cost-constrained automotive industry.

PCIe: Low-cost Connection of CPUs to Storage, GPUs, FPGAs, and More
PCIe offers a high level of scalability and modular support for AI platforms. You can create a high-performance, scalable machine learning platform using the latest PCIe technology. PCIe is a mature interconnect technology that has seen regular updates through generations, keeping up with surrounding technology. PCIe 4.0 products are entering the market, and the PCIe Gen 5 standard is expected to be released in 2019.

A key advanced feature in data movement is PCIe’s use of internal Direct Memory Access (DMA) engines, which offload data transfers from the processor. Taking advantage of DMA and other advanced features within PCIe requires software. This software, which builds on the non-transparent bridging function within PCIe, creates an advanced PCIe fabric. The fabric offers flexibility and enables advanced applications: building “composable architectures” that can add and remove devices across multiple hosts within a PCIe fabric, connecting multiple CPUs, GPUs, FPGAs, and NVMe SSDs through a PCIe switch, or sharing an NVMe drive among multiple CPUs or GPUs for data storage. The result is a PCIe fabric that can dynamically assign resources to the systems or applications that need them, moving a resource from one system to another in microseconds.

If an advanced PCIe fabric sounds too good to be true, the catch is a learning curve: the advanced features require PCIe hardware to be coupled with advanced software development. Chip makers typically provide some software modules and drivers but do not offer a complete solution, leaving developers to work out the implementation themselves. Making things especially tricky, the current U.S. economy, with its low unemployment rate, makes such talent challenging to find.

However, a Norwegian company, Dolphin Interconnect Solutions, has already developed software that provides a near-turnkey solution, cutting time-to-market and reducing labor requirements. Its eXpressWare Software Suite offers an elegant way to create composable architectures and take advantage of PCIe’s other advanced features. The suite includes a SmartIO module that enables features such as surprise hot-add of devices, sharing of NVMe drives among multiple CPUs, and “device lending software.” Device lending harnesses the benefits of the advanced features in PCIe: engineers can reconfigure systems and reallocate resources within the PCIe fabric without having to piece together modules and drivers or having to manipulate advanced PCIe capabilities directly.

Figure 2: Composable architecture creates a pool of resources that can be deployed on demand.

Device lending seamlessly manages a pool of devices while maximizing resources. Device lending achieves both extremely low computing overhead and low latency without requiring any application-specific distribution mechanisms. With device lending, a remote IO resource appears to be local to another device that needs it. CPUs, GPUs, FPGAs, SoCs, NVMe storage drives, and other components in a typical PCIe network fabric “can be added or removed without having to be physically installed in a particular system on the network.”
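Conceptually, device lending behaves like a checkout system over a shared pool: a device is borrowed by a host, appears in that host's local PCIe device tree for as long as needed, then returns to the pool. The sketch below is a hypothetical Python model of that borrow/return life cycle; the class and method names are invented for illustration and are not Dolphin's eXpressWare API.

```python
class DevicePool:
    """Hypothetical model of a PCIe fabric resource pool (illustrative only;
    not Dolphin's actual API). A borrowed device appears local to the
    borrowing host and returns to the pool when released."""

    def __init__(self, devices):
        self.free = set(devices)   # devices available for lending
        self.borrowed = {}         # device -> borrowing host

    def borrow(self, device, host):
        if device not in self.free:
            raise RuntimeError(f"{device} is lent to {self.borrowed[device]}")
        self.free.remove(device)
        self.borrowed[device] = host   # appears in host's PCIe device tree
        return device

    def release(self, device):
        host = self.borrowed.pop(device)   # hot-remove from the borrower
        self.free.add(device)              # back in the pool for reassignment
        return host

# A vision ECU temporarily borrows the fabric's GPU, then returns it.
pool = DevicePool({"nvme0", "gpu0", "fpga0"})
pool.borrow("gpu0", "ecu-vision")
pool.release("gpu0")
```

The point of the model is the ownership discipline, not the mechanics: in a real fabric the "insert" and "remove" steps are hot-plug events handled by the PCIe software stack, with no physical re-cabling.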

SmartIO: Part of PCIe’s Ever-Expanding Features and Capabilities
Intel Corporation introduced the original PCI bus specification in the early 1990s. Shortly after that, the PCI Special Interest Group (PCI-SIG) was formed to help computer manufacturers comply with the specification. The first PCI Express (PCIe) standard was released in 2003, with Gen 4.0 announced in June 2017. Dolphin Interconnect Solutions has been involved with industry standards (including PCI, ASI, and PCIe) since the 1990s. The eXpressWare SmartIO software is a turnkey tool that supports multiple processor architectures, such as Arm and x86. SmartIO is a flexible and straightforward method of creating a pool of devices that maximizes usage by allowing virtualization within the PCIe fabric. The SmartIO software has a number of features, including a low-level API (called SISCI) that enables custom drivers for sharing traditional devices such as NVMe drives, as well as an advanced sharing mechanism called Device Lending.

Device Lending with a PCIe Fabric
Many systems do not take advantage of PCIe’s advanced features as laid out in the PCIe standard. However, PCIe is a stable, standard technology that is widely implemented and has been steadily improved for over two decades. PCIe has demonstrated latencies as low as 300 ns end-to-end and dominates I/O bus technology, seeing prolific use across many industries. PCIe can create a reliable network with low-latency performance in applications that require real-time response.

Device lending is an advanced application of the PCIe standard within the SmartIO suite. It works transparently across the PCIe fabric, between devices and modules, with no modifications to drivers, operating systems, or existing software applications. Device lending enables a processor, for instance, to have temporary access to a PCIe-connected device located remotely on a PCIe network, all while preserving performance. Accessing a remote device is similar in latency to accessing a local device, since there is no software overhead in data transfers. (Recall that Ethernet adds protocol overhead, and thus latency, to every data transfer.) Devices can be borrowed temporarily by any application or system within the fabric, for as long as necessary. When a resource is no longer needed, it is returned to local use or allocated to another system.

Figure 3: Device lending leverages the advanced features of PCIe, so integrating with other APIs, special bootloaders, and power sequencing are not needed. Note: NTB = non-transparent bridge.

You can control the Dolphin device lending software with a set of command line tools and options, used directly or integrated into any higher-level resource management system. The eXpressWare SmartIO software does not require any particular boot order or power-on sequence. Device lending software does not require changes to the Linux kernel, either; it’s not that complicated. A “borrowed device” gets inserted into the local PCIe device tree, and a transparent device driver receives a “hot-plug” event, signaling that a new resource is available. According to Dolphin, “If the transparent driver needs to re-map a DMA window, the re-map will be performed locally at the borrowing side, very similar to what happens in a virtualized system. The actual performance is system- and device-dependent.”

Dolphin’s device lending strategy does not require explicit integration into a unified Application Programming Interface (API), since it works by taking advantage of the inherent properties of PCIe’s advanced features. For detailed information about how device lending works, refer to the whitepaper, Device Lending in PCI Express Networks, by Lars Kristiansen et al. (PDF).

PCIe is reliable, with a long history of steady updates. The PCIe standard has demonstrated consideration for backward compatibility while advancing connectivity in several areas: lower latency, improved cost-effectiveness, and higher throughput. Backward compatibility better accommodates the long product development cycles typical in the automotive industry. Capabilities such as device lending show that the PCIe ecosystem aims for utility and flexibility, pushing beyond the bandwidth increases of each release. The automotive industry has long been attracted to low-cost, power-efficient solutions but is also facing pressure to compete with the electronic features that smartphones offer. As a low-cost, low-latency interconnect that allows virtualization of resources, PCIe is an excellent choice for the automotive industry as a new network communications path.

Lynnette Reese is Editor-in-Chief, Embedded Intel Solutions and Embedded Systems Engineering, and has been working in various roles as an electrical engineer for over two decades.