Collaboration Proposal
In-band Telemetry, VM latency measurements and sFlow acceleration
February 15, 2018
To Whom It May Concern,
This document is a formal proposal from Napatech to ONF/CORD and OPNFV to extend the Barometer project with In-band Telemetry (INT) and VNF latency measurements via Open vSwitch (OVS), along with hardware-accelerated sFlow within OVS.
Rationale
VNF latency monitoring gives the operator better insight into specific VNF behavior and can help ensure optimal operation by detecting possible VNF bottlenecks. Furthermore, hardware-accelerated sFlow generation will reduce the latency from generation to collection and offload CPU cores.
Scope
Napatech is proposing two solutions for VNF latency monitoring:
• Enable INT from Trellis into OVS, which will provide end-to-end INT
  o INT SRC, SINK and PROXY will be developed both in SW and as an offload in the Napatech NIC HW. The PROXY is needed because OVS terminates the VXLAN/GENEVE tunnel and because VNFs might not support INT. The PROXY could potentially only need to convert from VXLAN -> TCP options -> VXLAN, but if that is not possible, a caching mechanism would be required.
• Enable VNF latency measurements based on packet fingerprint
  o This involves making a fingerprint of the packet and storing a timestamp in a static cache indexed by the fingerprint. The latency is measured as follows: the time is stored when the packet enters the VNF and read when the packet exits the VNF. The reported latency will be an average over a configured time window.
  o The Napatech NIC will offload this functionality to HW. It is expected that an OVS SW version will be supported as a minimum requirement, so that fingerprint-based latency measurements are not affected if another NIC is used.
The solutions will have a collectd plugin enabling Barometer support.
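The fingerprint-based measurement described above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the proposed OVS or NIC implementation: the class name, the choice of BLAKE2b over the whole packet as the fingerprint, the cache size and the window length are all hypothetical. In particular, a real design would have to fingerprint only fields that the VNF leaves unchanged.

```python
import hashlib
from collections import deque


class FingerprintLatencyMonitor:
    """Hypothetical software model of fingerprint-based VNF latency measurement."""

    def __init__(self, cache_slots=1024, window_ns=1_000_000_000):
        self.cache_slots = cache_slots
        self.window_ns = window_ns
        # Static (fixed-size) cache: one (fingerprint, ingress_time) entry per slot.
        self.cache = [None] * cache_slots
        # (egress_time, latency) samples that feed the time-window average.
        self.samples = deque()

    @staticmethod
    def fingerprint(packet: bytes) -> int:
        # Assumption: hash the whole packet; a real design would hash invariant fields only.
        return int.from_bytes(hashlib.blake2b(packet, digest_size=8).digest(), "big")

    def on_ingress(self, packet: bytes, now_ns: int) -> None:
        # Store the time when the packet enters the VNF, indexed by its fingerprint.
        fp = self.fingerprint(packet)
        self.cache[fp % self.cache_slots] = (fp, now_ns)

    def on_egress(self, packet: bytes, now_ns: int):
        # Read the stored time when the packet exits the VNF.
        fp = self.fingerprint(packet)
        slot = fp % self.cache_slots
        entry = self.cache[slot]
        if entry is None or entry[0] != fp:
            return None  # never seen on ingress, or the slot was overwritten
        self.cache[slot] = None
        latency = now_ns - entry[1]
        self.samples.append((now_ns, latency))
        return latency

    def average_latency_ns(self, now_ns: int):
        # Drop samples older than the configured window, then average the rest.
        while self.samples and self.samples[0][0] < now_ns - self.window_ns:
            self.samples.popleft()
        if not self.samples:
            return None
        return sum(lat for _, lat in self.samples) / len(self.samples)
```

In this sketch a slot collision simply overwrites the older entry, so the affected packet yields no latency sample rather than a wrong one. That trade-off is one reason a fixed-size cache maps well to hardware.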
Napatech is proposing sFlow acceleration/offload:
• Enable sFlow generation inside the NIC to offload OVS and reduce latency
  o The existing OVS sFlow generation will be accelerated/offloaded on the NIC to reduce the CPU load and sFlow latency. Having lower sFlow latency provides faster loop-0 turnarounds.
  o Having sFlow in hardware also enables higher sampling rates without affecting CPU load.
The sFlow acceleration/offload will be transparent to OVS for default sFlow records. Napatech’s plan is to create specific sFlow records for VM latency reporting, which will need to be integrated with existing sFlow collectors.
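To illustrate why the sampling rate matters for CPU load, here is a small sketch of 1-in-N random packet sampling, the general technique sFlow uses for flow samples. The function names are hypothetical and nothing here reflects the OVS or Napatech implementation. It only shows that a higher sampling rate (smaller N) yields proportionally more samples to process in software.

```python
import random


def sample_packets(packet_count: int, sampling_rate: int, rng: random.Random) -> int:
    """Count how many of packet_count packets are picked at 1-in-sampling_rate."""
    sampled = 0
    for _ in range(packet_count):
        # Each packet is selected independently with probability 1 / sampling_rate.
        if rng.randrange(sampling_rate) == 0:
            sampled += 1
    return sampled


def estimate_total(sampled: int, sampling_rate: int) -> int:
    """Scale the sample count back up to an estimate of the total packet count."""
    return sampled * sampling_rate
```

With `sampling_rate=1` every packet is sampled. Each halving of N roughly doubles the number of samples the generator has to build and export, which is the work the proposal moves from the CPU to the NIC.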
Resources
List of people involved in the Project
From CORD those involved in an oversight role will be:
- Tom Tofigh ([email protected])
- Bryan Sullivan ([email protected])
From Napatech those involved will be:
- Michael Lilja ([email protected]) – Technology Architect
- Anders Brandt ([email protected]) – System Architect
- DJ Root ([email protected]) – VP of Sales
- (A team of both SW and HW engineers is at the disposal of the architects)