This document provides an overview of integrating Telegraf, InfluxDB, and Grafana to monitor SONiC (Software for Open Networking in the Cloud) devices using gNMI (gRPC Network Management Interface). It highlights the advantages of this setup and compares it with other monitoring solutions.
Components Overview
1. Telegraf
- A lightweight, open-source server agent for collecting and sending metrics.
- Supports multiple input plugins, including gNMI, to collect telemetry data from SONiC devices.
- Can be configured to push data to InfluxDB for storage and visualization.
2. InfluxDB
- A high-performance time-series database designed to handle large volumes of real-time data.
- Efficiently stores telemetry data collected from network devices.
- Supports querying and analysis using InfluxQL or Flux.
3. Grafana
- An open-source visualization and monitoring tool.
- Provides dashboards for real-time and historical data analysis.
- Supports alerting and integrates well with InfluxDB.
4. gNMI (gRPC Network Management Interface)
- A modern network management protocol based on gRPC.
- Enables efficient and secure telemetry data collection.
- Used by SONiC to provide structured, real-time network telemetry.
Advantages of This Setup
- Real-Time Monitoring: gNMI provides real-time telemetry data, ensuring up-to-date insights into network performance.
- Scalability: Telegraf's lightweight architecture and InfluxDB's efficient time-series storage enable scalable monitoring.
- Flexibility: The stack supports multiple plugins and data sources, making it adaptable to various monitoring needs.
- Efficient Data Storage: InfluxDB optimizes storage for high-frequency data, reducing overhead compared to traditional relational databases.
- Customizable Dashboards: Grafana offers extensive visualization options, making network analysis intuitive and user-friendly.
- Automation & Alerting: Grafana's built-in alerting allows proactive detection of, and response to, network issues.
Advantages of gNMI Over Other Protocols
Feature | gNMI | SNMP | NETCONF/YANG | RESTCONF |
---|---|---|---|---|
Transport | gRPC-based (binary) | UDP-based (binary, BER-encoded) | SSH-based (XML) | HTTP-based (XML/JSON) |
Performance | High (streaming support) | Low (polling-based) | Moderate (RPC-based) | Moderate (REST-based) |
Security | TLS encryption | Minimal in v1/v2c (community strings); v3 adds authentication and encryption | Secure with SSH | Secure with TLS |
Scalability | High | Moderate | Moderate | Moderate |
Data Model | Structured (Protobuf/YANG) | Unstructured (OID) | Structured (YANG) | Structured (YANG) |
Telemetry | Streaming & Polling | Polling only | RPC-based retrieval | RPC-based retrieval |
Ease of Use | Modern & Developer-friendly | Legacy, complex | Requires XML handling | Requires REST API knowledge |
Comparison with Other Solutions
Feature | Telegraf + InfluxDB + Grafana | SNMP-based Monitoring | ELK Stack (Elasticsearch, Logstash, Kibana) |
---|---|---|---|
Real-time Data | Yes (gNMI streaming) | No (polling-based) | Limited (log-based) |
Data Efficiency | High (time-series storage) | Moderate | High (searchable logs) |
Visualization | Extensive (Grafana) | Basic | Advanced (Kibana) |
Alerting | Yes | Limited | Yes |
Scalability | High | Moderate | High |
Protocol Support | gNMI, SNMP, others | SNMP, NetFlow | Logs, Metrics, APM |
gNMI for Streaming Telemetry from SONiC Devices
gNMI streaming telemetry offers an efficient alternative to SNMP polling: the device continuously transmits data and can send incremental updates. Instead of relying on SNMP's polling mechanism, which collects data whether or not it has changed, gNMI allows operators to subscribe to specific data points using well-defined sensor identifiers (paths). This approach provides near real-time, model-driven, and analytics-ready insights, enabling more effective network automation, traffic optimization, and proactive troubleshooting.
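In Telegraf's gNMI input, the choice between periodic sampling and change-driven updates is made per subscription with `subscription_mode`. The fragment below is a minimal sketch, not a tested SONiC configuration: the subscription name and the OpenConfig path are illustrative assumptions, and whether a given path supports `on_change` depends on the SONiC release and its telemetry service.

```toml
# Sketch only: this table belongs inside an existing [[inputs.gnmi]] block.
[[inputs.gnmi.subscription]]
  # Hypothetical measurement name, chosen for illustration
  name = "interface_oper_status"
  origin = "openconfig"
  # Illustrative OpenConfig path (assumption); confirm the device exposes it
  path = "/interfaces/interface/state/oper-status"
  # Ask the device to send an update only when the value changes,
  # instead of on a fixed timer ("sample" mode with sample_interval)
  subscription_mode = "on_change"
```

Slowly changing state such as link status is a natural fit for `on_change`, while counters and sensor readings are usually better served by `sample` mode with an interval.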
Telegraf Configuration
[[inputs.gnmi]]
  # Address and port of each gNMI gRPC server (update with the SONiC device IP and port)
  addresses = [":", ":"]
  # Credentials
  username = ""
  password = ""
  # gNMI encoding requested (one of: "proto", "json", "json_ietf", "bytes")
  encoding = "json"
  # Redial interval after a connection failure
  redial = "10s"
  # Enable TLS; otherwise TLS is used only if any of the other TLS options are specified
  # (some Telegraf versions use enable_tls = true instead)
  tls_enable = true
  # Use TLS but skip chain and host verification
  insecure_skip_verify = true
  # Subscription for temperature details
  [[inputs.gnmi.subscription]]
    name = "temperature_sensor"
    origin = "openconfig"
    path = "<url>"
    sample_interval = "60s"
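The configuration above skips certificate verification, which is convenient in a lab but weakens the TLS protection. A hedged sketch of a verifying variant follows. `tls_ca`, `tls_cert`, and `tls_key` are Telegraf's common TLS client options; the file paths are placeholders, and the sketch assumes the device's gNMI server presents a certificate signed by that CA.

```toml
# Sketch only: replace the TLS lines of the [[inputs.gnmi]] block with these.
[[inputs.gnmi]]
  tls_enable = true
  # CA that signed the device's gNMI server certificate (placeholder path)
  tls_ca = "/etc/telegraf/ca.pem"
  # Optional client certificate and key for mutual TLS (placeholder paths)
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  # Verify the certificate chain and host name
  insecure_skip_verify = false
```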
Note: After updating the configuration, restart the Telegraf service, i.e. sudo systemctl restart telegraf
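The configuration above covers only the input side; to store the collected metrics, Telegraf also needs an output section. The following is a minimal sketch, assuming InfluxDB 2.x is reachable locally on its default port. The URL, organization, and bucket names are placeholders, and the token is read from an environment variable named `INFLUX_TOKEN` (an assumption). For InfluxDB 1.x, the `[[outputs.influxdb]]` plugin with `urls` and `database` would be used instead.

```toml
[[outputs.influxdb_v2]]
  # InfluxDB 2.x endpoint (placeholder; adjust host and port)
  urls = ["http://127.0.0.1:8086"]
  # API token taken from the environment rather than stored in the file
  token = "${INFLUX_TOKEN}"
  # Placeholder organization and bucket names
  organization = "example-org"
  bucket = "sonic_telemetry"
```

Once Telegraf is writing to the bucket, it can be added to Grafana as an InfluxDB data source for the dashboards.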
Dashboards
Strategic Takeaway
This observability stack is not just a combination of open-source tools; it's a production-ready framework engineered for real-time visibility across SONiC environments.
By combining gNMI streaming, Telegraf, InfluxDB, and Grafana, and tuning them specifically for SONiC-based networking, PalC Networks helps organizations monitor infrastructure with precision, scalability, and speed. We've implemented custom telemetry paths, dashboard packs, and threshold-driven alerting systems.
If you're adopting SONiC and planning to integrate it with a monitoring stack, reach out to us. Our team supports everything from architecture design to implementation, validation, and ongoing maintenance.
Explore Our Open Networking Capabilities
If you need support or guidance in exploring OpenStack, open networking, or data center infrastructure optimization, we are here to help.