SONiC-2

 


1.SONiC Architecture - Layers

Layer              What it does         Speed        Analogy
1.User Space       Logic, protocols     Slow         Brain
2.Kernel Space     OS networking        Medium       Nervous system
3.Hardware         Packet forwarding    Very fast    Muscles


Big Picture
User Space  →     Kernel Space  →  Hardware
(SONiC apps)      (Linux OS)            (ASIC)

a. User Space (SONiC Layer) - This is where SONiC containers + daemons run

Includes:
swss → orchagent, vlanmgrd, etc.
bgp → bgpd, zebra
teamd, lldp, snmp
Redis (database container)

Role:
Process configs
Run routing protocols
Decide what should happen

Example:
You configure VLAN → vlanmgrd
BGP learns route → bgpd
 
Think: “Decision making layer”

List of Redis DBs
0.APPL_DB (application)
Application state published by daemons (portsyncd, fpmsyncd, the *mgrd config managers). Consumed by orchagent, which translates it into ASIC_DB entries.
redis-cli -n 0 keys '*'
redis-cli -n 0 hgetall PORT_TABLE:Ethernet0
redis-cli -n 0 hgetall ROUTE_TABLE:0.0.0.0/0

1.ASIC_DB (hardware)
SAI objects written by orchagent, consumed by syncd to program the ASIC hardware directly.
redis-cli -n 1 keys '*'
redis-cli -n 1 hgetall ASIC_STATE:SAI_OBJECT_TYPE_PORT:oid:0x...

2.COUNTERS_DB (state)
Interface and queue counters polled from the ASIC. Used by the CLI (show interfaces counters) and telemetry.
redis-cli -n 2 keys '*'
redis-cli -n 2 hgetall COUNTERS:oid:0x1000000000003
redis-cli -n 2 hgetall COUNTERS_PORT_NAME_MAP

3.LOGLEVEL_DB (management)
Per-component log verbosity settings. Modified with the swssloglevel command.
redis-cli -n 3 keys '*'
redis-cli -n 3 hgetall LOGLEVEL:orchagent

4.CONFIG_DB (config)
Primary configuration store. Loaded from config_db.json at boot. All config CLI writes go here.
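redis-cli -n 4 keys '*'
redis-cli -n 4 hgetall 'PORT|Ethernet0'
redis-cli -n 4 hgetall 'DEVICE_METADATA|localhost'
redis-cli -n 4 hgetall 'BGP_NEIGHBOR|10.0.0.1'
(CONFIG_DB keys use | as the separator, so quote them to keep the shell from treating | as a pipe.)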

5.PFC_WD_DB (state)
PFC Watchdog state: tracks storm detection and restoration events per queue. Used by the pfcwd daemon.
redis-cli -n 5 keys '*'
redis-cli -n 5 hgetall PFC_WD_TABLE:Ethernet0:3

5.FLEX_COUNTER_DB (state)
Flexible counter polling groups. Controls which OIDs are polled and at what interval. In the default database_config.json it shares database ID 5 with PFC_WD_DB.
redis-cli -n 5 keys '*'
redis-cli -n 5 hgetall FLEX_COUNTER_GROUP_TABLE:PORT

6.STATE_DB (state)
Operational state published by daemons. Tracks port link state, VLAN membership, LAG status, BGP sessions.
redis-cli -n 6 keys '*'
redis-cli -n 6 hgetall 'PORT_TABLE|Ethernet0'
redis-cli -n 6 hgetall 'NEIGH_TABLE|Ethernet0:192.168.1.1'
redis-cli -n 6 hgetall 'BGP_NEIGHBOR_TABLE|10.0.0.1'

7.SNMP_OVERLAY_DB (management)
SNMP MIB overlay data: additional OIDs injected by snmpd for ifAlias, sysDescr overrides.
redis-cli -n 7 keys '*'

8.RESTAPI_DB (management)
Used by the SONiC REST API server to track request tokens and API state. Optional component.
redis-cli -n 8 keys '*'
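
The key separators in the commands above are easy to mix up (":" in APPL_DB, ASIC_DB and COUNTERS_DB, "|" in CONFIG_DB and STATE_DB). Here is a small Python sketch that builds a key and a shell-safe lookup command. The helper names db_key and hgetall_command are made up for this note; the command uses sonic-db-cli, which takes the database name instead of a numeric ID. Treat it as a memory aid, not as tooling that ships with SONiC.

```python
import shlex

# Separators as seen in the redis-cli examples above.
SEPARATOR = {
    "APPL_DB": ":",
    "ASIC_DB": ":",
    "COUNTERS_DB": ":",
    "CONFIG_DB": "|",
    "STATE_DB": "|",
}


def db_key(db, table, *parts):
    """Join table name and key parts with the separator used by that DB."""
    return SEPARATOR[db].join((table,) + parts)


def hgetall_command(db, table, *parts):
    """Build a shell-safe 'sonic-db-cli <DB> hgetall <key>' command line."""
    key = db_key(db, table, *parts)
    # shlex.quote adds quotes only when needed (for example around "|").
    return " ".join(["sonic-db-cli", db, "hgetall", shlex.quote(key)])


print(hgetall_command("CONFIG_DB", "PORT", "Ethernet0"))
print(hgetall_command("APPL_DB", "ROUTE_TABLE", "0.0.0.0/0"))
```

The CONFIG_DB key comes out quoted because of the "|", while the APPL_DB key needs no quoting.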

b. Kernel Space (Linux Networking Stack) - This is the Linux OS networking layer

Includes:
Routing table (kernel FIB)
Interfaces (eth0, Ethernet0)
ARP table
Netlink sockets

SONiC interaction:
zebra writes routes → kernel (over netlink)
zebra streams the same routes → fpmsyncd (over the FPM channel)
Kernel sends netlink events → portsyncd, neighsyncd

Example:
bgpd → zebra → Kernel route table

Think: “Traffic control & OS networking engine”

c. Hardware (ASIC) - Actual switching chip (Broadcom in S4148F-ON)

Controlled by:
syncd → SAI → ASIC

Role:
Forward packets at line rate
Apply ACLs, QoS, VXLAN

Think: “Packet forwarding engine”



DB vs Purpose
DB             Purpose
CONFIG_DB      what you want
APPL_DB        what the system plans
ASIC_DB        what the hardware needs
STATE_DB       what the system is currently experiencing
COUNTERS_DB    metrics/statistics


2.Netdev & Netlink

User Space                                 Kernel Space
─────────────────────────────────────────────────────────────
ip / iproute2            ←──netlink──→  Routing subsystem
FRR (zebra)              ←──netlink──→  FIB (Forwarding table)
portsyncd / neighsyncd   ←──netlink──→  netdev / interfaces, neighbor table

netdev — Network Device Abstraction

In SONiC context:
  • Switch front-panel ports are registered as net_device by the ASIC/network driver
  • Tools like ip link, ifconfig interact with these net_device objects
  • Control-plane packets (BGP, ARP, LLDP) flow through these interfaces to the CPU

Netlink = real-time messaging channel between Linux kernel and user-space networking apps

Analogy
Think of Netlink as: “Phone line between kernel and applications”

Apps call kernel → “add route”
Kernel calls apps → “link down!”
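
To make the "phone line" less abstract: every netlink message starts with a fixed 16-byte header (struct nlmsghdr: length, type, flags, sequence number, port ID). The Python sketch below packs and unpacks that header with the struct module. The helper names build_message and parse_header are invented for this note, the payload is dummy bytes, and no real netlink socket is opened; the message type numbers are the rtnetlink values from the Linux headers.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSG_HDR = struct.Struct("=IHHII")

# A few rtnetlink message types (values from linux/rtnetlink.h).
RTM_NAMES = {
    16: "RTM_NEWLINK",
    17: "RTM_DELLINK",
    24: "RTM_NEWROUTE",
    25: "RTM_DELROUTE",
    28: "RTM_NEWNEIGH",
    29: "RTM_DELNEIGH",
}


def build_message(msg_type, payload=b"", flags=0, seq=1, pid=0):
    """Prefix a payload with a netlink header (length covers header + payload)."""
    length = NLMSG_HDR.size + len(payload)
    return NLMSG_HDR.pack(length, msg_type, flags, seq, pid) + payload


def parse_header(data):
    """Decode the first 16 bytes of a netlink message into a dict."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(data)
    return {
        "len": length,
        "type": RTM_NAMES.get(msg_type, str(msg_type)),
        "flags": flags,
        "seq": seq,
        "pid": pid,
    }


# "Kernel calls apps": a route-added notification with a dummy 12-byte payload.
message = build_message(24, payload=b"\x00" * 12)
print(parse_header(message))
```

A daemon such as neighsyncd would receive messages like this on a netlink socket and switch on the type field to decide what to write into Redis.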

3.Kernel Space Drivers


User Space (SAI / ASIC SDK / Platform Daemons)
         ↕              ↕              ↕
   ASIC Driver   Network Driver  Platform Driver
         ↕              ↕              ↕
    Switch ASIC    Linux netdev    HW (fans/PSU/SFP)
         [  K E R N E L   S P A C E ]
 

4.How Front Panel Ports Are Mapped to the ASIC (SerDes lanes) - port_config.ini


root@spine:/usr/share/sonic/device# sonic-cfggen -H -v DEVICE_METADATA.localhost.platform
x86_64-kvm_x86_64-r0

root@spine:/usr/share/sonic/device# sonic-cfggen -d -v DEVICE_METADATA.localhost.hwsku
Force10-S6000

root@spine:/usr/share/sonic/device# cat /usr/share/sonic/device/x86_64-kvm_x86_64-r0/Force10-S6000/port_config.ini
# name          lanes                 alias             index       speed
Ethernet4       29,30,31,32       fortyGigE0/4      1           40000
Ethernet0       25,26,27,28       fortyGigE0/0      0           40000

** SONiC uses the name field (Ethernet0, Ethernet4), NOT the alias.

Why the name is used:

The name column is the Linux netdev name: it is what gets registered as a net_device in the kernel, and the rest of SONiC references ports by it.
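
A quick way to see the name-to-lanes mapping is to parse the file. The sketch below is a minimal Python parser written for this note (parse_port_config is not a SONiC function); it assumes the layout shown above, with a "#" header line naming the columns and whitespace-separated fields.

```python
SAMPLE = """\
# name          lanes                 alias             index       speed
Ethernet4       29,30,31,32       fortyGigE0/4      1           40000
Ethernet0       25,26,27,28       fortyGigE0/0      0           40000
"""


def parse_port_config(text):
    """Return {port name: {column: value}} with lanes as a list of ints."""
    columns = None
    ports = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first comment line carries the column names.
            if columns is None:
                columns = line.lstrip("#").split()
            continue
        if columns is None:
            raise ValueError("data line found before the header line")
        fields = dict(zip(columns, line.split()))
        fields["lanes"] = [int(lane) for lane in fields["lanes"].split(",")]
        ports[fields["name"]] = fields
    return ports


ports = parse_port_config(SAMPLE)
for name in sorted(ports, key=lambda n: int(ports[n]["index"])):
    print(name, ports[name]["alias"], ports[name]["lanes"])
```

Each port name maps to four SerDes lanes here, which is what a 40G port built from 4 x 10G lanes looks like.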


5.List of containers

  1. database — the shared memory bus (Redis). All containers talk to each other through it, never directly.
  2. swss — the orchestrator. Translates human/protocol intent → ASIC_DB entries.
  3. syncd — the hardware programmer. Reads ASIC_DB → calls SAI → hits the ASIC silicon.

Everything else (bgp, lldp, pmon, etc.) is an application container that feeds state into the Redis databases, which swss then processes.



SWSS services

1️⃣ Input (sync)

  • fpmsyncd
  • neighsyncd
  • portsyncd
  • intfsyncd

2️⃣ Config handlers (mgrd)

  • portmgrd
  • vlanmgrd
  • intfmgrd
  • vrfmgrd
  • nbrmgrd
  • buffermgrd

3️⃣ Brain

  • orchagent


Who writes to which table (container roles)

database
Central Redis in-memory DB. Hosts CONFIG_DB, APPL_DB, STATE_DB, ASIC_DB, COUNTERS_DB: the backbone all containers communicate through.
swss
Switch State Service. Runs orchagent, portmgrd, intfmgrd, vlanmgrd and the *syncd daemons. Translates app-level config into ASIC_DB entries.
syncd
Synchronization daemon. Reads ASIC_DB and programs the hardware ASIC via SAI API using the vendor SDK.
bgp
FRR (Free Range Routing) stack. Handles BGP, OSPF, IS-IS, static routes. fpmsyncd pushes routes to APPL_DB.
teamd
LAG/LACP management. Controls port-channel bonding and link aggregation groups using teamd daemon.
radv
IPv6 Router Advertisement daemon. Sends RA messages for IPv6 stateless address autoconfiguration (SLAAC).
lldp
Link Layer Discovery Protocol. Runs lldpd and lldp_syncd to discover neighbors and write topology info to APPL_DB (LLDP_ENTRY_TABLE).
dhcp_relay
DHCP relay agent. Forwards DHCP requests from clients to DHCP servers across different subnets.
snmp
SNMP agent (Net-SNMP snmpd plus a SONiC AgentX subagent). Exposes switch stats, interface counters, and system info via SNMP v2/v3.
pmon
Platform Monitor. Monitors PSU, fans, temperature sensors, SFP/QSFP transceivers and writes to STATE_DB.
mgmt-framework
REST API & KLISH CLI server. Exposes OpenConfig / YANG-model-based management interface over HTTPS.
telemetry
gNMI / gRPC telemetry server. Streams real-time operational data (counters, states) to external collectors.
macsec
MACsec encryption container. Manages 802.1AE link-layer encryption for secure port-to-port communication.
nat
Network Address Translation. Handles SNAT/DNAT rules and manages NAT table entries via iptables/conntrack.
eventd
Event management daemon. Collects, caches, and distributes system events across containers via pub/sub.
sflow
sFlow agent. Samples network packets and exports flow telemetry to external sFlow collectors.
iccpd
ICCP daemon for MC-LAG. Manages Inter-Chassis Communication Protocol for multi-chassis link aggregation.
p4rt
P4Runtime container. Provides a P4Runtime gRPC API to program the ASIC pipeline using P4 programs.
database-chassis
Shared Redis DB for chassis systems. Provides cross-ASIC communication between line cards and supervisor.
gbsyncd
Gearbox synchronization daemon. Programs external gearbox ASICs (retimers, PHYs) via Gearbox SAI API.


6.Full SONiC Flow (Important)

Example: BGP Route
User Space:
   bgpd learns route
        ↓
   zebra selects the best route
Kernel Space:
   zebra installs the route in the Linux FIB (netlink)
User Space again:
   zebra → fpmsyncd (FPM channel)
   fpmsyncd → APPL_DB
   orchagent → ASIC_DB
Hardware:
   syncd → SAI → ASIC programmed
   
Another Example: VLAN Config
User Space:
   config vlan add 100 → CONFIG_DB
        ↓
   vlanmgrd → APPL_DB
   orchagent → ASIC_DB
Hardware:
   syncd → ASIC
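
The same hand-off can be walked through as a toy model. The Python sketch below uses plain dicts as stand-ins for the Redis databases and one function per daemon. It is heavily simplified and assumes a lot: the real daemons react to Redis notifications instead of being called in order, the field sets are trimmed to a single attribute, and the OID is made up.

```python
config_db, appl_db, asic_db = {}, {}, {}
programmed_vlans = []  # stand-in for the ASIC hardware tables


def config_vlan_add(vlan_id):
    """CLI step: write the intent into CONFIG_DB ("|" separator)."""
    config_db[f"VLAN|Vlan{vlan_id}"] = {"vlanid": str(vlan_id)}


def vlanmgrd():
    """Config manager: copy VLAN intent from CONFIG_DB into APPL_DB."""
    for key, fields in config_db.items():
        table, name = key.split("|", 1)
        if table == "VLAN":
            appl_db[f"VLAN_TABLE:{name}"] = dict(fields)


def orchagent():
    """Orchestrator: turn APPL_DB entries into SAI objects in ASIC_DB."""
    for key, fields in appl_db.items():
        table, _name = key.split(":", 1)
        if table == "VLAN_TABLE":
            oid = f"oid:0x{len(asic_db) + 1:x}"  # invented object ID
            asic_db[f"ASIC_STATE:SAI_OBJECT_TYPE_VLAN:{oid}"] = {
                "SAI_VLAN_ATTR_VLAN_ID": fields["vlanid"],
            }


def syncd():
    """Hardware programmer: push every ASIC_DB VLAN object to the 'ASIC'."""
    for key, attrs in asic_db.items():
        if key.startswith("ASIC_STATE:SAI_OBJECT_TYPE_VLAN:"):
            programmed_vlans.append(int(attrs["SAI_VLAN_ATTR_VLAN_ID"]))


config_vlan_add(100)
vlanmgrd()
orchagent()
syncd()
print(config_db, appl_db, asic_db, programmed_vlans, sep="\n")
```

Printing the three dicts after each step is a handy way to memorise which daemon owns which write.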


7. Packet Flow

1. Ingress (Packet Enters Switch)
Packet arrives on physical port (e.g., Ethernet0)
Happens in: ASIC (hardware)
Actions: Parses packet (MAC/IP/VLAN)
Checks: VLAN / MAC table (FDB) / Routing table (L3)
No CPU involved yet (fast path)

2. ASIC Lookup (Fast Path)
Case A: Known traffic (normal case)
ASIC already has entry:
L2 → FDB lookup
L3 → Route lookup
Packet is forwarded directly
No kernel / user-space involvement
This is important: the vast majority of transit traffic never reaches the CPU

3. When Packet Goes to CPU (Slow Path)
Packet is sent to CPU when it matches a trap.
Examples: ARP request / TTL expired / BGP or OSPF packets / ICMP to switch / ACL trap
Slow Path Flow:
ASIC → CPU → Kernel → User-space daemon → back to ASIC

4. Kernel Space Handling
Packet reaches Linux kernel
Seen in: tcpdump -i Ethernet0
Kernel does: Basic processing
Sends event via netlink

5. User Space Processing
Depends on packet type:
Case 1: ARP Request
Kernel updates its neighbor table → neighsyncd → APPL_DB
orchagent → programs neighbor in ASIC
Case 2: BGP Packet
Goes to bgpd
Route learned
Sent via: zebra → fpmsyncd → APPL_DB (zebra also installs the route in the kernel)
Case 3: Unknown MAC
Learned in hardware by the ASIC (no CPU trap needed)
syncd reports the FDB event → orchagent records the entry

6. Back to Hardware
After processing:
orchagent → ASIC_DB → syncd → ASIC
Now ASIC can forward future packets in fast path
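
The fast-path / slow-path split boils down to a lookup decision. Below is a rough Python sketch of that decision with invented tables and addresses. It ignores plenty of real behaviour (router-MAC checks, flooding, CoPP rate limits, longest-prefix ordering), so read it as an illustration of the idea, not of how an ASIC pipeline is built.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Toy tables; every value here is made up for the example.
CONTROL_PROTOCOLS = {"arp", "bgp", "ospf", "lldp"}
SWITCH_IPS = {"10.0.0.1"}
FDB = {"aa:bb:cc:00:00:01": "Ethernet4"}
ROUTES = {"192.168.1.0/24": "Ethernet8"}


@dataclass
class Packet:
    dst_mac: str
    dst_ip: str
    ttl: int = 64
    protocol: str = "data"


def classify(pkt):
    """Return (path, reason) for a packet arriving on a front-panel port."""
    if pkt.protocol in CONTROL_PROTOCOLS:
        return ("slow", f"{pkt.protocol} is trapped to the CPU")
    if pkt.dst_ip in SWITCH_IPS:
        return ("slow", "addressed to the switch itself")
    if pkt.ttl <= 1:
        return ("slow", "TTL expired")
    if pkt.dst_mac in FDB:
        return ("fast", f"L2 forward out {FDB[pkt.dst_mac]}")
    for prefix, port in ROUTES.items():
        if ip_address(pkt.dst_ip) in ip_network(prefix):
            return ("fast", f"L3 forward out {port}")
    return ("miss", "no FDB or route entry in this toy table")


print(classify(Packet("aa:bb:cc:00:00:01", "192.168.1.5")))
print(classify(Packet("ff:ff:ff:ff:ff:ff", "192.168.1.1", protocol="arp")))
print(classify(Packet("aa:bb:cc:00:00:99", "192.168.1.7", ttl=1)))
```

Only the "slow" results would ever show up in tcpdump on the switch CPU.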






PUB/SUB (Publish–Subscribe) model vs Producer/Consumer model

1. Pub/Sub (Publish–Subscribe)

Concept

  • One publisher sends messages
  • Multiple subscribers receive them
  • No persistence (messages are transient)

In SONiC

Used mainly for event notifications

Example

  • Link status change (port up/down)
  • Interface events
  • State changes

A component publishes:

"Ethernet0 -> down"

Subscribers (multiple) react:

  • Routing daemon updates routes
  • Monitoring service logs it
  • Orchagent reacts

Characteristics

  • 🔁 One-to-many
  • ⚡ Real-time notifications
  • ❌ No storage (if you miss it, it's gone)

2. Producer/Consumer (Queue-based)

Concept

  • Producer pushes data into a queue/table
  • Consumer reads and processes it
  • Data is stored until consumed

In SONiC

Used for state/config propagation and orchestration

Example flow

  1. CONFIG_DB updated:

    VLAN100 created
  2. Producer writes to APPL_DB
  3. Orchagent (consumer) reads it
  4. Converts to ASIC_DB entries
  5. syncd programs hardware

Characteristics

  • 📦 Persistent (stored in Redis tables)
  • 🎯 Reliable processing
  • 🔄 One-to-one (typically)
  • 🧠 Ordered processing


Feature         Pub/Sub                      Producer/Consumer
Data storage    ❌ No                        ✅ Yes (Redis tables)
Reliability     Low (missed events lost)     High
Use case        Events/notifications         Config/state processing
Pattern         One-to-many                  Usually one-to-one
Example         Port status change           VLAN/route programming
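
The difference is easiest to see with a subscriber or consumer that starts late. The Python sketch below models both patterns in memory; the PubSub and TableQueue classes are invented for this note and are not the swss-common API.

```python
from collections import deque


class PubSub:
    """Transient fan-out: a message only reaches whoever is subscribed now."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)  # how many receivers got it


class TableQueue:
    """Stored hand-off: entries wait until a consumer picks them up."""

    def __init__(self):
        self.pending = deque()

    def produce(self, key, value):
        self.pending.append((key, value))

    def consume(self):
        return self.pending.popleft() if self.pending else None


bus = PubSub()
missed = bus.publish("Ethernet0 -> down")  # nobody is listening yet: lost
routing_log, monitor_log = [], []
bus.subscribe(routing_log.append)
bus.subscribe(monitor_log.append)
bus.publish("Ethernet0 -> up")  # both subscribers receive this one

queue = TableQueue()
queue.produce("VLAN_TABLE:Vlan100", {"vlanid": "100"})
item = queue.consume()  # the consumer starts later and still gets the entry

print(missed, routing_log, monitor_log, item)
```

The first publish reaches zero receivers and is gone, while the queued VLAN entry is still there when the consumer finally reads it.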


SONiC Redis DB retention is not the same as Kafka retention

Retention = “until consumed or overwritten”
No built-in time-based or size-based retention policy


If in-memory means RAM, are the table contents lost when the switch reboots?

Short answer: yes — but with an important caveat.

🔹 What happens to Redis data on reboot?

Since Redis stores data in RAM:

  • When the switch reboots → RAM is cleared
  • So the runtime Redis DBs (APPL_DB, STATE_DB, ASIC_DB, COUNTERS_DB) are lost

🔹 But SONiC doesn’t “forget” everything

The key exception is:

👉 CONFIG_DB is persisted

  • SONiC saves configuration to disk (in /etc/sonic/config_db.json)
  • On reboot:
    1. Redis starts empty
    2. CONFIG_DB is reloaded from disk into Redis
    3. Services restart and rebuild the entire state

🔹 What gets rebuilt after reboot

After reboot, SONiC reconstructs everything:

  • swss reads CONFIG_DB → repopulates APPL_DB
  • orchagent rewrites ASIC_DB → syncd reprograms the ASIC
  • Protocols (BGP, LLDP, etc.) restart and repopulate state

So:

  • ❌ Runtime state → lost
  • ✅ Intended config → restored
  • 🔄 System → rebuilt automatically

🔹 Think of it like this

  • CONFIG_DB = saved blueprint (on disk)
  • Other DBs = live working memory (RAM)

After reboot:

Blueprint is reloaded → system rebuilds itself
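
Here is the blueprint idea as a toy Python model. ToySwitch is invented for this note: it keeps its "databases" in dicts, saves only the config to a JSON file (as config save does with config_db.json), and rebuilds the rest on boot. The VLAN layout follows the nested table/key/field shape of config_db.json; the counter value is made up.

```python
import json
import os
import tempfile


class ToySwitch:
    def __init__(self, config_path):
        self.config_path = config_path
        self.boot()

    def boot(self):
        """RAM starts empty; only the saved config file survives."""
        self.config_db, self.appl_db, self.counters_db = {}, {}, {}
        if os.path.exists(self.config_path):
            with open(self.config_path) as handle:
                self.config_db = json.load(handle)
        # Services replay the config to rebuild derived state.
        for name, fields in self.config_db.get("VLAN", {}).items():
            self.appl_db[f"VLAN_TABLE:{name}"] = dict(fields)

    def config_save(self):
        """Persist the running config to disk."""
        with open(self.config_path, "w") as handle:
            json.dump(self.config_db, handle)


with tempfile.TemporaryDirectory() as tmp:
    switch = ToySwitch(os.path.join(tmp, "config_db.json"))
    switch.config_db["VLAN"] = {"Vlan100": {"vlanid": "100"}}
    switch.counters_db["Ethernet0"] = {"rx_packets": 12345}
    switch.config_save()
    switch.boot()  # simulate a reboot
    print(sorted(switch.appl_db))
    print(switch.counters_db)
```

After the simulated reboot the VLAN is back in the application table, while the counters start again from an empty dict.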


🔹 Important nuance (interview-level insight)

Even though Redis can support persistence (RDB/AOF), SONiC:

  • does not rely on Redis persistence for correctness
  • instead relies on replay from CONFIG_DB

This design:

  • avoids stale/inconsistent state
  • ensures clean initialization every boot

🔹 Real-world impact

  • Counters reset after reboot
  • BGP sessions re-establish
  • Interfaces reconfigure from saved config
  • Temporary states (like learned MACs) are relearned
