1.SONiC Architecture - Layers
| Layer | What it does | Speed | Analogy |
|---|---|---|---|
| 1. User Space | Logic, protocols | Slow | Brain |
| 2. Kernel Space | OS networking | Medium | Nervous system |
| 3. Hardware | Packet forwarding | Very fast | Muscles |
Big Picture
User Space   → Kernel Space → Hardware
(SONiC apps)   (Linux OS)     (ASIC)
a. User Space (SONiC Layer) - This is where SONiC containers + daemons run
Includes:
swss → orchagent, vlanmgrd, etc.
bgp → bgpd, zebra
teamd, lldp, snmp
Redis (database container)
Role:
Process configs
Run routing protocols
Decide what should happen
Example:
You configure VLAN → vlanmgrd
BGP learns route → bgpd
Think: “Decision-making layer”
List of Redis DBs
0.APPL_DB (application state)
Application state published by daemons (portsyncd, fpmsyncd, the *mgrd daemons). Consumed by orchagent, which translates it into ASIC_DB entries.
redis-cli -n 0 keys '*'
redis-cli -n 0 hgetall PORT_TABLE:Ethernet0
redis-cli -n 0 hgetall ROUTE_TABLE:0.0.0.0/0
1.ASIC_DB (hardware)
SAI objects written by orchagent, consumed by syncd to program the ASIC hardware directly.
redis-cli -n 1 keys '*'
redis-cli -n 1 hgetall ASIC_STATE:SAI_OBJECT_TYPE_PORT:oid:0x...
2.COUNTERS_DB (state)
Interface and queue counters polled from the ASIC. Used by CLI (show interface counters) and telemetry.
redis-cli -n 2 keys '*'
redis-cli -n 2 hgetall COUNTERS:oid:0x1000000000003
redis-cli -n 2 hgetall COUNTERS_PORT_NAME_MAP
3.LOGLEVEL_DB (management)
Per-component log verbosity settings. Modified with the swssloglevel command.
redis-cli -n 3 keys '*'
redis-cli -n 3 hgetall LOGLEVEL:orchagent
4.CONFIG_DB (config)
Primary configuration store. Loaded from config_db.json at boot. All config CLI writes go here.
redis-cli -n 4 keys '*'
redis-cli -n 4 hgetall 'PORT|Ethernet0'
redis-cli -n 4 hgetall 'DEVICE_METADATA|localhost'
redis-cli -n 4 hgetall 'BGP_NEIGHBOR|10.0.0.1'
5.PFC_WD_DB (state)
PFC Watchdog state — tracks storm detection, restoration events per queue. Used by pfcwd daemon.
redis-cli -n 5 keys '*'
redis-cli -n 5 hgetall PFC_WD_TABLE:Ethernet0:3
5.FLEX_COUNTER_DB (state)
Flexible counter polling groups. Controls which OIDs are polled and at what interval. In the default database_config.json it shares Redis database 5 with PFC_WD_DB.
redis-cli -n 5 keys '*'
redis-cli -n 5 hgetall FLEX_COUNTER_GROUP_TABLE:PORT
6.STATE_DB (state)
Operational state published by daemons. Tracks port link state, VLAN membership, LAG status, BGP sessions.
redis-cli -n 6 keys '*'
redis-cli -n 6 hgetall 'PORT_TABLE|Ethernet0'
redis-cli -n 6 hgetall 'NEIGH_TABLE|Ethernet0:192.168.1.1'
redis-cli -n 6 hgetall 'BGP_NEIGHBOR_TABLE|10.0.0.1'
7.SNMP_OVERLAY_DB (management)
SNMP MIB overlay data: additional OIDs injected by snmpd for ifAlias, sysDescr overrides.
redis-cli -n 7 keys '*'
8.RESTAPI_DB (management)
Used by the REST API server (SONiC REST) to track request tokens and API state. Optional component.
redis-cli -n 8 keys '*'
Database numbers come from database_config.json and can differ between releases, so confirm them on the running image.
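The key layout in the commands above follows one pattern: table name, a separator, then the object name, with `:` used in APPL_DB, ASIC_DB and COUNTERS_DB and `|` used in CONFIG_DB. A minimal Python sketch of that pattern follows. The helper names `make_key` and `hgetall_command` are hypothetical, and only databases whose number and separator are visible in the examples above are listed.

```python
# Database IDs and key separators as they appear in the examples above.
# This is an illustrative subset, not the full database_config.json.
DATABASES = {
    "APPL_DB": {"id": 0, "separator": ":"},
    "ASIC_DB": {"id": 1, "separator": ":"},
    "COUNTERS_DB": {"id": 2, "separator": ":"},
    "CONFIG_DB": {"id": 4, "separator": "|"},
}


def make_key(db_name, table, *parts):
    """Join a table name and key parts with the database's separator."""
    sep = DATABASES[db_name]["separator"]
    return sep.join((table,) + parts)


def hgetall_command(db_name, table, *parts):
    """Build a redis-cli command line; the key is quoted because '|' is a shell pipe."""
    db_id = DATABASES[db_name]["id"]
    key = make_key(db_name, table, *parts)
    return f"redis-cli -n {db_id} hgetall '{key}'"


if __name__ == "__main__":
    print(hgetall_command("CONFIG_DB", "PORT", "Ethernet0"))
    print(hgetall_command("APPL_DB", "ROUTE_TABLE", "0.0.0.0/0"))
```

Quoting the key matters mostly for CONFIG_DB, where an unquoted `|` would be interpreted by the shell as a pipe.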
b. Kernel Space (Linux Networking Stack) - This is the Linux OS networking layer
Includes:
Routing table (kernel FIB)
Interfaces (eth0, Ethernet0)
ARP table
Netlink sockets
SONiC interaction:
zebra writes routes → kernel (via netlink)
Kernel sends netlink events → neighsyncd, portsyncd
Example:
bgpd → zebra → Kernel route table
Think: “Traffic control & OS networking engine”
c. Hardware (ASIC) - Actual switching chip (Broadcom in S4148F-ON)
Controlled by:
syncd → SAI → ASIC
Role:
Forward packets at line rate
Apply ACLs, QoS, VXLAN
Think: “Packet forwarding engine”
| Database | What it holds |
|---|---|
| ASIC_DB | what the hardware needs |
| STATE_DB | what the system is currently experiencing |
| COUNTERS_DB | metrics/statistics |
2.Netdev & Netlink
User Space                                     Kernel Space
───────────────────────────────────────────────────────────
ip / iproute2                ←──netlink──→  Routing subsystem
FRR (zebra)                  ←──netlink──→  FIB (forwarding table)
SONiC portsyncd / neighsyncd ←──netlink──→  netdev / interfaces
netdev — Network Device Abstraction
In SONiC context:
- Switch front-panel ports are registered as net_device by the ASIC/network driver
- Tools like ip link, ifconfig interact with these net_device objects
- Control-plane packets (BGP, ARP, LLDP) flow through these interfaces to the CPU
Netlink = real-time messaging channel between Linux kernel and user-space networking apps
Analogy
Think of Netlink as: “Phone line between kernel and applications”
Apps call kernel → “add route”
Kernel calls apps → “link down!”
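To make the "phone line" concrete, here is a hedged Python sketch of what an "app calls kernel" message looks like on the wire: a netlink header followed by a link request, the kind of message `ip link` sends to list interfaces. The constants come from the Linux headers `<linux/netlink.h>` and `<linux/rtnetlink.h>`. The function name `build_getlink_dump` is hypothetical, and the sketch only builds the bytes without opening a socket.

```python
import struct

# Constants from <linux/netlink.h> and <linux/rtnetlink.h>.
NLM_F_REQUEST = 0x01   # this message is a request to the kernel
NLM_F_DUMP = 0x300     # ask for all objects, not a single one
RTM_GETLINK = 18       # "give me link (net_device) information"
AF_UNSPEC = 0


def build_getlink_dump(seq=1):
    """Return the bytes of an RTM_GETLINK dump request (nlmsghdr + ifinfomsg)."""
    # struct ifinfomsg: family (u8), pad (u8), type (u16),
    # index (s32), flags (u32), change (u32) -> 16 bytes
    ifinfomsg = struct.pack("=BBHiII", AF_UNSPEC, 0, 0, 0, 0, 0)
    total_len = 16 + len(ifinfomsg)
    # struct nlmsghdr: len (u32), type (u16), flags (u16),
    # seq (u32), pid (u32) -> 16 bytes
    header = struct.pack(
        "=IHHII", total_len, RTM_GETLINK, NLM_F_REQUEST | NLM_F_DUMP, seq, 0
    )
    return header + ifinfomsg


if __name__ == "__main__":
    print(build_getlink_dump().hex())
```

On a Linux host these bytes could be sent over a socket opened with `socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)`; the kernel would answer with one RTM_NEWLINK message per interface.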
3.Kernel Space Drivers
User Space (SAI / ASIC SDK / Platform Daemons)
↕ ↕ ↕
ASIC Driver Network Driver Platform Driver
↕ ↕ ↕
Switch ASIC Linux netdev HW (fans/PSU/SFP)
[ K E R N E L S P A C E ]
4.How Front Panel Ports Are Mapped to ASIC (SerDes Lanes) - port_config.ini
root@spine:/usr/share/sonic/device# sonic-cfggen -H -v DEVICE_METADATA.localhost.platform
x86_64-kvm_x86_64-r0
root@spine:/usr/share/sonic/device# sonic-cfggen -d -v DEVICE_METADATA.localhost.hwsku
Force10-S6000
root@spine:/usr/share/sonic/device# cat /usr/share/sonic/device/x86_64-kvm_x86_64-r0/Force10-S6000/port_config.ini
# name lanes alias index speed
Ethernet4 29,30,31,32 fortyGigE0/4 1 40000
Ethernet0 25,26,27,28 fortyGigE0/0 0 40000
** SONiC uses the name field (Ethernet0, Ethernet4), NOT the alias.
Why the name is used:
The name column is the Linux netdev name: it is what gets registered as a net_device in the kernel. Everything in SONiC references this name. The lanes column lists the ASIC SerDes lanes that back the port.
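As an illustration of the mapping, the sketch below parses the two sample rows shown above into a dictionary keyed by the name column. It is a minimal Python sketch, not SONiC's own parser; `parse_port_config` is a hypothetical helper, and it assumes whitespace-separated columns with the header on a `#` line, as in the sample.

```python
SAMPLE = """\
# name lanes alias index speed
Ethernet4 29,30,31,32 fortyGigE0/4 1 40000
Ethernet0 25,26,27,28 fortyGigE0/0 0 40000
"""

DEFAULT_COLUMNS = ["name", "lanes", "alias", "index", "speed"]


def parse_port_config(text):
    """Map each port name to its SerDes lanes, alias, index and speed."""
    ports = {}
    columns = DEFAULT_COLUMNS
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The header comment names the columns.
            columns = line.lstrip("#").split()
            continue
        row = dict(zip(columns, line.split()))
        ports[row["name"]] = {
            "lanes": [int(lane) for lane in row["lanes"].split(",")],
            "alias": row["alias"],
            "index": int(row["index"]),
            "speed": int(row["speed"]),
        }
    return ports


if __name__ == "__main__":
    for name, info in sorted(parse_port_config(SAMPLE).items()):
        print(name, info["lanes"], info["alias"])
```

Keying the result by name mirrors the point above: Ethernet0 is the identifier used throughout SONiC, while the alias is only a label.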
5.List of containers
- database — the shared memory bus (Redis). All containers talk to each other through it, never directly.
- swss — the orchestrator. Translates human/protocol intent → ASIC_DB entries.
- syncd — the hardware programmer. Reads ASIC_DB → calls SAI → hits the ASIC silicon.
Everything else (bgp, lldp, pmon, etc.) are application containers that feed state into the Redis databases, which swss then processes.
SWSS services
1️⃣ Input (sync)
- fpmsyncd
- neighsyncd
- portsyncd
- intfsyncd
2️⃣ Config handlers (mgrd)
- portmgrd
- vlanmgrd
- intfmgrd
- vrfmgrd
- nbrmgrd
- buffermgrd
3️⃣ Brain
- orchagent
Who writes to which table
🔥 Big picture first
There are 3 main DB layers:
- CONFIG_DB → user intent (persistent)
- APPL_DB → processed/app-ready state
- ASIC_DB → hardware-ready objects
🔹 1️⃣ Manager daemons (*mgrd)
👉 Convert config → application state
| Service | Reads from | Writes to |
|---|---|---|
| portmgrd | CONFIG_DB (PORT) | APPL_DB (PORT_TABLE) |
| vlanmgrd | CONFIG_DB (VLAN, VLAN_MEMBER) | APPL_DB (VLAN_TABLE, VLAN_MEMBER_TABLE) |
| intfmgrd | CONFIG_DB (INTERFACE) | APPL_DB (INTF_TABLE) |
| vrfmgrd | CONFIG_DB (VRF) | APPL_DB (VRF_TABLE) |
| nbrmgrd | CONFIG_DB (NEIGH, static neighbors) | APPL_DB (NEIGH_TABLE) |
| buffermgrd | CONFIG_DB (BUFFER_*) | APPL_DB (BUFFER tables) |
👉 Think:
CONFIG_DB → mgrd → APPL_DB
🔹 2️⃣ Sync daemons (*syncd)
👉 Import runtime/external state
| Service | Source | Writes to |
|---|---|---|
| fpmsyncd | FRRouting (BGP/OSPF routes) | APPL_DB (ROUTE_TABLE) |
| neighsyncd | Kernel (ARP/ND) | APPL_DB (NEIGH_TABLE) |
| portsyncd | Kernel (netlink) | APPL_DB (PORT_TABLE) |
| intfsyncd | Kernel | APPL_DB (INTF_TABLE) |
👉 Think:
External world → syncd → APPL_DB
🔹 3️⃣ orchagent (the brain)
👉 Converts application state → ASIC state
| Reads from (APPL_DB) | Writes to (ASIC_DB) |
|---|---|
| ROUTE_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_ROUTE_ENTRY |
| NEIGH_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_NEIGHBOR_ENTRY |
| PORT_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_PORT |
| VLAN_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_VLAN |
| INTF_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_ROUTER_INTERFACE |
| VRF_TABLE | ASIC_STATE:SAI_OBJECT_TYPE_VIRTUAL_ROUTER |
👉 Think:
APPL_DB → orchagent → ASIC_DB
🔹 4️⃣ syncd (hardware layer)
| Reads from | Writes to |
|---|---|
| ASIC_DB | ASIC (via SAI) |
👉 No Redis write-back in normal flow (except notifications/counters)
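The four tables above can be folded into one lookup that answers "who produces this APPL_DB table, and what does it become in ASIC_DB?". The Python sketch below only restates the rows listed above as data; the `trace` helper is hypothetical and nothing here talks to a real Redis instance.

```python
# Producer daemons per APPL_DB table, copied from the tables above.
APPL_DB_PRODUCERS = {
    "PORT_TABLE": ["portmgrd", "portsyncd"],
    "VLAN_TABLE": ["vlanmgrd"],
    "INTF_TABLE": ["intfmgrd", "intfsyncd"],
    "VRF_TABLE": ["vrfmgrd"],
    "NEIGH_TABLE": ["nbrmgrd", "neighsyncd"],
    "ROUTE_TABLE": ["fpmsyncd"],
}

# SAI object type that orchagent writes to ASIC_DB for each APPL_DB table.
ASIC_OBJECT_TYPE = {
    "ROUTE_TABLE": "SAI_OBJECT_TYPE_ROUTE_ENTRY",
    "NEIGH_TABLE": "SAI_OBJECT_TYPE_NEIGHBOR_ENTRY",
    "PORT_TABLE": "SAI_OBJECT_TYPE_PORT",
    "VLAN_TABLE": "SAI_OBJECT_TYPE_VLAN",
    "INTF_TABLE": "SAI_OBJECT_TYPE_ROUTER_INTERFACE",
    "VRF_TABLE": "SAI_OBJECT_TYPE_VIRTUAL_ROUTER",
}


def trace(table):
    """Describe the path of one APPL_DB table from producer down to the ASIC."""
    producers = " / ".join(APPL_DB_PRODUCERS[table])
    asic_key = "ASIC_STATE:" + ASIC_OBJECT_TYPE[table]
    return (
        f"{producers} -> APPL_DB:{table} -> orchagent -> "
        f"ASIC_DB:{asic_key} -> syncd -> SAI -> ASIC"
    )


if __name__ == "__main__":
    for table in sorted(APPL_DB_PRODUCERS):
        print(trace(table))
```

Note that some APPL_DB tables have two producers, one driven by configuration (a *mgrd daemon) and one driven by runtime events (a *syncd daemon).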
Core infrastructure
- database: Central Redis in-memory DB. Hosts CONFIG_DB, APPL_DB, STATE_DB, ASIC_DB, COUNTERS_DB, the backbone all containers communicate through.
- swss: Switch State Service. Runs orchagent, portmgrd, intfmgrd, vlanmgrd. Translates app-level config into ASIC_DB entries.
- syncd: Synchronization daemon. Reads ASIC_DB and programs the hardware ASIC via SAI API using the vendor SDK.
Routing & control plane
- bgp: FRR (FRRouting) stack. Handles BGP, OSPF, IS-IS, static routes. fpmsyncd pushes routes to APPL_DB.
- teamd: LAG/LACP management. Controls port-channel bonding and link aggregation groups using the teamd daemon.
- radv: IPv6 Router Advertisement daemon. Sends RA messages for IPv6 stateless address autoconfiguration (SLAAC).
Discovery & topology
- lldp: Link Layer Discovery Protocol. Runs lldpd and lldp_syncd to discover neighbors and write topology info to APPL_DB (LLDP_ENTRY_TABLE).
- dhcp_relay: DHCP relay agent. Forwards DHCP requests from clients to DHCP servers across different subnets.
Management & monitoring
- snmp: SNMP agent (Net-SNMP snmpd plus a SONiC AgentX subagent). Exposes switch stats, interface counters, and system info via SNMP v2/v3.
- pmon: Platform Monitor. Monitors PSU, fans, temperature sensors, SFP/QSFP transceivers and writes to STATE_DB.
- mgmt-framework: REST API & KLISH CLI server. Exposes OpenConfig / YANG-model-based management interface over HTTPS.
- telemetry: gNMI / gRPC telemetry server. Streams real-time operational data (counters, states) to external collectors.
Security & access
- macsec: MACsec encryption container. Manages 802.1AE link-layer encryption for secure port-to-port communication.
- nat: Network Address Translation. Handles SNAT/DNAT rules and manages NAT table entries via iptables/conntrack.
Advanced features
- eventd: Event management daemon. Collects, caches, and distributes system events across containers via pub/sub.
- sflow: sFlow agent. Samples network packets and exports flow telemetry to external sFlow collectors.
- iccpd: ICCP daemon for MC-LAG. Manages Inter-Chassis Communication Protocol for multi-chassis link aggregation.
- p4rt: P4Runtime container. Provides a P4Runtime gRPC API to program the ASIC pipeline using P4 programs.
Multi-ASIC / chassis (optional)
- database-chassis: Shared Redis DB for chassis systems. Provides cross-ASIC communication between line cards and supervisor.
- gbsyncd: Gearbox synchronization daemon. Programs external gearbox ASICs (retimers, PHYs) via Gearbox SAI API.
6.Full SONiC Flow (Important)
Example: BGP Route
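User Space:
bgpd learns route
↓
zebra selects the best route, then does two things:
installs it in the kernel (netlink)
sends it to fpmsyncd (FPM, the Forwarding Plane Manager interface)
Kernel Space:
route in Linux FIB (used for traffic handled by the CPU)
User Space again:
fpmsyncd → APPL_DB (ROUTE_TABLE)
orchagent → ASIC_DB
Hardware:
syncd → SAI → ASIC programmed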
Another Example: VLAN Config
User Space:
config vlan add 100 → CONFIG_DB (VLAN|Vlan100)
↓
vlanmgrd → APPL_DB (VLAN_TABLE)
orchagent → ASIC_DB
Hardware:
syncd → SAI → ASIC
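The VLAN flow can be imitated with plain dictionaries standing in for the Redis databases. This is a simplified Python sketch under loud assumptions: the real daemons are event-driven C++ processes, real ASIC_DB keys use SAI object IDs (`oid:0x...`) rather than the readable `vlan100` suffix used here, and copying the `vlanid` field into APPL_DB is only a stand-in for what vlanmgrd actually writes.

```python
# Dictionaries standing in for the Redis databases and the hardware.
config_db, appl_db, asic_db = {}, {}, {}
asic = set()  # VLAN IDs "programmed" into the pretend ASIC


def config_vlan_add(vlan_id):
    """CLI step: record the user's intent in CONFIG_DB."""
    config_db[f"VLAN|Vlan{vlan_id}"] = {"vlanid": str(vlan_id)}


def vlanmgrd():
    """Copy VLAN entries from CONFIG_DB into APPL_DB (VLAN_TABLE)."""
    for key, fields in config_db.items():
        table, name = key.split("|", 1)
        if table == "VLAN":
            appl_db[f"VLAN_TABLE:{name}"] = dict(fields)


def orchagent():
    """Turn VLAN_TABLE entries into simplified SAI VLAN objects in ASIC_DB."""
    for key in appl_db:
        if key.startswith("VLAN_TABLE:"):
            name = key.split(":", 1)[1]
            vlan_id = name[len("Vlan"):]
            # Simplified key: a real entry would end in an SAI object ID.
            asic_key = f"ASIC_STATE:SAI_OBJECT_TYPE_VLAN:vlan{vlan_id}"
            asic_db[asic_key] = {"SAI_VLAN_ATTR_VLAN_ID": vlan_id}


def syncd():
    """Read ASIC_DB and 'program' the hardware (here: a Python set)."""
    for key, attrs in asic_db.items():
        if key.startswith("ASIC_STATE:SAI_OBJECT_TYPE_VLAN:"):
            asic.add(int(attrs["SAI_VLAN_ATTR_VLAN_ID"]))


if __name__ == "__main__":
    config_vlan_add(100)
    vlanmgrd()
    orchagent()
    syncd()
    print(sorted(asic))
```

Each function reads only from the database directly above it, which is the property that lets the real daemons restart independently.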
7. Packet Flow
1. Ingress (Packet Enters Switch)
Packet arrives on physical port (e.g., Ethernet0)
Happens in: ASIC (hardware)
Actions: Parses packet (MAC/IP/VLAN)
Checks: VLAN / MAC table (FDB) / Routing table (L3)
No CPU involved yet (fast path)
2. ASIC Lookup (Fast Path)
Case A: Known traffic (normal case)
ASIC already has entry:
L2 → FDB lookup
L3 → Route lookup
Packet is forwarded directly
No kernel / user-space involvement
This is important: the vast majority of transit traffic never reaches the CPU
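3. When Packet Goes to CPU (Slow Path)
A packet is sent to the CPU when the ASIC traps it instead of forwarding it.
Examples: ARP request / TTL expired
BGP/OSPF packets / ICMP to switch / ACL trap
Slow Path Flow
ASIC → CPU → Kernel → User-space daemon → back to ASIC
4. Kernel Space Handling
Packet reaches Linux kernel on the port's netdev
Seen in: tcpdump -i Ethernet0
Kernel does: Basic processing (for example, answering ARP and updating its neighbor table)
Sends event via netlink
5. User Space Processing
Depends on packet type:
Case 1: ARP Request
Kernel updates its neighbor table → neighsyncd → APPL_DB (NEIGH_TABLE)
orchagent → programs neighbor in ASIC
Case 2: BGP Packet
Goes to bgpd
Route learned
Sent via: zebra → fpmsyncd → APPL_DB (zebra also installs the route in the kernel)
Case 3: Unknown MAC
Usually not a CPU case: the ASIC learns the source MAC in hardware and floods frames with an unknown destination MAC within the VLAN
The learn event is reported back through SAI → syncd → orchagent, which records the FDB entry in Redis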
6. Back to Hardware
After processing:
orchagent → ASIC_DB → syncd → ASIC
Now ASIC can forward future packets in fast path
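The fast path / slow path split for routed traffic can be sketched as a small classifier. This Python example is a toy model, not how an ASIC pipeline is implemented: `classify` and `lookup_route` are hypothetical names, packets are plain dictionaries, and L2 FDB handling and ACL traps are left out to keep it short.

```python
import ipaddress


def lookup_route(routes, dst_ip):
    """Longest-prefix match over a {prefix: egress_port} dictionary."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, port in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, port)
    return best[1] if best else None


def classify(packet, routes, switch_ips):
    """Return ('fast', port), ('cpu', reason) or ('drop', reason)."""
    if packet["type"] == "arp":
        return ("cpu", "ARP")
    if packet["dst_ip"] in switch_ips:
        # BGP, OSPF or ICMP addressed to the switch itself.
        return ("cpu", "addressed to the switch")
    if packet["ttl"] <= 1:
        return ("cpu", "TTL expired")
    port = lookup_route(routes, packet["dst_ip"])
    if port is None:
        return ("drop", "no route")
    return ("fast", port)


if __name__ == "__main__":
    routes = {"0.0.0.0/0": "Ethernet0", "10.1.0.0/16": "Ethernet4"}
    switch_ips = {"10.0.0.1"}
    for pkt in (
        {"type": "ip", "dst_ip": "10.1.2.3", "ttl": 64},
        {"type": "ip", "dst_ip": "10.0.0.1", "ttl": 64},
        {"type": "arp"},
    ):
        print(pkt, "->", classify(pkt, routes, switch_ips))
```

Once orchagent has programmed a route or neighbor, the corresponding packets fall into the `('fast', port)` branch and stop reaching the CPU.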
PUB/SUB (Publish–Subscribe) model vs Producer/Consumer model
1. Pub/Sub (Publish–Subscribe)
Concept
- One publisher sends messages
- Multiple subscribers receive them
- No persistence (messages are transient)
In SONiC
Used mainly for event notifications
Example
- Link status change (port up/down)
- Interface events
- State changes
A component publishes the event once.
Subscribers (multiple) react:
- Routing daemon updates routes
- Monitoring service logs it
- Orchagent reacts
Characteristics
- 🔁 One-to-many
- ⚡ Real-time notifications
- ❌ No storage (if you miss it, it's gone)
2. Producer/Consumer (Queue-based)
Concept
- Producer pushes data into a queue/table
- Consumer reads and processes it
- Data is stored until consumed
In SONiC
Used for state/config propagation and orchestration
Example flow
CONFIG_DB updated:
- Producer writes to APPL_DB
- Orchagent (consumer) reads it
- Converts to ASIC_DB entries
- syncd programs hardware
Characteristics
- 📦 Persistent (stored in Redis tables)
- 🎯 Reliable processing
- 🔄 One-to-one (typically)
- 🧠 Ordered processing
| Feature | Pub/Sub | Producer/Consumer |
|---|---|---|
| Data storage | ❌ No | ✅ Yes (Redis tables) |
| Reliability | Low (missed events lost) | High |
| Use case | Events/notifications | Config/state processing |
| Pattern | One-to-many | Usually one-to-one |
| Example | Port status change | VLAN/route programming |
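The difference in the table can be demonstrated with two tiny in-memory classes. This is a Python sketch of the two patterns in general, not of the SONiC or Redis implementation; the class names `PubSub` and `TableQueue` are hypothetical.

```python
from collections import deque


class PubSub:
    """One-to-many, no storage: a message reaches only current subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        """Deliver to whoever is subscribed right now; return the delivery count."""
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)


class TableQueue:
    """Producer/consumer: entries are stored, in order, until consumed."""

    def __init__(self):
        self.pending = deque()

    def produce(self, key, fields):
        self.pending.append((key, fields))

    def consume(self):
        """Return the oldest pending entry, or None when nothing is waiting."""
        return self.pending.popleft() if self.pending else None


if __name__ == "__main__":
    bus = PubSub()
    print("delivered to", bus.publish("Ethernet0 down"), "subscribers")  # event is lost
    queue = TableQueue()
    queue.produce("VLAN_TABLE:Vlan100", {"vlanid": "100"})
    print("consumer started late, still gets:", queue.consume())
```

The first publish happens before anyone subscribes, so nothing receives it; the queued entry waits for its consumer, which is why configuration and state go through tables rather than bare notifications.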
SONiC Redis DB retention is not the same as Kafka retention
Retention = “until consumed or overwritten”
No built-in time-based or size-based retention policy
If in-memory means the data lives in RAM, are the table contents lost when the switch reboots?
Short answer: yes — but with an important caveat.
🔹 What happens to Redis data on reboot?
Since Redis stores data in RAM:
- When the switch reboots → RAM is cleared
- So all Redis DBs (APPL_DB, STATE_DB, ASIC_DB, COUNTERS_DB) are lost
🔹 But SONiC doesn’t “forget” everything
The key exception is:
👉 CONFIG_DB is persisted
- SONiC saves configuration to disk (in /etc/sonic/config_db.json) when you run config save
- On reboot:
  - Redis starts empty
  - CONFIG_DB is reloaded from disk into Redis
  - Services restart and rebuild the entire state
🔹 What gets rebuilt after reboot
After reboot, SONiC reconstructs everything:
- swss reads CONFIG_DB → repopulates APPL_DB
- syncd → reprograms ASIC via ASIC_DB
- Protocols (BGP, LLDP, etc.) restart and repopulate state
So:
- ❌ Runtime state → lost
- ✅ Intended config → restored
- 🔄 System → rebuilt automatically
🔹 Think of it like this
- CONFIG_DB = saved blueprint (on disk)
- Other DBs = live working memory (RAM)
After reboot:
Blueprint is reloaded → system rebuilds itself
🔹 Important nuance (interview-level insight)
Even though Redis can support persistence (RDB/AOF), SONiC:
- does not rely on Redis persistence for correctness
- instead relies on replay from CONFIG_DB
This design:
- avoids stale/inconsistent state
- ensures clean initialization every boot
🔹 Real-world impact
- Counters reset after reboot
- BGP sessions re-establish
- Interfaces reconfigure from saved config
- Temporary states (like learned MACs) are relearned
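The "blueprint on disk, working memory in RAM" idea can be sketched with a JSON file and a few dictionaries. This Python example is only a model of the reboot behaviour described above: `save_config` and `boot` are hypothetical helpers, the file is written to a temporary directory rather than /etc/sonic, and the loop that fills APPL_DB is a stand-in for the daemons that rebuild state from CONFIG_DB.

```python
import json
import os
import tempfile


def save_config(config_db, path):
    """Persist CONFIG_DB to disk, the equivalent of saving the blueprint."""
    with open(path, "w") as handle:
        json.dump(config_db, handle)


def boot(path):
    """Start with empty databases and reload only CONFIG_DB from disk."""
    dbs = {"CONFIG_DB": {}, "APPL_DB": {}, "ASIC_DB": {}, "COUNTERS_DB": {}}
    with open(path) as handle:
        dbs["CONFIG_DB"] = json.load(handle)
    # Stand-in for the daemons rebuilding derived state from CONFIG_DB.
    for table, entries in dbs["CONFIG_DB"].items():
        for key, fields in entries.items():
            dbs["APPL_DB"][f"{table}_TABLE:{key}"] = dict(fields)
    return dbs


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "config_db.json")
    save_config({"VLAN": {"Vlan100": {"vlanid": "100"}}}, path)

    running = boot(path)
    running["COUNTERS_DB"]["COUNTERS:Ethernet0"] = {"rx_packets": "12345"}

    after_reboot = boot(path)  # RAM is gone, only the file survives
    print(after_reboot["CONFIG_DB"])
    print(after_reboot["COUNTERS_DB"])
```

After the second `boot`, the VLAN configuration and its derived APPL_DB entry are back, while the counter written at runtime is gone.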