Networking Hardware
Last updated: 2026-05-29
Recent Finds
PCIe 8.0 Draft 0.5 Released: 1 TB/s Bidirectional at x16, Final Spec 2028 (PCI-SIG, May 7, 2026)
PCI-SIG released Draft 0.5 of the PCIe 8.0 specification on May 7, 2026: the first public draft milestone, incorporating member feedback on Draft 0.3. Key specs: 256 GT/s raw bit rate and 1 TB/s bidirectional bandwidth across a ×16 link, which is 2× PCIe 7.0 and 8× PCIe 5.0 at the same lane count. Signal encoding: retains the PAM4 signaling and Flit-based forward error correction introduced in PCIe 6.0 (not a new encoding scheme), so the bandwidth gain comes from the higher physical-layer signaling rate. New connector technology is under evaluation to handle the higher signal integrity requirements. The final specification remains on track for 2028, giving the ecosystem ~2 years to prototype. Target applications: AI accelerators, data centers, high-speed networking, CXL-adjacent devices. Architectural significance for this wiki's interconnect thread: PCIe 8.0 at ×4 reaches 256 GB/s bidirectional, already past the ~128 GB/s bidirectional of the PCIe 5.0 ×16 links that connect GPUs to CPUs today. The key implication is for CXL: a CXL generation running on the PCIe 8.0 physical layer at ×16 would provide the same 1 TB/s of bidirectional coherent memory fabric bandwidth, roughly an order of magnitude (8×) above a PCIe 5.0 ×16 link, which is the step needed to make rack-scale CXL memory pooling more competitive with local HBM on bandwidth (latency is a separate question). Combined with Astera Labs Scorpio X-Series (PCIe 6, already in this wiki), Marvell Structera S 60260 (PCIe 6.0), and Panmnesia PANSWITCH (PCIe 6.4/CXL 3.2), the roadmap is clear: PCIe 6.x ships in 2026, PCIe 7.0 in 2027+, and the PCIe 8.0 specification is finalized in 2028, with each generation doubling fabric bandwidth for AI inference clusters.
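The bandwidth figures above follow from simple per-lane arithmetic. A minimal sketch, assuming raw signaling rates only (encoding, FEC, and protocol overhead are ignored, and the function name is illustrative rather than taken from any specification):

```python
# Raw per-lane signaling rates in GT/s for each PCIe generation.
PCIE_GTS = {"5.0": 32, "6.0": 64, "7.0": 128, "8.0": 256}


def pcie_bandwidth_gbytes(gen: str, lanes: int, bidirectional: bool = True) -> float:
    """Raw link bandwidth in GB/s, ignoring encoding, FEC, and protocol overhead."""
    # Treat 1 GT/s as 1 Gb/s per lane; divide by 8 to convert bits to bytes.
    per_direction = PCIE_GTS[gen] * lanes / 8
    return per_direction * 2 if bidirectional else per_direction


for gen in PCIE_GTS:
    print(f"PCIe {gen} x16: {pcie_bandwidth_gbytes(gen, 16):.0f} GB/s bidirectional")

# PCIe 8.0 x4 works out to 256 GB/s bidirectional, matching the figure in the text.
print(f"PCIe 8.0 x4: {pcie_bandwidth_gbytes('8.0', 4):.0f} GB/s bidirectional")
```

At ×16, PCIe 8.0 comes out to 1,024 GB/s bidirectional, which is the headline 1 TB/s figure.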
Astera Labs Computex 2026: PCIe 6 Scale-Up Optics First Public Demo Confirmed (May 27, 2026)
Astera Labs announced May 27 that it will host a press conference and live demos at Computex 2026 (Taipei, June 2–5). The June 3 press conference will feature the industry-first public demonstration of PCIe 6 scale-up optics leveraging COSMOS for end-to-end optical link management — the first live, hands-on showing of the Scorpio X-Series (announced May 5) in full optics configuration. Also on demo: the complete rack-scale connectivity portfolio including Aries PCIe 5/6 smart retimers and Leo CXL memory interconnects. Significance: this is the milestone that moves the Scorpio X-Series 320-lane switch from paper announcement to demonstrated product — the PCIe 6 optics integration shows the full end-to-end scale-up fabric (switch silicon + optical PHY + link management software) working together, resolving the "how does 320 lanes of PCIe 6 actually reach across a rack?" question. Combined with the Marvell Structera S 60260 (260-lane PCIe 6.0, Q3 sampling) and Panmnesia PANSWITCH (PCIe 6.4/CXL 3.2 H2 2026 mass production), Computex will be the first event where all three competing PCIe 6 fabric switch architectures are simultaneously visible — a convergence point for the networking hardware competitive landscape.
Astera Labs Scorpio X-Series: 320-Lane PCIe 6 Memory-Semantic AI Fabric Switch (May 5, 2026)
Astera Labs announced the Scorpio X-Series 320 Lane Smart Fabric Switch on May 5, 2026 — the highest-radix open memory-semantic scale-up fabric switch in the market. Key architecture: 320 lanes of PCIe 6 connectivity (vs. Marvell Structera S 60260 at 260 lanes), designed to replace multiple legacy switches in a single hop, reducing both latency and rack complexity. The critical differentiator is the memory-semantic fabric: GPUs and AI accelerators access resources spread across the fabric using simple load/store operations — the entire fabric behaves as a unified memory pool, eliminating packet-translation overhead at the interconnect layer. Hardware-accelerated Hypercast and In-Network Compute engines boost collective operations (all-reduce, all-gather) by up to 2×, directly improving tokens-per-watt on AI serving and training. The companion Scorpio P-Series family has also been expanded from 32 to 320 lane configurations, giving data center architects granular PCIe fabric switch options. Production ramp: 2H 2026; first showcased at Computex 2026 (Taipei, June 2–5) with industry-first PCIe 6 scale-up optics demonstrations via COSMOS end-to-end link management. Market context: targets the merchant scale-up switch silicon market projected at $20B by 2030 — the same market Marvell (Structera S 60260, PCIe 6.0, 260 lanes, Q3 2026 sampling) and Panmnesia (PCIe 6.4/CXL 3.2 fusion, H2 2026 mass production) are contesting. Astera Labs Q1 2026 revenue hit $308.4M (+93% YoY), confirming the company's trajectory as the third major interconnect ASIC player alongside Marvell and Broadcom.
NVIDIA $2B Strategic Investment in Marvell: NVLink Fusion + AI Ethernet Scale-Up (NextPlatform, March 31, 2026)
NVIDIA's $2B strategic investment in Marvell (announced March 31, 2026) is the dominant SmartNIC/DPU story heading into Q2 2026. The deal has two tracks: (1) NVLink Fusion compatibility: Marvell will build custom XPU silicon that connects to NVIDIA's NVLink scale-up fabric, ensuring NVIDIA ecosystem lock-in even on third-party accelerators ("NVLink requires at least one NVIDIA component per platform"). (2) AI Ethernet fabric and silicon photonics: the two companies will co-develop both Ethernet-based scale-out networking and silicon photonics components for AI infrastructure. Strategic significance: Marvell is already the CXL/PCIe switching leader (Structera S 30260/60260) and a custom ASIC infrastructure partner for hyperscalers (Amazon, Microsoft). The NVIDIA investment ties Marvell's interconnect portfolio to NVIDIA's ConnectX NICs, BlueField DPUs, and Spectrum-X switches, forming a unified rack-scale AI platform. For the DPU/SmartNIC market, this is the clearest signal yet that the AI infrastructure stack is consolidating around a few deep bilateral partnerships rather than a competitive multi-vendor market. Panmnesia's PCIe 6.4/CXL 3.2 fusion chip (already in this wiki) competes directly with Marvell's CXL portfolio; the NVIDIA backing makes Marvell's position significantly more entrenched.
Beluga: CXL-Based Memory Architecture for LLM KV-Cache (April 7, 2026)
Published April 7, 2026, Beluga is the first paper to directly apply CXL memory pooling to LLM inference KV-cache management — one of the most memory-bandwidth-intensive bottlenecks in large model serving. The core architecture: KV-cache tensors are disaggregated from GPU HBM into CXL-attached DDR memory pools, with intelligent prefetching and tiering policies that maintain sub-millisecond access latency for hot cache entries while dramatically expanding total cache capacity. The application is high-signal because it directly connects two major infrastructure trends: (1) CXL memory disaggregation (Marvell Structera S 30260, Panmnesia PANSWITCH — both already in this wiki) finally has a production-relevant killer app that justifies the architectural complexity; (2) LLM serving is the dominant new workload driving data center memory demand. Beluga validates that CXL switching infrastructure is not just a hyperscaler science project but addresses a concrete serving bottleneck that any organization deploying frontier models at scale will hit. Combined with the OCI-MSA optical interconnect layer (already in this wiki), CXL+CPO becomes the emerging two-layer memory-and-interconnect architecture for next-generation AI inference clusters.
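The hot/cold tiering idea described above can be illustrated with a toy two-tier cache. This is a generic LRU sketch, not Beluga's actual policy: the class name, the tier names, and the promote-on-access rule are all assumptions made for illustration, and prefetching is left out entirely.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a small 'HBM' tier with LRU demotion to a larger 'CXL' tier."""

    def __init__(self, hbm_capacity: int):
        self.hbm_capacity = hbm_capacity
        self.hbm = OrderedDict()  # hot tier, ordered least- to most-recently used
        self.cxl = {}             # cold tier, treated as unbounded in this sketch

    def put(self, key, value):
        self.cxl.pop(key, None)   # a key lives in exactly one tier
        self.hbm[key] = value
        self.hbm.move_to_end(key)
        # Demote least-recently-used entries once the hot tier overflows.
        while len(self.hbm) > self.hbm_capacity:
            cold_key, cold_value = self.hbm.popitem(last=False)
            self.cxl[cold_key] = cold_value

    def get(self, key):
        if key in self.hbm:
            self.hbm.move_to_end(key)
            return self.hbm[key]
        if key in self.cxl:
            value = self.cxl.pop(key)
            self.put(key, value)  # promote back into the hot tier on access
            return value
        return None


cache = TieredKVCache(hbm_capacity=2)
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
print(list(cache.hbm), list(cache.cxl))  # "a" was demoted to the cold tier
cache.get("a")                           # access promotes "a" and demotes "b"
print(list(cache.hbm), list(cache.cxl))
```

A real serving stack would track KV-cache blocks per sequence and weigh transfer cost against recompute cost; the sketch only shows how capacity expands while recently used entries stay in the fast tier.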
Google + Intel Multi-Year AI Infrastructure Partnership: Xeon 18A + Custom IPU Co-Development (April 9, 2026)
Google and Intel announced a multi-year collaboration (April 9, 2026) to advance AI and cloud infrastructure, with two tracks: (1) Intel Xeon 6 CPUs built on Intel's 18A process node (fabricated at Intel's Arizona fabs) will run AI training and inference workloads across Google's data centers, a significant endorsement of Intel's foundry strategy and its most advanced process node; (2) expanded co-development of custom ASIC-based Infrastructure Processing Units (IPUs), where Intel and Google build purpose-built programmable accelerators to offload networking, storage, and security functions from host CPUs. This is the same concept as a SmartNIC/DPU, but delivered as a jointly developed ASIC rather than a merchant silicon product. The IPU relationship started in 2022, but this 2026 deal deepens it to a multi-year roadmap commitment. The architectural implication: Google's hyperscale approach is explicitly heterogeneous compute, with CPUs for AI inference, GPUs/TPUs for training, and purpose-built IPUs for infrastructure offload: a three-tier compute hierarchy that reduces interference between tenant workloads and infrastructure functions. For the SmartNIC/DPU market: the Google-Intel IPU is a custom ASIC development track that runs parallel to, and potentially competes with, merchant IPU/DPU silicon from NVIDIA BlueField, Marvell Octeon, and AMD/Pensando, a signal that hyperscalers at Google's scale may increasingly develop custom IPU silicon rather than buy merchant products.
Broadcom Project Glasswing: Fully Transparent Programmable 51.2 Tbps Switch ASIC (Broadcom, April 2026)
Announced at Broadcom's Memory Fabric Forum in April 2026, Project Glasswing is a ground-up architectural departure from fixed-function switching ASICs. The core concept is a "glass pipeline": a fully transparent, programmable processing architecture where every stage of the packet-forwarding pipeline is defined by operator-supplied programs written in an extended version of P4. Unlike traditional switch ASICs (Tomahawk series), where forwarding logic is hardcoded at tape-out, Glasswing exposes every processing stage to runtime modification. All stages expose their internal state to a unified telemetry framework, enabling real-time observability into how packets traverse the chip, not just aggregate counters. Performance: 51.2 Tbps aggregate switching bandwidth, with programmability adding less than 200 nanoseconds of latency vs. an equivalent fixed-function design. First samples ship to lead customers in Q3 2026; volume production follows in early 2027. Architectural significance: Glasswing represents the most ambitious attempt yet to combine programmability of the kind offered by the Cisco Silicon One G300 (already in this wiki) with Broadcom's high-volume merchant silicon economics. If the latency overhead claim holds at scale, it resolves the fundamental objection to programmable switching ASICs (performance vs. flexibility), potentially making fixed-function ASICs architecturally obsolete for any deployment that can absorb a sub-200-nanosecond latency penalty. The extended-P4 programming model directly accelerates deployment of advanced network functions (stateful firewalls, INT, in-network ML) without firmware update cycles. Note: Broadcom Project Glasswing (silicon) is distinct from Anthropic's "Project Glasswing" (AI cybersecurity initiative, same week).
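To make the "glass pipeline" idea concrete, here is a minimal software sketch of a pipeline whose stages are operator-supplied, swappable at runtime, and individually observable. It is a conceptual model only: the `Stage` and `Pipeline` classes, the packet fields, and the trace format are hypothetical and say nothing about Glasswing's real programming interface or its P4 extensions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Packet = Dict[str, int]


@dataclass
class Stage:
    name: str
    program: Callable[[Packet], Packet]  # operator-supplied; can be swapped at runtime
    packets_seen: int = 0

    def run(self, pkt: Packet, trace: List[dict]) -> Packet:
        out = self.program(dict(pkt))    # work on a copy so the input stays observable
        self.packets_seen += 1
        trace.append({"stage": self.name, "in": pkt, "out": out})
        return out


@dataclass
class Pipeline:
    stages: List[Stage] = field(default_factory=list)

    def process(self, pkt: Packet):
        trace: List[dict] = []           # per-packet, per-stage telemetry
        for stage in self.stages:
            pkt = stage.run(pkt, trace)
        return pkt, trace


def dec_ttl(pkt: Packet) -> Packet:
    pkt["ttl"] -= 1
    return pkt


def route(pkt: Packet) -> Packet:
    pkt["port"] = pkt["dst"] % 4         # toy forwarding decision
    return pkt


pipe = Pipeline([Stage("ttl", dec_ttl), Stage("route", route)])
out, trace = pipe.process({"dst": 10, "ttl": 64})
print(out)
print([t["stage"] for t in trace])

# Runtime modification: replace the routing stage without rebuilding the pipeline.
pipe.stages[1].program = lambda pkt: {**pkt, "port": 0}
print(pipe.process({"dst": 10, "ttl": 64})[0])
```

The point of the sketch is the contrast with aggregate counters: every stage records what it received and what it emitted for each packet, and a stage's behavior can change between packets.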
Panmnesia PCIe 6.4/CXL 3.2 Fusion Fabric Switch: Mass Production H2 2026 (Blocks & Files, April 16, 2026)
South Korean fabless startup Panmnesia is pursuing two simultaneous unification efforts: (1) a PCIe 6.4 + CXL 3.2 fusion fabric switch targeting mass production in H2 2026, claiming to be the only company with port-based routing support for a combined PCIe/CXL switch; and (2) a UALink (Ultra Accelerator Link) + Ethernet convergence effort targeting coherent AI scale-out fabrics. The PCIe/CXL fusion is architecturally significant: PCIe 6.4 and CXL 3.2 share the same physical layer but serve different logical purposes (I/O vs. cache-coherent memory), so a single chip handling both eliminates the need for separate PCIe switches and CXL switches in an AI server rack, directly reducing latency and BOM. Mass production in H2 2026 would make Panmnesia the first to ship a combined PCIe 6.4/CXL 3.2 switch at volume, ahead of Marvell's XConn-based PCIe 6.0/CXL 3.0 stack (Structera S 60260, sampling Q3 2026). The UALink/Ethernet convergence track is the longer-term play: UALink (CXL 3.0 + PCIe 6.0 + coherence) over an Ethernet fabric would allow coherent memory access across rack boundaries without specialized optical interconnects, which makes it complementary to OCI-MSA (which targets die-level optics) rather than competing. Panmnesia represents the emerging tier of fabless semiconductor companies building CXL-era switching silicon that could threaten the Marvell/Broadcom duopoly in interconnect switching.
XPO MSA Hits 100-Partner Milestone: Arista's Liquid-Cooled Pluggable Optics Ecosystem Scales (Arista / Financial Content, April 6, 2026)
Arista's eXtra-dense Pluggable Optics (XPO) Multi-Source Agreement, launched at OFC 2026 (March 11), has expanded to over 100 member companies as of April 6, 2026 — the fastest MSA growth rate in recent optical networking history. XPO specs: each module delivers 12.8 Tbps (64 lanes × 200 Gbps), integrated cold plate handling up to 400W per module, enabling 204.8 Tbps front-panel density per OCP rack unit — 4× the density of current 1600G OSFP solutions. Marvell confirmed joining XPO MSA to accelerate innovation in AI optical modules. The critical distinction from OCI-MSA: OCI-MSA targets optical-electrical interconnect at the ASIC die interface (CPO and on-board optics interoperability between AI chip vendors), while XPO targets the pluggable module form factor and liquid-cooling thermal interface at the chassis level — they address different layers of the interconnect stack and are complementary, not competing. Production is not anticipated until 2027, but the 100-partner milestone with 40+ optical equipment producers pledged to the spec signals that XPO is becoming the pluggable successor form factor to OSFP for AI scale-up networking. Directly relevant: as CPO (Broadcom TH6, NVIDIA Spectrum-X Photonics) dominates east-west GPU fabric in H2 2026+, XPO positions as the high-density pluggable option for facilities that cannot yet adopt CPO's non-field-replaceable constraint.
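The XPO density figures above can be cross-checked with integer arithmetic. In this sketch the module counts per rack unit are derived from the stated totals, not quoted from the MSA, so treat them as inferred values.

```python
# Figures taken from the text, expressed in Gbps to keep the arithmetic exact.
LANES_PER_MODULE = 64
GBPS_PER_LANE = 200
RU_FRONT_PANEL_GBPS = 204_800   # 204.8 Tbps per OCP rack unit
MAX_WATTS_PER_MODULE = 400      # cold-plate rating per module
OSFP_MODULE_GBPS = 1_600        # current 1600G OSFP baseline
DENSITY_MULTIPLE = 4            # XPO vs. OSFP, as stated

module_gbps = LANES_PER_MODULE * GBPS_PER_LANE               # 12,800 Gbps = 12.8 Tbps
modules_per_ru = RU_FRONT_PANEL_GBPS // module_gbps          # inferred: 16 XPO modules
osfp_ru_gbps = RU_FRONT_PANEL_GBPS // DENSITY_MULTIPLE       # inferred: 51.2 Tbps per RU
osfp_modules_per_ru = osfp_ru_gbps // OSFP_MODULE_GBPS       # inferred: 32 OSFP cages
max_cooled_watts = modules_per_ru * MAX_WATTS_PER_MODULE     # upper bound on module heat

print(f"XPO module: {module_gbps / 1000} Tbps, {modules_per_ru} modules per RU")
print(f"OSFP baseline: {osfp_ru_gbps / 1000} Tbps per RU ({osfp_modules_per_ru} modules)")
print(f"Worst-case liquid-cooled module power per RU: {max_cooled_watts} W")
```

The inferred OSFP baseline of 32 modules per rack unit is consistent with the "4× the density" claim; the worst-case thermal load is an upper bound that assumes every module runs at its 400W rating.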
OCI-MSA v1.0 Specification Published: Available at oci-msa.org (April 2026)
The OCI-MSA consortium (AMD/Broadcom/Meta/Microsoft/NVIDIA/OpenAI) published its v1.0 line interface specification — 24 pages, co-authored by all six founding members. This upgrades OCI-MSA from "announced" to "spec live": a binding technical reference that module and silicon vendors can now design to. Defines GEN1: 4λ × 50 Gbps NRZ = 200 Gbps/direction; GEN2: 400 Gbps/direction BiDi (up to 800 Gbps/fiber); 3.2 Tbps roadmap. The silicon-centric (not module-centric) framing is the key shift: the interface is specified at the ASIC I/O level, enabling direct ASIC-optical integration (CPO) while preserving compatibility with pluggable OBO form factors. Multi-vendor interoperability testing can now formally begin. Complements the XPO MSA above: OCI-MSA targets the die-level ASIC optical interface; XPO targets the liquid-cooled pluggable chassis interface — different layers, different production timelines.
OCI-MSA Announced at OFC 2026: AMD/Broadcom/Meta/MSFT/NVIDIA/OpenAI Standardize Optical Compute Interconnects
Six hyperscalers and silicon vendors jointly announced the Optical Compute Interconnect Multi-Source Agreement (OCI-MSA) at OFC 2026 (March 17–19, Los Angeles). Specs: GEN1 at 4λ × 50 Gbps NRZ (200 Gbps/direction), GEN2 at 400 Gbps/direction BiDi, roadmap scaling to 3.2 Tbps/fiber. Supports pluggable, on-board optics (OBO), and CPO form factors. Compatible with existing SerDes-based ASICs while providing a path to direct ASIC integration. This is the OIF CPO standard convergence event the wiki has been tracking — but it arrived via a hyperscaler-driven MSA rather than via OIF itself. The practical implication: OIF's CPO specifications were too slow and too narrowly mechanical; the hyperscalers who actually need this interoperability at scale created their own binding MSA. The OCI-MSA is now the de-facto industry baseline for optical compute interconnect interoperability, complementing (not replacing) OIF's coherent module standards. Directly confirms NVIDIA Spectrum-X Photonics and Broadcom Taurus-based modules will share an interoperability framework rather than diverging into proprietary islands.
Marvell Structera S 60260: 260-Lane PCIe 6.0 Switch for AI Scale-Up Infrastructure (Marvell, OFC 2026)
Announced at OFC 2026 (March 2026), the Structera S 60260 is the industry's first 260-lane PCIe 6.0 switch, delivering 2× the lane density of competing products and targeting AI data center scale-up interconnects. This is the PCIe 6.0 sibling to the Structera S 30260 (CXL 3.0, same lane count; see below); the two are pin-compatible, enabling a single hardware platform to run either PCIe or CXL workloads. Engineering test samples are available now; customer sampling is confirmed for Q3 2026. The product is built on technology from Marvell's XConn Technologies acquisition ($540M, 2026, already in this wiki). Architectural significance: PCIe 6.0 doubles the raw bandwidth of PCIe 5.0 (64 GT/s per lane vs. 32 GT/s) while maintaining backward compatibility, so the 60260 makes it possible to build 260-lane PCIe 6.0 switch fabrics for AI training clusters without requiring PCIe 5.0→6.0 silicon upgrades at the endpoint (GPUs and accelerators can still link to the switch at PCIe 5.0 rates). Combined with the Structera S 30260 (CXL 3.0) for memory disaggregation, Marvell now holds the most complete CXL+PCIe switching portfolio in the market. It competes directly with Panmnesia's PCIe 6.4/CXL 3.2 fusion chip (already in this wiki), which ships in H2 2026 but has not yet confirmed 260-lane density.
Marvell Structera S 30260: 260-Lane CXL 3.0 Switch for Rack-Level Memory Pooling (Marvell, OFC 2026)
Announced at OFC 2026 (March 17), the Structera S 30260 is a 260-lane CXL 3.0 switch with 4 TB/s aggregate bandwidth enabling dynamic rack-level memory pooling across CPUs, GPUs, and XPUs without hardware replacement. Sampling begins Q3 2026. The prior-generation Structera S 20256 (CXL 2.0) is already in production. Architectural significance: CXL 3.0 enables peer-to-peer coherent device-to-device memory access (not just host-mediated) — the Structera S 30260 extends this to multi-host, multi-device pools at rack scale. For AI training clusters: memory disaggregation at CXL 3.0 fidelity means GPUs can access a shared pool of HBM-equivalent memory across the rack, not just their own on-package HBM. Combined with the Marvell XConn acquisition (PCIe 6/CXL 3.1 switch), Marvell now holds the most complete CXL-to-memory-pooling stack in the market. The Q3 2026 sampling timeline is the most concrete CXL 3.x product milestone in the industry — directly closes the CXL 3.x adoption open question in this wiki.
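The 4 TB/s aggregate figure is consistent with 260 lanes running at the 64 GT/s rate of the PCIe 6.0 physical layer that CXL 3.0 uses, summed over both directions. A rough check, assuming raw signaling rates with no protocol overhead:

```python
LANES = 260
GTS_PER_LANE = 64  # CXL 3.0 runs on the PCIe 6.0 physical layer at 64 GT/s

# Treat 1 GT/s as 1 Gb/s per lane and convert bits to bytes.
per_direction_gbytes = LANES * GTS_PER_LANE / 8
bidirectional_tbytes = 2 * per_direction_gbytes / 1000

print(f"Per direction: {per_direction_gbytes:.0f} GB/s")
print(f"Both directions: {bidirectional_tbytes:.2f} TB/s")
```

This gives roughly 2.08 TB/s per direction and 4.16 TB/s in total, so the quoted "4 TB/s aggregate" most plausibly counts both directions across all lanes.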
Marvell Acquires XConn Technologies ($540M): Only Production CXL+PCIe Switch on a Single Die (Marvell, 2026)
Marvell closed its $540M acquisition of XConn Technologies in early 2026, gaining the only known production-shipping hybrid CXL+PCIe switch on a single die. XConn's portfolio: a PCIe 5/CXL 2.0 switch in production today and a PCIe 6/CXL 3.1 switch currently sampling. This closes the last gap in Marvell's rack-scale AI connectivity stack: it already leads in optical DSPs (PAM4 and coherent) and DPU silicon (Octeon), and now holds the CXL switching layer. The XConn acquisition is architecturally significant: CXL 3.x enables coherent, many-to-many memory sharing across compute nodes at the rack level, allowing GPU/accelerator HBM to be pooled and accessed by multiple hosts. At current AI training cluster scale (clusters of thousands of GPUs), this changes the memory provisioning model, with disaggregated memory pools replacing per-node HBM as the cost-optimization lever. Combined with NVIDIA's Spectrum-X (optical) and Broadcom's Tomahawk 6 (CPO switching), the 2026 hyperscale interconnect picture is: optics dominate east-west GPU fabric (Broadcom/NVIDIA), while CXL switches (Marvell/XConn) handle north-south memory disaggregation. The $540M price reflects XConn holding the only production-ready device in this space.
Broadcom Taurus BCM83640: 3nm, Industry-First 400G/Lane Optical DSP, Full Specs (ServeTheHome, March 2026)
Deep-dive on the Broadcom BCM83640 optical PAM4 DSP, fabricated on a 3nm process: a monolithic 8-to-4 gearbox PHY and driver mapping 8×200G electrical lanes to 4×400G optical lanes, doubling the per-lane optical rate versus the previous 200G/lane generation. A single 1RU switch using 1.6T pluggable modules built on this DSP delivers 102.4T switching capacity, double the TH5-Bailly equivalent. The 3.2T roadmap (dual-port 400G/lane) will support 204.8T switching fabrics, addressing the networking bottleneck for AI clusters above 100,000 GPUs. The design complies with current IEEE standards and interoperates with Broadcom's own 400G electro-absorption modulated laser (EML) and photodiode components, simplifying supply chain qualification for module makers. Early-access sampling started at OFC 2026 (March 17–19, Los Angeles); mass production is targeted for late 2026. The 3nm node delivers ~30% power reduction per bit at 400G/lane versus previous 5nm/7nm DSPs, which is critical when optical DSP power is the dominant contributor to the pluggable module thermal budget. This is the DSP that will underpin 1.6T and 3.2T pluggable transceivers from module vendors including Eoptolink, Innolight, and others building to the Broadcom reference design.
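The gearbox ratio and switch capacities above reduce to a few divisions. A small sketch, with hypothetical helper names, showing how the 8-to-4 mapping and the 102.4T and 204.8T figures relate:

```python
def gearbox_optical_lanes(electrical_lanes: int, electrical_gbps: int, optical_gbps: int) -> int:
    """Number of optical lanes needed to carry the same total rate as the electrical side."""
    total_gbps = electrical_lanes * electrical_gbps
    if total_gbps % optical_gbps:
        raise ValueError("total rate is not a whole number of optical lanes")
    return total_gbps // optical_gbps


def modules_for_switch(switch_gbps: int, module_gbps: int) -> int:
    """Pluggable modules needed to expose the full switching capacity on the front panel."""
    return switch_gbps // module_gbps


print(gearbox_optical_lanes(8, 200, 400))   # 8x200G electrical maps onto 4x400G optical
print(modules_for_switch(102_400, 1_600))   # 1.6T modules for a 102.4T switch
print(modules_for_switch(204_800, 3_200))   # 3.2T modules for a 204.8T switch
```

Both switch generations work out to 64 modules, which suggests the 3.2T roadmap doubles capacity at constant front-panel module count.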
HyperLight 400G/Lane TFLN PICs on Chiplet Platform for Next-Gen AI Interconnects (HyperLight / SocialNews, March 2026)
HyperLight announced a 400G-per-lane TFLN photonic integrated circuit (PIC) family on its Chiplet platform — designed as a drop-in modulator for AI networking transceivers, delivering low insertion loss, low drive voltage (~1V Vπ, vs. ~5V for silicon photonics), and exceptional electro-optic bandwidth. These are the modulators intended for the HyperLight/UMC/Wavetek HVM line (announced March 11). The PEC demonstration (1.6T-DR8 reference transceiver at 20W, ~20% lower than silicon photonics alternatives) validates that TFLN can compete on system-level power even after accounting for assembly overhead. The strategic significance: at 400G/lane, the Pockels-effect TFLN modulator's speed advantage over silicon photonics' plasma-dispersion modulators becomes decisive — silicon photonics EO bandwidth degrades approaching 400G/lane while TFLN maintains headroom. This is the technical moat enabling TFLN to capture a share of the 1.6T modulator market even as silicon photonics dominates current shipping volumes. Market forecast: >100M units of 1.6T/3.2T transceivers over 5 years — TFLN targets a slice of the modulator supply, not the full module.
HyperLight + UMC + Wavetek: TFLN Chiplet Platform Enters High-Volume Manufacturing (March 11, 2026)
The most concrete TFLN volume-manufacturing milestone to date: a three-party partnership announced March 11, 2026. HyperLight architects the TFLN Chiplet platform; Wavetek brings the technology from lab to a customer-qualified HVM line on 6-inch CMOS wafers; UMC adds 8-inch (200mm) production capacity, reducing per-die cost and increasing throughput for AI and cloud interconnect scale. A parallel Jabil partnership adds high-volume assembly expertise for hyperscale data center deployment. Target: 1.6T module electro-optic modulator supply chains, with 3.2T on the roadmap. This is the first time a TFLN photonics platform has been simultaneously qualified across both 6-inch and 8-inch CMOS foundry lines; the dual-wafer-size qualification allows volume ramp on 6-inch while 8-inch drives cost reduction. Combined with G&H/CCRAFT (wafer supply) and Raytheon/AFRL (defense supply), the TFLN production ecosystem in 2026 has cleared every scale-up barrier: technology readiness, defense funding, commercial volume driver, and now HVM foundry qualification across two wafer sizes.
Raytheon + AFRL: U.S. Domestic TFLN Wafer Production Line for Defense & Commercial (RTX, Feb 2026)
The U.S. Air Force Research Laboratory awarded Raytheon a contract (announced Feb 17, 2026) to establish a domestic thin-film lithium niobate (TFLN) wafer production line — a direct national security response to the fact that a single Chinese TFLN manufacturer currently dominates the international market. Raytheon will leverage its ion-slicing expertise in collaboration with G&H's Cleveland, Ohio facility, with G&H manufacturing TFLN wafers at low-rate initial production starting early 2026. The model is explicitly merchant-supplier: open, third-party access to TFLN wafers for any U.S. defense contractor or commercial customer — not a captive supply. This is the supply-chain complement to the technology milestones already tracked (CCRAFT TFLN foundry, G&H LRIP): it adds a defense-backed rationale and government contract structure that de-risks continued TFLN investment across the ecosystem. Applications targeted: advanced sensing and communications for defense, AI datacenter compute, and telecom. The TFLN wafer market is also transitioning from 4"/6" wafers to 8-inch (200mm) wafers in 2026, reducing per-die cost and enabling higher production volumes needed for the hyperscale transceiver market. Synthesis: TFLN has now cleared every scale-up barrier — technology readiness (CCRAFT, G&H), defense funding (AFRL/Raytheon), and commercial volume driver (Broadcom Taurus 400G/lane, NVIDIA Spectrum-X Photonics CPO).
Eoptolink 400G-per-Lambda 1.6T DR4 Transceiver at OFC 2026 (PR Newswire)
Eoptolink demonstrated a 1.6T DR4 OSFP transceiver using an 8:4 PAM4 DSP that maps an 8×200G electrical interface to a 4×400G optical interface, halving the required fiber count versus prior 1.6T DR8 architectures. This approach competes directly with Broadcom's Taurus BCM83640 (3nm, first 400G/lane DSP announced at OFC 2026), with both targeting the same 1.6T and eventual 3.2T module market. The key competitive dynamic: Broadcom's Taurus is a DSP-only chip that module makers must build around, while Eoptolink is offering a complete integrated transceiver. Industry forecasts project over 100 million units of 1.6T/3.2T transceivers shipped over the next five years, with roughly half using 400G-per-lane optics. The DR4 architecture (4 fiber pairs instead of DR8's 8) matters for hyperscale deployments where fiber count in switch cabling is a real operational constraint.
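The fiber-count argument is easy to quantify. The sketch below assumes one transmit and one receive fiber per optical lane (the "fiber pair" in the text) and, purely for illustration, a switch populated with 64 modules; that module count is an assumption, not a figure from the announcement.

```python
FIBERS_PER_LANE = 2  # one transmit fiber plus one receive fiber per optical lane


def fibers_per_module(optical_lanes: int) -> int:
    return optical_lanes * FIBERS_PER_LANE


def fibers_per_switch(optical_lanes: int, modules: int) -> int:
    return fibers_per_module(optical_lanes) * modules


MODULES = 64  # illustrative assumption for a fully populated switch

dr8 = fibers_per_switch(optical_lanes=8, modules=MODULES)
dr4 = fibers_per_switch(optical_lanes=4, modules=MODULES)
print(f"DR8: {fibers_per_module(8)} fibers per module, {dr8} per switch")
print(f"DR4: {fibers_per_module(4)} fibers per module, {dr4} per switch")
print(f"Fibers saved per switch: {dr8 - dr4}")
```

Under these assumptions, moving from DR8 to DR4 removes 512 fibers from each switch, which is where the operational benefit in cabling comes from.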
Thin-Film Lithium Niobate (TFLN) Enters Low-Rate Production: G&H + CCRAFT (G&H, CSEM, 2026)
G&H's Cleveland, Ohio facility is transitioning from process development into low-rate initial production of TFLN wafers in early 2026, taking over from QCi's Tempe, Arizona fab (commissioned Q1 2025, rated >$180M annual capacity). Separately, CSEM's CCRAFT spinoff (https://optics.org/news/16/5/19) claims to be the first production-ready pure-play TFLN foundry on 150mm wafers, already delivering pilot production. A self-coherent TFLN transceiver architecture (dual-polarization 56 Gbaud 16-QAM over 20 km) has been published in JLT, establishing TFLN's coherent datacom credentials. Why TFLN matters: it offers far lower modulation voltage (~1V Vπ vs. ~5V for silicon photonics), enabling ultra-low-power coherent transceivers and Pockels-effect modulators at bandwidths beyond 100 GHz — critical for the post-800G era. The production infrastructure is finally catching up to the research promise: TFLN is no longer lab-only.
Broadcom TH6-Davisson Ships: 102.4 Tbps CPO Switch in Production (March 2026)
Tomahawk 6 "Davisson" (BCM78919) entered general production in March 2026 (early-access shipments began in Oct 2025). It is Broadcom's third-generation CPO design: 16 field-replaceable optical engines built on TSMC's COUPE process, running at 200 Gbps per channel, for an aggregate 102.4 Tb/s, double TH5-Bailly. Power: 3.5W per 800G port, 36% lower than TH5 and >70% lower than pluggable transceivers. At hyperscale, Broadcom projects $1.13B in 5-year power savings for a 100,000-XPU deployment. Meta Platforms' reliability testing showed zero link flaps across 1 million device hours. Broadcom has already shipped >50,000 TH5-Bailly units, so CPO is commercially deployed at scale, not merely sampled. A fourth-generation CPO design targeting 400 Gbps/lane is in development.
OFC 2026: Open CPX MSA, 400G PAM-4 DSP, Multi-Vendor 1.6T Interop
OFC 2026 marked the formation of the Open CPX MSA — a new multi-source agreement to standardize co-packaged optics specs across vendors, filling the gap the OIF had not fully addressed for CPO mechanicals. Broadcom unveiled the Taurus BCM83640, the industry's first 400G-per-lane PAM-4 DSP, halving lane count for 1.6T modules (4 lanes instead of 8). The Ethernet Alliance ran a 40-company multi-vendor 1.6T interoperability demonstration. OIF's largest-ever showcase: ~100 coherent modules from 15 vendors across 11 platforms, covering 400ZR, 800ZR, CEI-448G/CEI-224G, CMIS, and Energy Efficient Interfaces. Thin-Film Lithium Niobate (TFLN) cleared research status but complete TFLN transceivers remain unavailable at production volumes — silicon photonics still dominates.
CPO Economics 2026: Power and Cost vs. Pluggables
The economic divide between CPO and pluggables in 2026 is primarily about deployment context. For scale-up AI clusters (east-west XPU fabric): CPO wins decisively at 3.5W vs. ~15-16W per 800G port (roughly 4.3–4.6× lower power per port). At cluster scale, that differential becomes billion-dollar territory over a 5-year asset life. For scale-out AI fabric (spine-leaf, multi-tenant): 800G LPO or pluggables remain dominant in 2026 due to multi-vendor supply, operational familiarity, and simpler failure-replacement workflows. CPO's field-replaceable optical engine design helps but doesn't yet match pluggable simplicity. Market forecast: CPO at $840M–$1.05B by 2032, ~28–30% CAGR, with hyperscale early deployment through 2027 and broader commercialization afterward as silicon photonics integration matures.
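The per-port power gap can be turned into per-switch numbers with a short calculation. This sketch assumes a 102.4 Tb/s switch exposing 128 ports of 800G and continuous operation over five years; it counts optics power only and deliberately stops at energy, because any dollar figure would require an electricity price and cooling overhead that the text does not give.

```python
CPO_WATTS_PER_PORT = 3.5
PLUGGABLE_WATTS_PER_PORT = (15.0, 16.0)  # low and high ends of the quoted range
PORTS_PER_SWITCH = 128                   # assumption: 102.4 Tb/s / 800G per port
HOURS_IN_5_YEARS = 5 * 365 * 24          # assumption: continuous operation


def power_ratio(pluggable_watts: float) -> float:
    return pluggable_watts / CPO_WATTS_PER_PORT


def saved_kwh_per_switch(pluggable_watts: float) -> float:
    saved_watts = (pluggable_watts - CPO_WATTS_PER_PORT) * PORTS_PER_SWITCH
    return saved_watts * HOURS_IN_5_YEARS / 1000


for watts in PLUGGABLE_WATTS_PER_PORT:
    print(
        f"Pluggable at {watts:.0f} W/port: {power_ratio(watts):.1f}x the CPO power, "
        f"{saved_kwh_per_switch(watts):,.0f} kWh saved per switch over 5 years"
    )
```

Multiplying the per-switch savings by the number of switches in a large cluster, and then by energy and cooling costs, is how per-port differences of about 12W grow into the large multi-year figures vendors quote.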
NVIDIA Spectrum-X Photonics: Co-Packaged Optics Switches for Million-GPU AI Factories (H2 2026)
NVIDIA's Spectrum-X Photonics marks the first major commercial deployment of co-packaged optics (CPO) in a switch ASIC from a dominant vendor. Key specs: 100 Tb/s (128×800G or 512×200G) and 400 Tb/s configurations, with optical engines co-packaged alongside the switch ASIC, replacing the long board-level electrical traces between the ASIC and a pluggable optical transceiver. Claimed benefits: 3.5× power efficiency, 10× resiliency, 4× fewer lasers vs. pluggable optics, 1.3× faster deployment. Available H2 2026. The silicon photonics ecosystem (TSMC, Coherent, Lumentum) is now productized around NVIDIA's demand signal. Broader implication: CPO will become the de-facto interconnect standard for 51.2T+ AI switches. IDTechEx projects the CPO market to grow at 37% CAGR, reaching $20B by 2036.
Broadcom Tomahawk 5-Bailly: 50,000+ CPO Switches Shipped; Tomahawk 6 "Davisson" at 200G/lane Previewed (Apr 2026)
As of early April 2026, Broadcom has shipped over 50,000 Tomahawk 5-Bailly CPO switches — establishing CPO as commercially deployed at hyperscale, not merely sampled. Tomahawk 5 specs: 100G/lane optics, 30%+ system-level power savings vs. pluggable modules. The third-gen Tomahawk 6 "Davisson" is previewed as the industry's first 102.4 Tbps switch with 200G/lane optical interfaces integrated via TSMC's COUPE packaging. Simultaneously, Marvell's "Photonic Fabric" is being re-rated by analysts as an optical networking positioning play for 1.6T interconnects. Market implication: the pluggable transceiver era for top-of-rack AI cluster switches is over — CPO is the shipping baseline. The competitive battleground has shifted to 200G/lane gen-3 platforms and packaging yield. Together with NVIDIA Spectrum-X Photonics (400 Tb/s, H2 2026), this confirms that both major switch ASIC vendors have fully committed to CPO.
Co-Packaged Optics in 2026: From Demo to Commercial (EDN)
State-of-the-market analysis confirming CPO has reached an inflection point: two of the largest switch ASIC vendors are now shipping or sampling first-generation CPO products (NVIDIA and Broadcom both announced). The transition from pluggable (QSFP-DD, OSFP) to co-packaged resolves the "last inch" power problem — electrical traces from cage to ASIC consume significant power and limit bandwidth density at 51.2T+. Technical challenges: laser reliability (MTBF requirements vs. CPO accessibility for repairs), thermal management of integrated photonics, and standardization (OIF CPO specification still evolving). Key tradeoff: CPO is not field-replaceable — a dead laser may mean replacing the entire switch linecard.
Cisco Silicon One G300 — Next-Gen ASIC at Cisco Live EMEA 2026¶
Cisco unveiled the Silicon One G300 ASIC supporting up to 102.4 Tb/s switching capacity with programmable pipeline support. Significant market signal: programmable networking silicon grew from <3% market share in 2020 to over 18% by 2026 — fixed-function ASICs are being displaced. The G300 targets both core routing (replacing ASR 9000-era linecards) and high-density data center switching, with a unified architecture reducing the traditional router/switch product divide.
P4 + DPDK on SmartNICs: Performance Benchmarking (iWave Global)¶
Technical deep-dive into combining P4 programmable pipelines with DPDK at the SmartNIC layer. Key result: equivalent packet processing functionality in ~5,000 lines of P4 vs. millions of lines of C with DPDK alone — dramatic reduction in implementation complexity. The hybrid approach (P4 for match-action pipeline, DPDK for exception path and control plane) is becoming the architecture of choice for building software-defined data center switches without firmware update cycles.
Montage Technology PCIe 6.x / CXL 3.x Active Electrical Cable Solution (January 2026)¶
Montage's AEC announcement for PCIe 6.x/CXL 3.x interconnects addresses the emerging need for coherent memory pooling in disaggregated AI infrastructure. PCIe 8.0 standardization (256 GT/s per lane, ~1 TB/s bidirectional on x16) is in progress. Marvell's $540M acquisition of XConn signals aggressive investment in CXL switching silicon — enabling many-to-many memory sharing across compute nodes in a rack. CXL.mem is becoming the key protocol for GPU/accelerator memory expansion.
Panmnesia PANSWITCH: PCIe 6.4 / CXL 3.2 Fusion Fabric Switch — Mass Production H2 2026 (April 15, 2026)¶
Panmnesia announced April 15, 2026 it will mass-produce the PANSWITCH — the world's first switch chip to fully implement the CXL 3.2 specification with Port-Based Routing (PBR) — in H2 2026, with early access partner sampling already underway. The PANSWITCH supports mesh, dragonfly, and 3D torus topologies with double-digit nanosecond latency and can pool and reassign memory, compute, and accelerators across an entire rack in milliseconds. The PCIe 6.4 side of the chip implements the latest transport layer spec; the CXL 3.2 side enables fully coherent, peer-to-peer memory sharing across disaggregated racks — not just within a single node. Architectural significance: PANSWITCH is the most direct challenge to traditional HBM stacking and NVLink for AI cluster memory bandwidth. If CXL 3.2 pooling can deliver coherent bandwidth at rack scale with double-digit ns latency, it reduces the incentive for tight GPU-to-HBM packaging (which limits modularity and repair). Combined with the Marvell XConn CXL 3.1 switch and Celestial AI optical fabric, the CXL 3.x ecosystem is now competitive with proprietary interconnects for AI cluster memory disaggregation.
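To make the topology terms concrete, here is a generic sketch of a 3D torus: each node has six neighbors (one step either way along each axis, with wraparound), and the hop count between two nodes is the sum of the shorter way around each ring. This illustrates the topology only; it is not PANSWITCH's routing algorithm, and the function names are made up for the example.

```python
from itertools import product


def torus_neighbors(node, dims):
    """Neighbors of `node` in a 3D torus: +/-1 along each axis, with wraparound."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum hops between two nodes: shortest way around each ring, summed."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def torus_diameter(dims):
    """Worst-case hop count from the origin to any node (the torus is symmetric)."""
    origin = (0, 0, 0)
    all_nodes = product(*(range(d) for d in dims))
    return max(torus_hops(origin, node, dims) for node in all_nodes)


dims = (4, 4, 4)  # 64 nodes
print(torus_neighbors((0, 0, 0), dims))
print(torus_hops((0, 0, 0), (3, 2, 1), dims))
print(torus_diameter(dims))
```

In a 4×4×4 torus the diameter is 6 hops, so per-hop switch latency multiplies quickly; that is why a per-hop figure in the tens of nanoseconds matters for rack-scale pooling.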
Marvell Structera S PCIe 60260: Industry's First 260-Lane PCIe 6.0 Switch (OFC 2026, Q3 Sampling)¶
Marvell announced the Structera S PCIe 60260 at OFC 2026 (March 2026): the industry's first 260-lane PCIe 6.0 switch, targeting AI data center scale-up infrastructure. PCIe 6.0 doubles raw bandwidth vs. PCIe 5.0 (64 GT/s per lane, PAM4). At 260 lanes, the 60260 enables high-radix non-blocking fabric topologies for connecting accelerators, memory expanders, and storage in disaggregated AI racks without bandwidth bottlenecks. Engineering test samples were available at OFC; customer sampling begins Q3 2026. Revenue contribution expected in Marvell FY2027. Context within the Marvell portfolio: the 60260 (PCIe 6.0 switching) complements the XConn acquisition (CXL 3.1 switching) — PCIe 6.0 handles high-bandwidth device-to-device I/O fabric, while CXL 3.1 handles coherent memory-semantic operations. Together with the Celestial AI Photonic Fabric and NVIDIA NVLink Fusion integration, Marvell now offers a complete layered interconnect portfolio for next-generation AI clusters.
OCI MSA: AMD, Broadcom, Meta, Microsoft, NVIDIA, OpenAI Form Optical Compute Interconnect Standard (March 12, 2026)¶
OCI MSA (Optical Compute Interconnect Multi-Source Agreement) was formed March 12, 2026 by AMD, Broadcom, Meta, Microsoft, NVIDIA, and OpenAI as an industry consortium to create an open specification for optical interconnects in AI infrastructure. The specs: OCI GEN1 at 200 Gbps/direction (4 × 50 Gbps NRZ); OCI GEN2 at 800 Gbps/fiber. The roadmap targets 3.2 Tbps/fiber. The standard supports pluggable optics, on-board optics (OBO), and co-packaged optics (CPO). No interoperability test results exist yet; the consortium is in the specification publication phase. Strategic significance: six of the most important AI infrastructure companies agree that optical interconnects need a multi-vendor open standard, essentially a response to NVIDIA's proprietary NVLink and Marvell's Photonic Fabric. An open OCI standard reduces vendor lock-in for hyperscaler AI clusters. The combination of OCI MSA (open optical) + CXL 3.2 (open coherent memory) could over time erode the moats of proprietary optical interconnects, but only once products ship and pass interop tests.
Marvell Completes Celestial AI Acquisition — Photonic Fabric™ Joins AI Data Center Stack (February 2, 2026)¶
Marvell completed its acquisition of Celestial AI on February 2, 2026, for $3.25B at closing ($1B cash + $2.25B stock), with earn-out milestones potentially pushing the total to $5.5B. Celestial AI brings Photonic Fabric™ — an optical interconnect technology targeting scale-up connectivity for large-scale AI deployments, specifically addressing the bandwidth and latency limits of electrical NVLink/IB interconnects at multi-rack scale. The strategic picture when combined with the $540M XConn acquisition (CXL switching, January 2026) and the NVIDIA-Marvell $2B NVLink Fusion deal (March 31, 2026): Marvell is now positioned to supply the entire AI data center interconnect stack — from within-node CXL memory pooling (XConn PCIe 6 / CXL 3.1) to rack-scale optical fabric (Celestial Photonic Fabric) to cross-fabric AI Ethernet + NVLink Fusion integration with NVIDIA ecosystems. Counterpoint Research characterizes this as Marvell being "perfectly positioned for the upcoming multi-rack scale-up boom." Revenue timeline: Celestial contributes from H2 FY2028, ramping to $500M annualized by Q4 FY2029 and $1B by Q4 FY2030. The combined Marvell optical + CXL + Ethernet portfolio makes it one of the few companies that can compete with Broadcom across the full AI infrastructure interconnect layer without hardware from a single hyperscaler-captive vendor.
Core Concepts¶
ASIC Design Landscape¶
- Fixed-function ASICs (Broadcom Tomahawk, Trident): Hard-coded forwarding pipelines. Highest performance, lowest power, no flexibility. Dominated the market for 20 years.
- Programmable ASICs / NPUs: Cisco Silicon One, Intel Tofino (P4-programmable), Barefoot Networks heritage. Software-defined forwarding — update behavior without new silicon.
- SmartNICs / DPUs: Nvidia BlueField, Marvell Octeon, Broadcom Stingray. Move host networking, storage, and security offload off the CPU. Running full Linux stacks on-NIC.
- Trend: Line between switch ASIC and SmartNIC is blurring as DPUs gain switching capabilities.
P4 — Programming Protocol-Independent Packet Processors¶
- Match-Action Tables (MAT): Core abstraction. Match packet header fields → execute action (forward, drop, modify, meter, encap).
- Reconfigurable Match Tables (RMT): Physical representation in programmable ASICs — configurable width, depth, and match type.
- P4Runtime API: Standard gRPC-based control plane for populating MAT entries at runtime.
- Use cases: Custom routing protocols, telemetry (INT — In-band Network Telemetry), stateful firewalls, load balancing, SRv6, QUIC offload.
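The match-action abstraction can be sketched in a few lines of Python. This is a software model for intuition only, not P4 code and not how an ASIC implements it: a single longest-prefix-match table keyed on destination IP, with a default action that punts misses to the CPU (the exception path). The class and function names are hypothetical.

```python
import ipaddress


class LpmTable:
    """A toy match-action table: longest-prefix match on dst_ip, then run an action."""

    def __init__(self, default_action):
        self.entries = []  # list of (network, action), longest prefix first
        self.default_action = default_action

    def add_entry(self, prefix, action):
        """Install an entry, as a control plane would at runtime."""
        self.entries.append((ipaddress.ip_network(prefix), action))
        self.entries.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def apply(self, packet):
        """Match the packet against the table and execute the selected action."""
        dst = ipaddress.ip_address(packet["dst_ip"])
        for network, action in self.entries:
            if dst in network:
                return action(packet)
        return self.default_action(packet)


def forward(port):
    """Action factory: set the egress port."""
    def action(packet):
        return {**packet, "egress_port": port}
    return action


def drop(packet):
    return {**packet, "egress_port": None}


def punt_to_cpu(packet):
    return {**packet, "egress_port": "cpu"}


table = LpmTable(default_action=punt_to_cpu)
table.add_entry("10.0.0.0/8", forward(1))
table.add_entry("10.1.0.0/16", forward(2))
table.add_entry("10.66.0.0/16", drop)

print(table.apply({"dst_ip": "10.1.2.3"}))
print(table.apply({"dst_ip": "192.168.0.1"}))
```

The split mirrors the roles listed above: `add_entry` stands in for what a P4Runtime client does, while `apply` stands in for the data plane. Real hardware performs the lookup in TCAM or SRAM in constant time instead of scanning a sorted list.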
DPDK — Data Plane Development Kit¶
- Kernel bypass framework for user-space packet processing at line rate.
- Poll Mode Drivers (PMD) eliminate interrupt overhead — CPU spins polling the NIC ring buffer.
- Memory: Huge pages (2MB/1GB), NUMA-aware DPDK mempool.
- Typical throughput: 10-100 Mpps per core depending on packet size and processing complexity.
- Works with physical NICs (Intel `ixgbe`, Mellanox `mlx5`) and virtual devices (virtio, vhost).
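The dependence on packet size follows from Ethernet framing. Each frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), so small frames mean far more packets per second at the same link speed. The sketch below computes line-rate packets per second and the resulting per-packet cycle budget. The 3 GHz core clock is an assumed example value.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(link_gbps: float, frame_bytes: int) -> float:
    """Maximum packets per second for a given link speed and frame size (incl. FCS)."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def cycles_per_packet(core_ghz: float, pps: float) -> float:
    """CPU cycles available per packet if one core must absorb `pps`."""
    return core_ghz * 1e9 / pps


for frame in (64, 512, 1518):
    pps = line_rate_pps(10, frame)
    budget = cycles_per_packet(3.0, pps)  # assumed 3 GHz core
    print(f"10GbE, {frame}B frames: {pps / 1e6:.2f} Mpps, {budget:.0f} cycles/packet")
```

At 64-byte frames, 10GbE line rate is about 14.88 Mpps, leaving roughly 200 cycles per packet on a 3 GHz core. That budget is why interrupt handling and per-packet allocation are avoided in favor of polling and preallocated mempools.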
PCIe & CXL Interconnects¶
| Gen | Raw Rate (per lane) | x16 bandwidth (per direction) | Key Use |
|---|---|---|---|
| PCIe 4.0 | 16 GT/s | ~32 GB/s | Current GPUs |
| PCIe 5.0 | 32 GT/s | ~64 GB/s | Latest CPUs/GPUs |
| PCIe 6.0 | 64 GT/s (PAM4) | ~128 GB/s | Emerging AI accelerators |
| PCIe 7.0 | 128 GT/s (PAM4) | ~256 GB/s | Spec released 2025; products in development |
- CXL (Compute Express Link): Built on PCIe PHY. Three sub-protocols:
  - CXL.io: PCIe-compatible I/O (devices, config space)
  - CXL.cache: CPU-device cache coherency
  - CXL.mem: CPU accesses device memory with host-managed coherency
- CXL 3.x: Multi-level switching (CXL fabric), peer-to-peer device-to-device coherency, memory pooling across rack.
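The x16 figures in the PCIe table above are raw per-direction numbers: per-lane rate times 16 lanes, divided by 8 bits per byte, treating each GT/s as 1 Gb/s per lane as the headline figures do. The sketch below reproduces them and adds the PCIe 8.0 draft rate of 256 GT/s mentioned elsewhere on this page. It ignores encoding, FLIT, and protocol overhead, so delivered throughput is lower.

```python
def pcie_raw_gbytes_per_s(gt_per_s: float, lanes: int = 16) -> float:
    """Raw per-direction bandwidth in GB/s, ignoring encoding and protocol overhead."""
    return gt_per_s * lanes / 8


GENERATIONS = {
    "PCIe 4.0": 16,
    "PCIe 5.0": 32,
    "PCIe 6.0": 64,
    "PCIe 7.0": 128,
    "PCIe 8.0 (draft)": 256,
}

for gen, rate in GENERATIONS.items():
    one_way = pcie_raw_gbytes_per_s(rate)
    print(f"{gen}: ~{one_way:.0f} GB/s per direction, ~{2 * one_way:.0f} GB/s bidirectional (x16)")
```

For PCIe 8.0 this gives about 512 GB/s per direction, or roughly 1 TB/s when both directions are counted, which is how the "1 TB/s bidirectional" headline figure is reached.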
Optical Networking¶
- 400G/800G ZR/ZR+ coherent: High-capacity DWDM transceivers moving into pluggable (QSFP-DD) form factors. Eliminating dedicated transponder shelves.
- Co-packaged optics (CPO): Optical dies co-packaged with switch ASIC to reduce electrical trace length and power. Intel, Broadcom, and Cisco pursuing this for 51.2T+ switches.
- Silicon Photonics: CMOS-compatible optical components enabling mass production. Key vendors: Intel, Cisco (Acacia), II-VI/Coherent.
SmartNIC / DPU Vendors¶
| Vendor | Product | Notes |
|---|---|---|
| Nvidia | BlueField-3 | Arm Cortex-A78 + ConnectX-7. Dominant in AI/cloud |
| Marvell | Octeon 10 | Arm Neoverse N2 (earlier OCTEON generations used MIPS cores), strong in telco/enterprise |
| Broadcom | Stingray | Arm-based, integrated PCIe switch |
| AMD/Pensando | Elba | Strong storage offload, acquired by AMD |
| Intel | IPU (Mount Evans) | Arm Neoverse N1 cores, OCP contribution |
CXL 3.1 Enters Mainstream Production Deployment — 15–20% Hyperscaler TCO Reduction (2026)¶
By April 2026, CXL has transitioned from experimental to baseline production infrastructure. The defining milestone: CXL 3.1 on PCIe 6.1 physical layer is now the default memory disaggregation fabric in new hyperscale AI clusters, not an option. Linux has fully automated CXL memory enumeration — modern UEFI/BIOS automatically presents CXL-attached DDR as dedicated NUMA nodes with no manual configuration required. Measured impact: CXL-based architectures have reduced hyperscaler total cost of ownership by an estimated 15–20% through disaggregated memory provisioning — the same total memory capacity across fewer, denser compute nodes. The Marvell Structera S 30260 (CXL 3.0, 260-lane, Q3 2026 sampling) is the production switch enabling this; Panmnesia PANSWITCH (PCIe 6.4/CXL 3.2 fusion, H2 2026) is the challenger. The Beluga paper (already in this wiki) provided the killer app: LLM KV-cache disaggregation into CXL-attached pools is the workload that makes the 15–20% TCO figure concrete for AI infrastructure teams. The remaining barrier is CXL 3.x peer-to-peer coherence at multi-host scale — Marvell Structera S 30260 is the first product to enable this in customer sampling.
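On Linux, memory-only NUMA nodes can be spotted through sysfs: each node directory under `/sys/devices/system/node` exposes a `cpulist` file (empty for a node with no CPUs) and a `meminfo` file containing a `MemTotal` line. The sketch below lists nodes that have memory but no CPUs. This is a heuristic, since CPU-less nodes can also be backed by other memory types and it does not prove the memory is CXL-attached. The function name is illustrative.

```python
import re
from pathlib import Path


def cpuless_numa_nodes(node_root="/sys/devices/system/node"):
    """Return {node_id: MemTotal_kB} for NUMA nodes with memory but no CPUs."""
    result = {}
    root = Path(node_root)
    if not root.is_dir():
        return result  # not Linux, or sysfs not mounted
    for node_dir in sorted(root.glob("node[0-9]*")):
        cpulist_file = node_dir / "cpulist"
        meminfo_file = node_dir / "meminfo"
        if not (cpulist_file.is_file() and meminfo_file.is_file()):
            continue
        if cpulist_file.read_text().strip():
            continue  # node has CPUs, so it is an ordinary socket-local node
        match = re.search(r"MemTotal:\s+(\d+)\s+kB", meminfo_file.read_text())
        if match and int(match.group(1)) > 0:
            result[int(node_dir.name[4:])] = int(match.group(1))
    return result


if __name__ == "__main__":
    for node_id, mem_kb in cpuless_numa_nodes().items():
        print(f"node{node_id}: {mem_kb / 1024 / 1024:.1f} GiB, no CPUs")
```

Workloads such as KV-cache offload can then bind allocations to those nodes using the platform's usual NUMA policy tools.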
Open Questions¶
- Will P4 become the universal data plane programming model, or will vendor-specific DSLs persist due to hardware-specific optimizations?
- How does CXL memory pooling change the economics of AI training clusters — can shared CXL memory replace HBM for some workloads?
- CPO is not field-replaceable — how will hyperscalers manage laser failure in deployed CPO switches? Will they maintain optical-module spares at the linecard level?
- OCI-MSA v1.0 spec is now published at oci-msa.org — next milestone: first multi-vendor interoperability test results between member products. Does OCI-MSA supersede OIF's CPO mechanical specs or remain complementary?
- How do SmartNICs/DPUs change the threat model? Running full Linux on the NIC creates a new attack surface adjacent to the hypervisor.
- Broadcom BCM83640 3nm: when will first commercial 1.6T module shipments using this DSP occur? Which module vendors are first? Late 2026 mass production is the stated target.
- HyperLight TFLN 400G/lane PIC: when will the UMC 8-inch HVM line ship first commercial volume? Any announced module integrator beyond the TFC reference design?
- NVIDIA Spectrum-X Photonics H2 2026 ship confirmation: which infrastructure vendors (Dell, HPE, Supermicro) have confirmed availability dates?
- Open CPX MSA: will it produce a published spec before end-2026, or remain a positioning framework without binding mechanical/electrical specs?
- XPO MSA at 100 partners (April 6, 2026) with 40+ optical module vendors committed. Will first XPO-compliant module samples appear at OCP Summit 2026 or OFC 2027? What is the first OEM switch platform to integrate XPO chassis slots?
- OCI-MSA v1.0 spec is published at oci-msa.org, but there are no formal multi-vendor interoperability test results yet. When will the first test event be announced, and when will its results be published?
- Broadcom Project Glasswing: the 200 ns latency overhead claim is Broadcom's own figure. What does independent testing show? Does the P4-extended programming model impose any ABI compatibility constraints that would slow ecosystem adoption?