Description

Lenovo 4X67A82325 AMD MI210 4-Way GPU Bridge

Overview

The Lenovo 4X67A82325 is a passive, full-height/full-length GPU bridge designed to link up to four AMD Instinct MI210 accelerators in a single server chassis via AMD Infinity Fabric Link. Rather than treating each card as an island of compute, the bridge gives the GPUs a direct GPU-to-GPU path, so peer traffic no longer has to cross the PCIe 4.0 fabric and compete with host transfers. That is a meaningful distinction in training workloads where inter-GPU data movement is a recurring bottleneck. If you are building a dense AI training node or a multi-GPU inferencing chassis on AMD's ROCm software stack, the 4X67A82325 is the interconnect piece that lets four MI210 cards behave as a tightly coupled compute unit rather than four independent accelerators.

Key Features

4-Way Infinity Fabric Link: Supports up to four MI210 GPUs in a bridged topology, enabling direct peer memory access without routing through the host CPU. In large-model training, this avoids the CPU-mediated transfer overhead that compounds across gradient synchronization steps: all-reduce operations generally complete faster when the path is GPU-to-GPU over fabric rather than GPU-to-PCIe-to-GPU.
64 GB HBM2E per GPU: Each MI210 paired through this bridge carries 64 GB of High Bandwidth Memory 2E. At 64 GB per card and four cards per node, a fully bridged chassis presents 256 GB of on-accelerator memory in aggregate. A model still has to be sharded across the four GPUs, but that is enough to keep large language model parameter sets inside one node that would otherwise require model parallelism across multiple nodes.
Maximum Memory Bandwidth of 1.6 TB/s per GPU (about 6.4 TB/s summed across four GPUs): HBM2E's stacked architecture delivers memory bandwidth at a scale that traditional GDDR6 cannot match at equivalent capacity. For memory-bound workloads such as transformer attention layers and large embedding lookups, this bandwidth figure determines whether compute units stay fed or stall waiting on data.
45 TFLOPS FP32 Vector / 23 TFLOPS FP32 Matrix per GPU: The MI210 distinguishes between vector and matrix operations: 45 TFLOPS for general-purpose FP32 vector compute, 23 TFLOPS for FP32 matrix operations. Most deep learning frameworks route dense linear algebra through matrix paths, so the 23 TFLOPS figure is the relevant number for transformer-based training throughput — not the higher vector headline.
Matched FP64 Performance (45 TFLOPS vector / 23 TFLOPS matrix): The MI210 delivers identical FP64 throughput to its FP32 figures — a design decision oriented toward HPC and scientific compute workloads where double-precision accuracy is non-negotiable. If your workload is mixed-precision training only, this parity represents headroom you may not use; if you are running climate modeling, seismic processing, or computational fluid dynamics alongside AI workloads, it matters significantly.
6,656 Stream Processors across 104 Compute Units: The MI210's compute array is built for sustained parallel throughput rather than burst clock performance. At 104 compute units with 6,656 stream processors, the card targets workloads that keep the full array occupied — large batch sizes, dense matrix multiplications, and parallel inference across many simultaneous requests.
ECC Memory Support: HBM2E ECC is enabled on the MI210, which matters in long-running training jobs where a single undetected memory error can corrupt a checkpoint and waste hours of GPU time. In HPC environments where bit-error rates are a compliance concern, ECC is often a hard requirement.
PCIe 4.0 x16 Host Interface: Each MI210 connects to the host via PCIe 4.0 x16, providing approximately 64 GB/s of bidirectional bandwidth (about 32 GB/s in each direction) to the host CPU, while the bridge carries the GPU-to-GPU traffic. PCIe 4.0 is available on recent AMD EPYC and Intel Xeon server platforms. Verify that your chassis and CPU generation support PCIe 4.0 before committing, because PCIe 3.0 hosts will negotiate down and reduce host-to-GPU transfer throughput.
Passive Cooling: The 4X67A82325 bridge is passively cooled, meaning it depends entirely on chassis airflow for thermal management. This is standard for high-density GPU servers with active rear-exhaust cooling, but it is a constraint: deploy only in systems with validated airflow for the MI210 thermal envelope. Passive-only means no self-regulation — insufficient airflow leads to thermal throttling at the accelerator level.
ROCm 5.0 Software Platform: The MI210 is validated on AMD's ROCm 5.0 open compute platform. ROCm provides HIP (Heterogeneous-compute Interface for Portability) as its CUDA-like programming model, with support for PyTorch, TensorFlow, JAX, and major HPC libraries. If your team's existing code is CUDA-native, budget for a HIP porting effort: ROCm compatibility has improved substantially but is not a drop-in equivalent for every library or custom kernel.
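As a rough illustration of the gradient synchronization mentioned in the first feature above, the sketch below simulates a ring all-reduce in plain Python, with lists standing in for GPU buffers. It is a teaching model under stated assumptions, not a description of how ROCm's collective library is implemented: the function name `ring_all_reduce` is hypothetical, and real hardware moves each chunk over a physical link rather than copying list slices.

```python
def ring_all_reduce(buffers):
    """Sum equal-length lists across ranks with a simulated ring all-reduce.

    `buffers[r]` plays the role of the gradient buffer on rank r.
    Returns a new list of per-rank buffers; the inputs are not modified.
    """
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers) or length % n != 0:
        raise ValueError("buffers must share a length divisible by the rank count")
    chunk = length // n
    data = [list(b) for b in buffers]

    def span(c):
        # Index range covered by chunk c.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in each of n-1 steps every rank sends one
    # chunk to its right-hand neighbour, which adds it to its own copy.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((c, [data[r][i] for i in span(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = sends[r]
            for offset, i in enumerate(span(c)):
                data[dst][i] += payload[offset]

    # Rank r now holds the complete sum for chunk (r + 1) % n.
    # Phase 2, all-gather: completed chunks travel around the ring and
    # overwrite the partial sums on every other rank.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((c, [data[r][i] for i in span(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = sends[r]
            for offset, i in enumerate(span(c)):
                data[dst][i] = payload[offset]

    return data
```

Each rank sends and receives only one chunk per step, so the total time is governed by the speed of the link between neighbouring GPUs, which is the path a bridge of this kind is meant to shorten.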

Integration & Compatibility

The 4X67A82325 bridge is designed for deployment in Lenovo ThinkSystem servers validated for the AMD Instinct MI210 accelerator. The full-height/full-length (FH/FL) form factor requires a chassis with adequate PCIe slot depth and physical clearance for the MI210 card complement. Verify the specific ThinkSystem platform compatibility matrix before ordering, because GPU bridge interoperability is chassis-specific, not universal across all PCIe 4.0 servers. ROCm 5.0 is the validated software stack; later ROCm releases may add support, but confirm against the platform release notes before upgrading. For teams building datacenter compute infrastructure on AMD's open-source GPU ecosystem, the MI210 with this bridge is the Lenovo-validated path to multi-GPU scale-up within a single node.

Frequently Asked Questions

Q: What is the purpose of the 4X67A82325 bridge — is it required for multi-GPU operation?

A: The 4X67A82325 provides AMD Infinity Fabric Link interconnect between up to four MI210 GPUs in a single chassis. Without it, MI210 cards communicate through the PCIe fabric via the host CPU, which is slower and less efficient for workloads that require frequent inter-GPU data exchange, such as distributed training. For applications where GPUs operate independently (embarrassingly parallel inference), the bridge is less critical; for tightly coupled training, it is the recommended configuration.

Q: Does the MI210 with this bridge support FP64 double-precision compute?

A: Yes. The AMD MI210 delivers 45 TFLOPS FP64 vector performance and 23 TFLOPS FP64 matrix performance — matching its FP32 figures. This makes the 4-way bridged MI210 configuration suitable for HPC workloads requiring double-precision accuracy, not just mixed-precision AI training.

Q: Is ECC memory supported?

A: Yes. The MI210's HBM2E memory supports ECC (Error-Correcting Code), which detects and corrects single-bit memory errors during operation. This is important for long training runs and HPC workloads where undetected memory errors can corrupt results or checkpoints.

Q: What software stack does the MI210 run on?

A: The validated platform is AMD ROCm 5.0, which supports PyTorch, TensorFlow, JAX, and major HPC libraries through the HIP programming model. Teams migrating from CUDA-based workflows should evaluate HIP compatibility for their specific kernels and libraries before deployment.
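One way to start that evaluation is to ask the installed PyTorch build what it sees. The sketch below assumes PyTorch is the framework in use and relies on `torch.version.hip` and the `torch.cuda` device calls, which ROCm builds of PyTorch reuse for AMD GPUs. The function name `describe_rocm_environment` and the report keys are hypothetical. A `True` peer-access result says that two devices can address each other directly, not which link carries the traffic.

```python
def describe_rocm_environment():
    """Summarise the local PyTorch/ROCm setup as a plain dict.

    Returns {"torch_installed": False} when PyTorch is not importable,
    so the check can run on a machine that has not been imaged yet.
    """
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    gpu_count = torch.cuda.device_count() if torch.cuda.is_available() else 0
    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # A version string on ROCm builds of PyTorch, None on other builds.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_count": gpu_count,
        "peer_access": {},
    }

    # Record whether each ordered pair of visible GPUs reports peer access.
    for src in range(gpu_count):
        for dst in range(gpu_count):
            if src != dst:
                report["peer_access"][(src, dst)] = (
                    torch.cuda.can_device_access_peer(src, dst)
                )
    return report


if __name__ == "__main__":
    for key, value in describe_rocm_environment().items():
        print(f"{key}: {value}")
```

On a four-GPU node, a complete report would contain twelve peer-access entries; any pair reporting `False` is worth investigating before running distributed training.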

Q: Does the 4X67A82325 have active cooling?

A: No — this bridge is passively cooled. It requires chassis airflow from the server system to maintain safe operating temperatures for the MI210 accelerators. Confirm your target server platform has validated airflow for the MI210 thermal requirements before deployment.

Q: What PCIe generation does the 4X67A82325 require?

A: The interface is PCI Express x16 Gen 4.0. Deploying in a PCIe 3.0 host will result in link negotiation to the lower generation, reducing host-to-GPU bandwidth. For full performance, the host server must support PCIe 4.0.
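The size of that reduction follows from the link arithmetic. The sketch below computes theoretical one-direction bandwidth from the per-lane transfer rate (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b line encoding both generations use. The function name is hypothetical, and the result is an upper bound before protocol overhead, not a measured throughput.

```python
# Per-lane signalling rate in gigatransfers per second, by PCIe generation.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}

# PCIe 3.0 and 4.0 both carry 128 payload bits in every 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation, lanes=16):
    """Theoretical bandwidth in GB/s for one direction of a PCIe link."""
    gigabits_per_s = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # bits to bytes


if __name__ == "__main__":
    for gen in (3, 4):
        one_way = pcie_bandwidth_gb_s(gen)
        print(f"PCIe {gen}.0 x16: {one_way:.1f} GB/s per direction, "
              f"{2 * one_way:.1f} GB/s bidirectional")
```

A PCIe 4.0 x16 link works out to roughly 31.5 GB/s per direction, and a PCIe 3.0 x16 link to half of that.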

Expert Analysis

James Everett

The 4X67A82325 is the piece most people overlook when spec'ing a multi-MI210 node: the bridge that determines whether you have four independent 45 TFLOPS FP32 compute islands or a coherent scale-up fabric. The Infinity Fabric Link it provides is what makes all-reduce operations in distributed training efficient at the node level. Inter-GPU traffic goes through the fabric, not through PCIe and back to the host, which helps keep gradient synchronization from becoming a bottleneck as models grow.
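The effect on gradient synchronization can be estimated with the standard bandwidth-only cost model for a ring all-reduce, in which each GPU sends 2(N-1)/N times the payload. The sketch below is a back-of-the-envelope model under stated assumptions: it ignores latency and overlap with compute, the collective library may choose a different algorithm, and the function name and the bandwidth values passed in are placeholders rather than measured or published figures for this bridge.

```python
def ring_all_reduce_seconds(payload_bytes, num_gpus, link_bytes_per_s):
    """Bandwidth-only time estimate for one ring all-reduce.

    payload_bytes:    size of the buffer being reduced on each GPU
    num_gpus:         number of GPUs in the ring
    link_bytes_per_s: usable bandwidth of each GPU-to-GPU hop (assumed)
    """
    if num_gpus < 2:
        return 0.0  # nothing to synchronise with a single GPU
    # Reduce-scatter and all-gather each move (N-1)/N of the payload per GPU.
    bytes_sent_per_gpu = 2 * (num_gpus - 1) / num_gpus * payload_bytes
    return bytes_sent_per_gpu / link_bytes_per_s


if __name__ == "__main__":
    gradients = 2 * 10**9  # hypothetical 2 GB of gradients per step
    # Placeholder bandwidths chosen only to show how the estimate scales.
    for label, bandwidth in (("slower path", 25e9), ("faster path", 100e9)):
        seconds = ring_all_reduce_seconds(gradients, 4, bandwidth)
        print(f"{label}: {seconds * 1000:.0f} ms per all-reduce")
```

Because the estimate is inversely proportional to link bandwidth, substituting measured numbers for your own host path and fabric path shows directly how much synchronization time per step is at stake.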

Technical Highlights:

64 GB HBM2E per GPU: At four cards per node, you have 256 GB of accelerator-local memory — enough to stage large model parameter sets without offloading to host DRAM or NVMe swap, which would crater training throughput.
Matched FP32/FP64 throughput: 45 TFLOPS vector and 23 TFLOPS matrix in both precisions means this platform serves dual-purpose deployments — AI training at FP32/BF16 mixed precision during the week, HPC simulation runs at full FP64 without re-provisioning hardware.
Passive thermal design: Zero moving parts on the bridge itself, but that does not excuse poor chassis airflow planning. The MI210 thermal envelope is real, so validate server airflow before committing to a chassis, not after the cards are seated.
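The 256 GB figure in the first highlight can be turned into a quick sizing check. The sketch below assumes decimal gigabytes and counts model weights only, so gradients, optimizer state, and activations would all need additional room. The function name `weights_fit` and the 70-billion-parameter example are illustrative, not taken from the listing.

```python
GB = 10**9  # decimal gigabyte, an assumption about how the 64 GB is counted


def weights_fit(num_params, bytes_per_param, gpus=4, gb_per_gpu=64):
    """Compare the size of a model's weights with aggregate GPU memory.

    Returns (bytes_needed, bytes_available, fits). Weights only: training
    also needs memory for gradients, optimizer state, and activations.
    """
    bytes_needed = num_params * bytes_per_param
    bytes_available = gpus * gb_per_gpu * GB
    return bytes_needed, bytes_available, bytes_needed <= bytes_available


if __name__ == "__main__":
    params = 70 * 10**9  # hypothetical 70-billion-parameter model
    for label, width in (("16-bit weights", 2), ("32-bit weights", 4)):
        needed, available, fits = weights_fit(params, width)
        print(f"{label}: {needed / GB:.0f} GB needed, "
              f"{available / GB:.0f} GB available, fits={fits}")
```

The check treats the four cards as one pool for sizing purposes only; in practice the weights are sharded so that no single GPU is asked to hold more than its own 64 GB.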

Deployment Considerations:

ROCm 5.0 is the validated software platform; confirm library compatibility for your specific PyTorch or TensorFlow version before imaging nodes, as ROCm minor-version mismatches can cause performance degradation in distributed training configurations that is easy to miss.
PCIe 4.0 x16 is a hard requirement for full bandwidth — a PCIe 3.0 host will link-negotiate down and you will see host-to-GPU transfer throughput cut roughly in half, which surfaces as a data-loading bottleneck in I/O-heavy training pipelines.

This configuration fits one scenario precisely: a single-node AMD-native AI training server where the team is committed to ROCm, needs 256 GB of aggregate accelerator memory for large-model work, and requires FP64 parity for mixed HPC/AI workloads. It is not aimed at a general-purpose GPU cluster build where CUDA ecosystem breadth is the deciding factor.

Specifications

Weight: 1.00 lb

UNSPSC Code: 43201503

Product type: 4-way graphics card bridge

Cooling type: Passive

Interface type: PCI Express x16 4.0

Compute units: 104

Stream processors: 6656

Discrete graphics card memory: 64 GB

Graphics card memory type: High Bandwidth Memory 2E (HBM2E)

ECC: Yes

Memory bandwidth (max): 1600 GB/s

Infinity Fabric Link: Yes

Number of GPUs supported: 4

Form factor: Full-Height/Full-Length (FH/FL)

Product colour: Black

Software platform: ROCm 5.0

Peak Single Precision Matrix (FP32) performance: 23 TFLOPS

Peak Double Precision Matrix (FP64) performance: 23 TFLOPS

Peak Single Precision Vector (FP32) performance: 45 TFLOPS

Peak Double Precision Vector (FP64) performance: 45 TFLOPS


Lenovo 4X67A82325 AMD MI210 4X Link BRG

$2,631.99

Usually Ships in 2-3 Weeks

Condition: New

UPC: 889488634415

Availability: Usually Ships in 2-3 Weeks


