Product images are provided for reference and may not represent the exact model, configuration, or included components.

Overview

SKU: 900-5G153-2200-000-01
Condition: New
Write a Review

NVIDIA 900-5G153-2200-000-01 RTX PRO 6000 Blackwell Max-q Workstation Edition - Bulk

NVIDIA 900-5G153-2200-000-01 RTX PRO 6000 Blackwell Workstation GPU Overview The NVIDIA 900-5G153-2200-000-01 is a professional-grade datacenter acce…

$12,172.99
Ships same business day
In stock

Quantity:

Adding to cart… The item has been added
Compatibility guidance available for your deployment
Senior specialists for pre and post-sales support
Authorized sourcing and documentation support
Shipping and lead-time confirmation before install

Laura Bennett, IPSD Senior Specialist

Talk to Laura

200+ hrs training • U.S - based

Senior Specialist • 877-277-7147

NVIDIA 900-5G153-2200-000-01 RTX PRO 6000 Blackwell Max-q Workstation Edition - Bulk

$12,172.99

Overview

SKU: 900-5G153-2200-000-01
Condition: New

No Bots, Just Experts

Questions about this product? Free pre-sales support from a senior specialist — product questions, compatibility checks, BOM quotes, price confirmation — typically answered within one business day. Need camera placement or system design work? Engineering time is $175 per hour (qty 1 = 1 hour). Hardware buyers get up to one hour ($175) credited back on their order.

Description

NVIDIA 900-5G153-2200-000-01 RTX PRO 6000 Blackwell Workstation GPU

Overview

The NVIDIA 900-5G153-2200-000-01 is a professional-grade datacenter accelerator built on the Blackwell architecture, engineered for compute-intensive workloads in AI inference, 3D rendering, data processing, and scientific simulation. With 24,064 CUDA cores, fifth-generation Tensor cores, and 96 GB of GDDR7 memory, this dual-slot, full-height card integrates into enterprise-class x16 PCIe 5.0 systems where throughput and memory capacity directly drive ROI on model deployment and batch-processing pipelines.

Key Features

  • 24,064 CUDA Cores + 5th Gen Tensor Cores: Massive parallel compute capacity designed for GPU-accelerated AI training and inference—each core multiplies by the number of simultaneous inference requests your pipeline can handle without queue buildup.
  • 96 GB GDDR7 Memory with 1,792 GB/s Bandwidth: Large on-device memory footprint eliminates frequent host-device transfers, cutting inference latency in real-time video analytics or autonomous systems. The 512-bit memory interface sustains high-bandwidth operations required for vision transformers and large language model token generation.
  • 110 TFLOPS Single Precision Performance: Delivers deterministic throughput for FP32 workloads—critical when your SLA demands predictable latency per video frame or data-processing batch.
  • 333 TFLOPS RT Core Performance (4th Gen Ray Tracing): Accelerates ray-traced rendering pipelines and physics simulation; if your deployment includes synthetic data generation for computer vision training, this spec directly reduces per-frame rendering time.
  • 3,511 AI TOPS: Peak AI performance metric reflecting combined FP8 and lower-precision operations—the actual throughput available when running quantized or sparsity-optimized inference models in production.
  • 4x NVENC + 4x NVDEC Video Engines: Hardware video encoding and decoding at scale—allows simultaneous transcoding of multiple video streams without consuming CUDA cores. If you're running a surveillance or broadcast application ingesting dozens of RTSP streams, each stream can be re-encoded independently without GPU compute contention.
  • PCIe 5.0 x16 Interface: 128 GB/s bidirectional host bandwidth eliminates the CPU-GPU bottleneck common in PCIe 4.0 systems, reducing time spent moving datasets to/from system memory.
  • 4x DisplayPort 2.1b Outputs + > 4x 4096×2160 @ 120 Hz Display Support: Supports high-resolution workstation displays and multi-monitor visualization workflows without requiring additional IO cards.
  • MIG (Multi-Instance GPU) up to 4x 24 GB Partitions: Slice the GPU into isolated compute instances, allowing multiple users or workloads to run simultaneously with guaranteed QoS—essential in shared infrastructure where one job shouldn't starve another.
  • 300W TDP with Single PCIe CEM5 16-pin Power Connector: Moderate power draw compared to dual-slot footprint; most enterprise server PSUs already supply sufficient power. No exotic cooling requirements beyond standard datacenter airflow.
  • Active Thermal Solution: Integrated active cooling (onboard fan) maintains safe operating temperatures without requiring water loops, reducing installation complexity and maintenance overhead.
  • Full API Support: CUDA 12.8, OpenCL 3.0, DirectCompute, DirectX 12, OpenGL 4.6, Vulkan 1.3—wide compatibility with existing inference frameworks (TensorRT, PyTorch, JAX) and graphics applications.

Integration & Compatibility

Fits into any standard x16 PCIe 5.0 slot in enterprise-class servers (Dell PowerEdge, HPE ProLiant, Lenovo ThinkSystem, Supermicro). Works natively with major AI inference frameworks and graphics APIs. Requires x16 electrical connectivity (8-pin + 16-pin CEM5 connector supplies 300W). Compatible with standard enterprise monitoring tools for GPU power and thermal telemetry via NVIDIA Management Library (NVML) and NVIDIA Data Center GPU Manager (DCGM).

Frequently Asked Questions

Q: What workloads is the 900-5G153-2200-000-01 optimized for?

A: The RTX PRO 6000 Blackwell is engineered for high-precision AI inference, 3D rendering, scientific simulation, and data-center workloads where large memory capacity (96 GB) and deterministic compute performance matter more than peak gaming-class throughput.

Q: Can I use the 900-5G153-2200-000-01 in a multi-GPU configuration?

A: Yes. Multiple units can be installed in a single server (if available slots and power supply allow) or clustered across servers via high-speed interconnects. Refer to your server's PCIe slot topology and BIOS settings.

Q: What is the memory bandwidth advantage of 1,792 GB/s?

A: That bandwidth ensures model weights and activations move quickly between GPU memory and compute cores. For large language models or vision transformers, this reduces the memory-access penalty on per-token latency.

Q: Does the 900-5G153-2200-000-01 support NVIDIA's CUDA ecosystem?

A: Yes. Full CUDA 12.8 support with NVIDIA's complete toolkit (cuDNN, cuBLAS, CUTLASS, TensorRT) for optimized inference and training frameworks (PyTorch, TensorFlow, JAX).

Q: What is the power requirement for the 900-5G153-2200-000-01?

A: 300 W maximum. Requires one 16-pin PCIe CEM5 power connector. Typical enterprise server PSUs (1000W+) supply more than enough capacity for a single GPU.

Q: Can I partition the 900-5G153-2200-000-01 across multiple users?

A: Yes, using NVIDIA MIG. Supports up to 4 independent 24 GB GPU instances, allowing separate workloads or users to run in isolation with guaranteed resource allocation.

Ted Perry
Ted Perry

When I spec the 900-5G153-2200-000-01 into a compute cluster, the 96 GB GDDR7 footprint is the deciding factor. That memory capacity eliminates the constant page-faulting you see in smaller-GPU setups, and paired with the 1,792 GB/s bandwidth, it means your inference latency stays predictable even under sustained load. The Blackwell architecture delivers what matters for production: deterministic performance, not peak theoretical numbers.

Technical Highlights:

  • 24,064 CUDA cores with 5th-gen Tensor cores: Massive parallel processing—each core contributes to throughput, so you can run higher concurrent inference requests or batch larger datasets without latency creep. The tensor cores accelerate matrix operations, the core of modern AI workloads.
  • 1,792 GB/s memory bandwidth (512-bit interface): Ensures GPU cores aren't starved waiting for data. For vision transformers processing high-resolution frames or LLMs generating tokens, this bandwidth directly cuts inference time per request.
  • 4x NVENC + 4x NVDEC engines: Independent video processing units mean you can transcode or re-encode multiple streams without consuming a single CUDA core. In a surveillance or broadcast ingest pipeline, this is how you stay cost-efficient.
  • MIG (Multi-Instance GPU) support—up to 4x 24 GB partitions: Slice one GPU across teams or tenants with guaranteed isolation and QoS. No job interference, no noisy neighbor, no manual load-balancing headaches.

Deployment Considerations:

  • Requires PCIe 5.0 x16 electrical support—older PCIe 4.0 servers will work electrically but won't realize the full bandwidth advantage. Verify your server BIOS supports PCIe 5.0 negotiation before speccing.
  • 300W TDP is moderate for the compute density, but sustained load requires adequate cooling airflow. Install in enterprise-class racks with front-to-back airflow; dense clusters may need additional intake planning.
  • CUDA 12.8 ecosystem is mature, but your inference framework (TensorRT, PyTorch, etc.) must be built against CUDA 12.x. Older CUDA 11.x codebases need recompilation.

Deploy the 900-5G153-2200-000-01 when your requirement is sustained, predictable inference throughput—not peak speed. Multi-tenant environments, large-batch document processing, and real-time video analytics are where this card earns its cost. Avoid it if your workload is latency-sensitive, single-request AI (where a smaller, lower-latency GPU may be better), or if you're constrained to PCIe 4.0 infrastructure.

Specifications
Gpu Architecture: NVIDIA Blackwell
Cuda Cores: 24,064
Tensor Cores: 5th Generation
Ray Tracing Cores: 4th Generation
Ai Tops: 3511
Single Precision Performance: 110 TFLOPS
Rt Core Performance: 333 TFLOPS
Gpu Memory: 96 GB GDDR7
Memory Interface: 512-bit
Memory Bandwidth: 1792 GB/s
System Interface: PCIe 5.0 x16
Display Connectors: 4x DisplayPort 2.1b
Max Simultaneous Displays: > 4x 4096 x 2160 @ 120 Hz
Video Engines: 4x NVENC, 4x NVDEC
Mig Instance Types: Up to 4x 24 GB
Total Board Power: 300 W
Power Connector: 1x PCIe CEM5 16-pin
Thermal Solution: Active
Form Factor: Dual slot, full height
Graphics Apis: Directx 12, Shader Model 6.6, OpenGL 4.6, Vulkan 1.3
Compute Apis: CUDA 12.8, OpenCL 3.0, DirectCompute
Q&A
Reviews
Have Questions?

RELATED PRODUCTS

System Design, Deployment & Technical Support

Support services and planning resources for commercial surveillance, access control, and infrastructure deployments.

Fixed scope • Fixed price

System Design Assistance

  • Get help validating product compatibility
  • Coverage requirements
  • Storage planning and deployment architecture before you buy.
Request Design Help

Deployment & Configuration Support

  • Access fixed-scope support for rollout planning
  • User setup guidance
  • Migration and system standardization across single-site or multi-site deployments
View Support Services

Guides, Tools & Calculators

  • PoE requirements
  • Storage retention
  • Camera selection and deployment methodology
Open Technical Resources