Product images are provided for reference and may not represent the exact model, configuration, or included components.

Overview

SKU: VCNRTXPRO6000B-PB
UPC: 751492798172
Condition: New
PNY VCNRTXPRO6000B-PB NVIDIA Blackwell Architecture, 24,064 CUDA Cores, 752 Tensor Cores, 188 RT Cores

Price: $14,998.99 (list price $14,999.00)
Ships same business day
In stock

Compatibility guidance available for your deployment
Senior specialists for pre and post-sales support
Authorized sourcing and documentation support
Shipping and lead-time confirmation before install

Laura Bennett, IPSD Senior Specialist

200+ hrs training • U.S.-based

Senior Specialist • 877-277-7147

No Bots, Just Experts

Questions about this product? Free pre-sales support from a senior specialist — product questions, compatibility checks, BOM quotes, price confirmation — typically answered within one business day. Need camera placement or system design work? Engineering time is $175 per hour (qty 1 = 1 hour). Hardware buyers get up to one hour ($175) credited back on their order.

Description

PNY VCNRTXPRO6000B-PB Blackwell GPU Accelerator

Overview

The PNY VCNRTXPRO6000B-PB is a high-end GPU accelerator built on NVIDIA's Blackwell architecture, delivering 120 TFLOPS of single-precision compute and up to 4 PFLOPS peak FP4 AI performance. With 96GB of GDDR7 memory and 1597 GB/s memory bandwidth, this dual-slot card suits real-time surveillance analytics, AI inference, and multi-stream video processing at scale. The 4x NVENC/NVDEC engines encode and decode video in dedicated hardware, which keeps codec work off the host CPU and leaves the 752 Tensor Cores free for inference. That matters when a surveillance operations center is processing dozens of camera feeds simultaneously.

Key Features

  • 120 TFLOPS FP32 Compute: Handles real-time object detection, person counting, and multi-object tracking across multiple concurrent video streams. With this much compute, inference is unlikely to be the bottleneck for typical detection models even on high-resolution sources, so analytics results can arrive within a few frame intervals rather than seconds.
  • 4 PFLOPS Peak FP4 AI Performance: Optimized for low-precision AI workloads (FP4 quantized models). On paper, peak FP4 throughput is roughly 33x the card's FP32 figure (4 PFLOPS vs. 120 TFLOPS); real gains depend on the model and on how well it tolerates quantization. Lower precision also means a smaller model footprint, less memory bandwidth pressure, and room to run more concurrent inference tasks on one card.
  • 96GB GDDR7 Memory with 1597 GB/s Bandwidth: Eliminates memory bottlenecks when batching multiple video streams or running large language models for metadata enrichment. 96GB is enough to hold dozens of video frames plus a full AI model without constant host-system memory transfers. 1597 GB/s bandwidth keeps data flowing to compute cores at line rate.
  • 752 Tensor Cores and 188 RT Cores: Tensor Cores accelerate matrix operations in neural networks (convolutions, attention layers, fully-connected layers). RT Cores handle ray-tracing and rendering tasks and are not used by typical inference workloads. For video analytics, Tensor Core acceleration can cut inference latency by roughly 5–10x versus CPU-only processing, depending on the model.
  • 4x NVENC and 4x NVDEC Engines: Four hardware encoders and four hardware decoders run independently of the CUDA cores. As a baseline, that is four 4K streams or eight 1080p streams encoded to H.264/H.265 simultaneously, and at least four video sources decoded in parallel; each engine can also be time-shared across more streams, depending on resolution and frame rate. Codec work stays off the main GPU pipeline, preserving compute capacity for analytics and inference tasks.
  • PCI Express 5.0 x16 and 512-bit Memory Interface: PCIe 5.0 x16 provides about 64 GB/s in each direction (about 128 GB/s bidirectional), double PCIe 4.0 x16. That reduces the risk of a host-to-GPU bottleneck in high-throughput surveillance deployments and speeds GPU-to-GPU transfers in multi-GPU clusters. The 512-bit memory interface is a separate, on-card path: it is what delivers the 1597 GB/s of GDDR7 bandwidth.
  • Up to 4 Multi-Instance GPU (MIG) Partitions: Slice this single physical card into up to four logically independent GPUs, each with its own memory and compute resources. Deploy four separate analytics pipelines (different models, different customers) on one card without kernel-level collision. Ideal for multi-tenant surveillance platforms or when you want hard isolation between inference workloads.
  • 600W Maximum Power Consumption and Passive Cooling: The card has no fan of its own and relies entirely on chassis airflow, so make sure your server provides continuous airflow across the dual-slot heatsink. Power, up to 600W, is delivered through a single PCIe CEM5 16-pin connector. Pair a single card with a server-class PSU rated 1600W or higher, and budget 2000W or more for dual-GPU surveillance server configurations.
  • 4x DisplayPort 2.1 Connectors: Independent display outputs allow remote monitoring or operator dashboards direct from the GPU, though in a surveillance-server deployment these are typically unused (headless operation). DisplayPort 2.1 supports up to 80 Gbps of link bandwidth per port, enabling multi-4K monitor setups if needed for training, annotation, or model validation workflows.
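
Several figures in the feature list follow from the listed specifications by simple arithmetic. The Python sketch below reproduces them. It uses only numbers from this page, plus three assumptions that are not part of the listing: decimal gigabytes (10^9 bytes), an uncompressed 8-bit RGB 4K frame (3840 x 2160, 3 bytes per pixel), and standard PCIe 5.0 signalling (32 GT/s per lane, 128b/130b encoding). All results are datasheet-level peaks, not measured throughput.

```python
# Back-of-envelope checks on the listed specifications.
FP32_TFLOPS = 120.0           # listed single-precision performance
FP4_PFLOPS = 4.0              # listed peak FP4 AI performance
MEMORY_GB = 96                # listed GPU memory
MEM_BANDWIDTH_GB_S = 1597.0   # listed memory bandwidth
MEM_BUS_BITS = 512            # listed memory interface width

# Peak FP4 vs. FP32 ratio (a paper figure, not a measured speedup).
fp4_to_fp32 = FP4_PFLOPS * 1000.0 / FP32_TFLOPS

# Assumption: one uncompressed 8-bit RGB 4K frame, decimal gigabytes.
frame_bytes = 3840 * 2160 * 3
frames_in_memory = (MEMORY_GB * 10**9) // frame_bytes

# Per-pin data rate implied by the bandwidth and bus width.
per_pin_gbit_s = MEM_BANDWIDTH_GB_S * 8 / MEM_BUS_BITS

# Assumption: PCIe 5.0 at 32 GT/s per lane with 128b/130b encoding.
pcie5_x16_gb_s_per_direction = 32 * 16 * (128 / 130) / 8

print(f"FP4/FP32 peak ratio:        {fp4_to_fp32:.1f}x")
print(f"Raw 4K RGB frame:           {frame_bytes / 1e6:.1f} MB")
print(f"Raw 4K frames in 96 GB:     {frames_in_memory}")
print(f"Implied GDDR7 pin rate:     {per_pin_gbit_s:.1f} Gbit/s")
print(f"PCIe 5.0 x16 per direction: {pcie5_x16_gb_s_per_direction:.1f} GB/s")
```

The PCIe result lands near 63 GB/s per direction, which is where the rounded "64 GB/s" and "128 GB/s bidirectional" figures come from.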

Integration & Compatibility

The VCNRTXPRO6000B-PB fits any server with an x16 PCIe Gen 5 slot, which in practice means Intel Xeon Scalable 4th gen and newer or AMD EPYC 9004 and newer; 3rd gen Xeon Scalable platforms top out at PCIe Gen 4. It requires a CUDA 12.x release with Blackwell support (12.8 or later) and a matching recent NVIDIA driver. It works with NVIDIA DeepStream (multi-stream video processing), TensorRT (inference optimization), and the Video Codec SDK for custom video pipelines. Analytics software running on the card typically integrates with surveillance VMS platforms via RTSP ingest, HTTP metadata APIs, and webhook alerts. No special licensing is required for inference or video processing; NVIDIA's developer tools (CUDA Toolkit, TensorRT) are freely available.
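
After installation, one way to confirm the driver sees the card and that the slot negotiates a Gen 5 x16 link is to query `nvidia-smi`. The sketch below is a minimal example, not a vendor-supplied tool: the helper names (`parse_gpu_line`, `link_is_gen5_x16`, `query_gpus`) are hypothetical, and it assumes `nvidia-smi` is on the PATH and that the GPU name contains no commas. It returns an empty list on machines without the NVIDIA driver.

```python
import shutil
import subprocess

# Fields passed to nvidia-smi --query-gpu.
QUERY_FIELDS = [
    "name",
    "driver_version",
    "memory.total",
    "pcie.link.gen.max",
    "pcie.link.width.max",
]


def parse_gpu_line(line):
    """Split one CSV row of nvidia-smi output into a field -> value dict."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    return dict(zip(QUERY_FIELDS, values))


def link_is_gen5_x16(gpu):
    """True if the reported maximum link for this GPU and slot is Gen 5 x16."""
    return gpu["pcie.link.gen.max"] == "5" and gpu["pcie.link.width.max"] == "16"


def query_gpus():
    """Return one dict per GPU, or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


if __name__ == "__main__":
    for gpu in query_gpus():
        status = "OK" if link_is_gen5_x16(gpu) else "below Gen 5 x16"
        print(f"{gpu['name']}: driver {gpu['driver_version']}, "
              f"{gpu['memory.total']}, PCIe link {status}")
```

A card that reports a lower generation or width is usually sitting in a slot wired for fewer lanes or an older PCIe generation.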

What's in the Box

Exact package contents are not confirmed by manufacturer documentation. Verify with your supplier whether mounting brackets, PCIe risers, or thermal interface materials are included.

Frequently Asked Questions

Q: Can the VCNRTXPRO6000B-PB replace a dedicated NVR?

A: No. The VCNRTXPRO6000B-PB is a GPU accelerator for compute—encoding, inference, analytics. It does not provide storage, recording, or playback like an NVR. Use it alongside an NVR or video server to offload heavy analytics and encoding workloads.

Q: What's the real-world latency for inference on the VCNRTXPRO6000B-PB?

A: Latency depends on model architecture and batch size. As a rough guide, for a ResNet-50-based detector at batch size 1, expect 10–20ms end-to-end (model + memory transfer). Batching 4–8 frames per inference call improves throughput but raises end-to-end latency to around 30–50ms. Running the model at reduced precision with TensorRT (FP16 or INT8) can lower these latencies by roughly 2–3x, depending on the model.
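
To put those latency figures in context, compare them with the time between frames. The sketch below is plain arithmetic on the numbers quoted in this answer; the 30 fps camera rate is an assumption, and the function names are illustrative.

```python
def frame_interval_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def frames_behind(latency_ms, fps):
    """How many frame intervals pass while one inference call completes."""
    return latency_ms / frame_interval_ms(fps)


FPS = 30  # assumption: a typical camera frame rate

# Latency figures quoted in the answer above.
cases = [
    ("batch size 1, low end", 10),
    ("batch size 1, high end", 20),
    ("batched, low end", 30),
    ("batched, high end", 50),
]

for label, latency_ms in cases:
    print(f"{label:24s} {latency_ms:3d} ms = "
          f"{frames_behind(latency_ms, FPS):.1f} frame intervals at {FPS} fps")
```

At 30 fps a frame arrives every 33.3ms, so the batch-size-1 figures finish within a single frame interval, while the batched figures span roughly one to one and a half intervals.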

Q: Does the VCNRTXPRO6000B-PB require a separate software license?

A: No. CUDA, TensorRT, and DeepStream are available free from NVIDIA. If you're using third-party analytics software (Milestone XProtect with an AI plug-in, Avigilon Control Center) that charges per GPU, that cost is separate.

Q: What's the maximum number of concurrent video streams this card can process?

A: Depends on resolution, frame rate, and model complexity. A rule of thumb: 16–32 concurrent 1080p30 streams with lightweight inference (object detection), or 4–8 concurrent 4K60 streams. The 4x NVDEC engines handle the decoding; GPU compute capacity determines concurrent inference load.
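
One way to compare the two rule-of-thumb figures is by raw pixel rate. This is a crude proxy, offered only as a sketch: it assumes cost scales with pixels per second, which ignores codec settings and the fact that most detectors downscale frames before inference.

```python
def pixel_rate(width, height, fps):
    """Pixels per second for one stream."""
    return width * height * fps


RATE_1080P30 = pixel_rate(1920, 1080, 30)
RATE_4K60 = pixel_rate(3840, 2160, 60)

# One 4K60 stream expressed in 1080p30-stream equivalents.
ratio = RATE_4K60 / RATE_1080P30

print(f"1080p30: {RATE_1080P30 / 1e6:.1f} Mpixel/s")
print(f"4K60:    {RATE_4K60 / 1e6:.1f} Mpixel/s")
print(f"One 4K60 stream = {ratio:.0f} x 1080p30 streams by pixel rate")

# The quoted rule of thumb, converted to 1080p30 equivalents.
for n_4k in (4, 8):
    print(f"{n_4k} x 4K60 = {n_4k * ratio:.0f} x 1080p30 equivalents")
```

By pixel rate, 4–8 streams at 4K60 correspond to 32–64 streams at 1080p30, about twice the 16–32 quoted for 1080p30. The gap is a reminder that both numbers are starting points for sizing, and that a pilot with your own cameras and models is the reliable measure.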

Q: Can I use two VCNRTXPRO6000B-PB cards in one server?

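A: Yes, if your server has two x16 PCIe Gen 5 slots, sufficient airflow, and adequate power. Two cards can draw up to 1200W by themselves, so plan for a PSU of 2000W or more once CPUs, memory, and storage are counted. Spreading a single workload across both GPUs takes explicit multi-GPU programming in CUDA or your framework; running a separate pipeline on each card is simpler. NVIDIA NVLink is not available on this card, so GPU-to-GPU communication uses PCIe, which on Gen 5 x16 is about 64 GB/s in each direction (about 128 GB/s bidirectional).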
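
A rough power budget shows why the PSU has to be sized well above the cards' own draw. In the sketch below only the 600W per-card maximum comes from the specifications; the 500W allowance for the rest of the server, the 20% headroom, and rounding up to the next 100W are placeholder assumptions to replace with figures from your server vendor.

```python
import math

GPU_MAX_W = 600  # listed maximum power consumption per card


def psu_rating_needed(num_gpus, host_w=500, headroom_pct=20, step_w=100):
    """Estimate a PSU rating in watts, rounded up to the next step_w.

    host_w and headroom_pct are illustrative assumptions, not vendor data.
    """
    total_w = num_gpus * GPU_MAX_W + host_w
    with_headroom = total_w * (100 + headroom_pct) / 100
    return math.ceil(with_headroom / step_w) * step_w


for n in (1, 2):
    print(f"{n} card(s): GPUs alone {n * GPU_MAX_W} W, "
          f"suggested PSU rating >= {psu_rating_needed(n)} W")
```

Under these assumptions a two-card server lands above 2000W, and the GPUs alone account for 1200W before the host is counted.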

Q: Is the VCNRTXPRO6000B-PB suitable for outdoor surveillance?

A: No. This is a server-class GPU for indoor data centers or server rooms. It requires stable power, controlled temperature, and continuous airflow. For outdoor edge processing, consider NVIDIA Jetson modules (smaller, lower power) or deploy the VCNRTXPRO6000B-PB in a protected indoor facility and stream video to it over the network.

Ted Perry

The PNY VCNRTXPRO6000B-PB is a serious piece of silicon for surveillance operations centers running multi-camera AI analytics at scale. With 120 TFLOPS of FP32 throughput and those 4x NVENC/NVDEC engines, this card decouples video encoding from your inference pipeline—a real win when you're feeding a dozen 4K streams into object detection models and need results back in frame time, not delayed a second or two by codec overhead.

Technical Highlights:

  • 1597 GB/s Memory Bandwidth with 96GB GDDR7: No memory stall when you're running batch inference across multiple frames. That bandwidth is critical for concurrent video decode + inference; you're not waiting for data to shuffle between host RAM and GPU.
  • 4x Parallel NVENC/NVDEC Engines: Encode or decode four independent 4K streams simultaneously. In a surveillance cluster, this means one card can transcode incoming camera feeds to different bitrates for different clients (local HD for operators, compressed remote delivery to mobile) without touching the compute cores.
  • 752 Tensor Cores + 188 RT Cores with 4 PFLOPS FP4 Peak: If your models can be quantized to FP4 without unacceptable accuracy loss, peak throughput on paper is roughly 33x the FP32 figure (4 PFLOPS vs. 120 TFLOPS). In practice that means more concurrent streams or more complex models per card, but validate accuracy after quantization.
  • Up to 4 Multi-Instance GPU Partitions: Split this one physical card into four separate GPU partitions. Deploy four different analytics pipelines (person detection, vehicle classification, intrusion detection, license plate recognition) on one card with hard memory and compute isolation. If one model crashes, the other partitions keep running.

Deployment Considerations:

  • Passive cooling means your server enclosure airflow is critical. In a dense rack with poor ventilation, thermal throttling will kill throughput. Ensure front-to-rear intake and exhaust planning before installation.
  • 600W maximum draw requires a server PSU rated 1600W+ (dual GPU setups need 2000W+). All of that power arrives through a single 16-pin CEM5 connector, so budget for power-delivery infrastructure and confirm the cabling and PSU are rated for it. If power delivery fails midstream, you lose frames and in-flight analytics results.
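  • PCIe Gen 5 x16 is needed for full host bandwidth. PCIe is backward compatible, so the card should still link in the Gen 4 or Gen 3 slots of older servers (Intel Xeon 2nd gen, AMD EPYC 7002), but at half or a quarter of the bandwidth. Verify that the motherboard and BIOS support Gen 5, and that the chassis can power and cool a 600W passive card, before purchasing.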

The VCNRTXPRO6000B-PB is the right choice for a large surveillance center running real-time AI analytics on 24/7 camera feeds where you need encode/decode offload and want to avoid GPU oversubscription. If you're processing fewer than 8–10 concurrent streams, this is overkill; reach for a smaller RTX Ada card instead. But if you're running a 50+ camera deployment with multiple concurrent AI models per frame, the 4x codec engines and 96GB memory footprint will keep your operations center responsive.

Specifications
EAN: 3536403403638
Tensor Cores: 752
RT Cores: 188
Single Precision Performance: 120 TFLOPS
Peak FP4 AI Performance: 4 PFLOPS
RT Core Performance: 355 TFLOPS
GPU Memory: 96 GB GDDR7
Memory Interface: 512-bit
Memory Bandwidth: 1597 GB/s
Power Consumption: Up to 600W
Multi-Instance GPU: Up to 4
NVENC: 4x
NVDEC: 4x
Graphics Bus: PCI Express 5.0 x16
Display Connectors: 4x DisplayPort 2.1
Form Factor: Dual slot
Thermal Solution: Passive
Power Connector: 1x PCIe CEM5 16-pin