Nvidia
-
Nvidia Licenses Groq’s AI Tech, Hires CEO
Read Full Article: Nvidia Licenses Groq’s AI Tech, Hires CEO
Nvidia has entered a non-exclusive licensing agreement with Groq, a competitor in the AI chip industry, and plans to hire key figures from Groq, including founder and CEO Jonathan Ross and president Sunny Madra. The deal, reported by CNBC to be worth $20 billion, is not an acquisition: Nvidia has clarified that it is not buying Groq as a company. The collaboration is expected to strengthen Nvidia's position in chip manufacturing, particularly as demand for advanced computing power in AI continues to rise.

Groq has been developing a new type of chip, the Language Processing Unit (LPU), which the company claims runs large language models (LLMs) ten times faster than traditional GPUs while using significantly less energy. These advances could give Nvidia a competitive edge in the rapidly evolving AI landscape. Ross has a history of innovation in AI hardware, having previously contributed to the development of Google's Tensor Processing Unit (TPU), and that expertise is likely to be a valuable asset as Nvidia expands its technological capabilities.

Groq's rapid growth is evidenced by its recent $750 million funding round, which valued the company at $6.9 billion, and by a user base that now exceeds 2 million developers. The partnership could further accelerate Groq's influence in the AI sector, and integrating Groq's technology with Nvidia's established infrastructure could lead to significant gains in AI performance and efficiency. This matters because it highlights the ongoing race in the tech industry to enhance AI capabilities and the importance of strategic collaborations in achieving these advances.
-
Enhancements in NVIDIA CUDA-Q QEC for Quantum Error Correction
Read Full Article: Enhancements in NVIDIA CUDA-Q QEC for Quantum Error Correction
Real-time decoding is essential for fault-tolerant quantum computers: decoders must operate with low latency alongside a quantum processing unit (QPU) so that corrections can be applied within the coherence time, before errors accumulate. NVIDIA CUDA-Q QEC version 0.5.0 introduces several enhancements to support online real-time decoding, including GPU-accelerated algorithmic decoders, infrastructure for AI decoder inference, and sliding window decoder support. These improvements are designed to facilitate quantum error correction research and to operationalize real-time decoding on quantum computers through a four-stage workflow: detector error model (DEM) generation, decoder configuration, decoder loading and initialization, and real-time decoding (a toy version of this workflow follows this summary).

The release introduces GPU-accelerated RelayBP, a new decoder algorithm that addresses a known weakness of belief propagation decoders by incorporating memory strengths at each node of the decoding graph. These memory terms break the harmful symmetries that typically prevent belief propagation from converging, enabling more efficient real-time error decoding (a generic sketch of the memory update appears below).

AI decoders are also gaining traction for specific error models, offering improved accuracy or latency. CUDA-Q QEC now supports integrated AI decoder inference alongside offline decoding, making it easier to run AI decoders saved as ONNX files against an emulated quantum computer and to evaluate AI decoder deployments across different model and hardware combinations (see the inference sketch below).

Sliding window decoders handle circuit-level noise across multiple syndrome extraction rounds by processing syndromes before the complete measurement sequence has been received, which reduces latency. Although this approach may increase logical error rates, it offers flexibility for exploring noise model variations and error-correcting code parameters, and the implementation in CUDA-Q QEC 0.5.0 lets users experiment with different inner decoders and window sizes (see the windowing sketch below).

Why this matters: these advancements in quantum error correction are critical for accelerating the development of reliable, fault-tolerant quantum computers, paving the way for practical applications in many fields.
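To make the four-stage workflow concrete, here is a self-contained toy version in Python. It substitutes a distance-3 repetition code and a lookup-table decoder for the real GPU-accelerated machinery, so none of these names are the CUDA-Q QEC API; the sketch only mirrors the shape of the four stages described above.

```python
import numpy as np

# Stage 1: "DEM generation" -- here simply the parity-check matrix H of a
# distance-3 repetition code; each row is one syndrome bit (detector).
H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)

# Stage 2: decoder configuration -- precompute a syndrome -> correction
# lookup table for all single-qubit errors (a trivial algorithmic decoder).
def configure_decoder(H):
    table = {tuple(np.zeros(H.shape[0], dtype=np.uint8)):
             np.zeros(H.shape[1], dtype=np.uint8)}
    for q in range(H.shape[1]):
        err = np.zeros(H.shape[1], dtype=np.uint8)
        err[q] = 1
        table[tuple(H @ err % 2)] = err
    return table

# Stage 3: decoder loading and initialization -- done once, outside the
# shot loop, mirroring the one-time GPU setup in the real workflow.
decoder_table = configure_decoder(H)

# Stage 4: real-time decoding -- stream syndromes round by round and
# apply each correction before the next error can accumulate.
rng = np.random.default_rng(0)
state = np.zeros(3, dtype=np.uint8)          # error frame on 3 data qubits
for round_idx in range(5):
    if rng.random() < 0.3:                   # inject an occasional bit flip
        state[rng.integers(3)] ^= 1
    syndrome = H @ state % 2                 # measured by the QPU each round
    correction = decoder_table[tuple(syndrome)]
    state ^= correction                      # correction applied in time
    assert not (H @ state % 2).any()         # syndrome cleared after correction
```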
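The memory idea behind RelayBP can be sketched generically. In plain belief propagation, each variable node recomputes its belief from scratch every iteration, which allows symmetric configurations to oscillate without ever converging; blending in the previous iteration's belief with a per-node strength breaks that symmetry. The update below is a generic memory-augmented rule for illustration only, not the exact RelayBP update published with the algorithm.

```python
import numpy as np

# Generic memory-augmented variable-node update for belief propagation.
# gamma holds per-node memory strengths; making them non-uniform breaks
# the symmetries that stall vanilla BP. Illustrative only -- not the
# exact RelayBP rule.
def update_beliefs(prior_llr, check_messages, prev_belief, gamma):
    # prior_llr:      (n,) channel log-likelihood ratios per qubit
    # check_messages: (n,) summed incoming check-to-variable messages
    # prev_belief:    (n,) beliefs from the previous iteration
    # gamma:          (n,) per-node memory strengths in [0, 1)
    fresh = prior_llr + check_messages
    return gamma * prev_belief + (1.0 - gamma) * fresh

n = 6
rng = np.random.default_rng(1)
gamma = rng.uniform(0.0, 0.9, size=n)        # disordered strengths per node
prior = rng.normal(2.0, 1.0, n)              # fixed channel LLRs
belief = np.zeros(n)
for _ in range(10):
    messages = rng.normal(0.0, 1.0, n)       # stand-in for real check messages
    belief = update_beliefs(prior, messages, belief, gamma)
hard_decision = (belief < 0).astype(np.uint8)  # negative LLR => flag an error
```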
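For the AI-decoder path, the underlying inference pattern resembles standard onnxruntime usage. The sketch below assumes a trained syndrome decoder already exported to a file named decoder.onnx, with a single float32 input of batched syndrome vectors and a single output of per-qubit error probabilities; the file name, shapes, and 0.5 threshold are illustrative assumptions, not anything CUDA-Q QEC prescribes, and CUDA-Q QEC wraps this kind of inference for you.

```python
import numpy as np
import onnxruntime as ort

# Load the exported decoder; prefer the GPU provider when available.
session = ort.InferenceSession("decoder.onnx",
                               providers=["CUDAExecutionProvider",
                                          "CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# A batch of measured syndromes (random placeholders here).
syndromes = np.random.randint(0, 2, size=(8, 24)).astype(np.float32)

# Assumes the model has exactly one output: per-qubit error probabilities.
(error_probs,) = session.run(None, {input_name: syndromes})
corrections = (error_probs > 0.5).astype(np.uint8)  # threshold to a correction
print(corrections.shape)
```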
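Finally, the sliding-window mechanism can be shown independently of any particular inner decoder. The sketch below splits a stream of syndrome rounds into overlapping windows, decodes each with a pluggable inner_decoder, and commits only the oldest rounds of each window, which is what lets decoding begin before all rounds arrive; window_size and commit_size are arbitrary illustrative parameters, echoing the window-size experimentation the release enables.

```python
# Conceptual sliding-window decoding over a stream of syndrome rounds.
def sliding_window_decode(syndrome_rounds, inner_decoder,
                          window_size=4, commit_size=2):
    committed = []
    start = 0
    while start < len(syndrome_rounds):
        window = syndrome_rounds[start:start + window_size]
        corrections = inner_decoder(window)
        # Commit only the oldest rounds; the rest are re-decoded in the
        # next window with more context, reducing boundary errors.
        n_commit = min(commit_size, len(window))
        committed.extend(corrections[:n_commit])
        start += n_commit
    return committed

def identity_decoder(window):
    # Toy inner decoder: echo each round's syndrome as its "correction".
    return list(window)

stream = [[0, 1], [1, 0], [0, 0], [1, 1], [0, 1]]
print(sliding_window_decode(stream, identity_decoder))
```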
-
NVIDIA ALCHEMI: Revolutionizing Atomistic Simulations
Read Full Article: NVIDIA ALCHEMI: Revolutionizing Atomistic Simulations
Machine learning interatomic potentials (MLIPs) are revolutionizing computational chemistry and materials science by enabling atomistic simulations that combine high fidelity with AI's ability to scale. A significant obstacle has persisted, however: the lack of robust, GPU-accelerated tools for these simulations, which often fall back on CPU-centric operations. NVIDIA ALCHEMI, announced at Supercomputing 2024, addresses this gap with a suite of high-performance, GPU-accelerated tools designed specifically for AI-driven atomistic simulations.

ALCHEMI Toolkit-Ops, part of this suite, provides accelerated operations such as neighbor list construction, DFT-D3 dispersion corrections, and long-range electrostatic interactions, all optimized for GPU computation and integrated with PyTorch for seamless use in existing workflows. Built on NVIDIA Warp for performance, it exposes a modular API through PyTorch, with JAX integration planned. Integration with open-source tools such as TorchSim, MatGL, and AIMNet Central further extends its utility, enabling high-throughput simulations and improved computational efficiency without sacrificing accuracy, and benchmarks show superior performance compared to existing kernel-accelerated models.

Getting started is straightforward: the toolkit requires Python 3.11+, a compatible operating system, and an NVIDIA GPU, installs via pip, and is designed to slot into the broader PyTorch ecosystem (a usage sketch follows below). These capabilities enable accurate modeling of the interactions critical to molecular simulation, making the toolkit a valuable resource for researchers in chemistry and materials science, and ongoing development promises further enhancements. This matters because it accelerates research and development in these fields, potentially leading to breakthroughs in material design and drug discovery.
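To illustrate what a neighbor-list operation computes and how it fits a PyTorch workflow, here is a naive, self-contained reference in plain PyTorch. This is not the ALCHEMI Toolkit-Ops API (its exact function names and signatures live in the toolkit's documentation); the toolkit replaces this O(N²) approach with optimized NVIDIA Warp kernels, but the inputs and outputs are conceptually the same.

```python
import torch

# Naive reference for a cutoff neighbor list: all pairs (i, j), i < j,
# whose distance falls under a cutoff. Toy non-periodic box, no PBC.
def brute_force_neighbor_list(positions: torch.Tensor, cutoff: float):
    # positions: (N, 3) atomic coordinates, ideally on a CUDA device
    dists = torch.cdist(positions, positions)   # (N, N) pair distances
    mask = (dists < cutoff) & (dists > 0)       # drop self-pairs
    i, j = torch.nonzero(mask, as_tuple=True)
    keep = i < j                                # count each pair once
    return i[keep], j[keep], dists[i[keep], j[keep]]

device = "cuda" if torch.cuda.is_available() else "cpu"
pos = torch.rand(64, 3, device=device) * 10.0   # 64 atoms in a toy 10 Å box
src, dst, d = brute_force_neighbor_list(pos, cutoff=3.0)
print(f"{src.numel()} pairs within 3.0 Å")
```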
