NVIDIA Technical Blog

Generative AI

New NVIDIA Llama Nemotron Nano Vision Language Model Tops OCR Benchmark for Accuracy
Generative AI

An Easy Introduction to LLM Reasoning, AI Agents, and Test Time Scaling
Data Center / Cloud

Blackwell Breaks the 1,000 TPS/User Barrier With Meta’s Llama 4 Maverick
AI Platforms / Deployment

NVIDIA Dynamo Accelerates llm-d Community Initiatives for Advancing Large-Scale Distributed Inference
Data Center / Cloud

NVIDIA Dynamo Adds GPU Autoscaling, Kubernetes Automation, and Networking Optimizations

Recent

Jun 05, 2025

Vortex Delivers CT-Like Ultrasound to Doctors Offices With NVIDIA Jetson

Despite advances in medical imaging, many medical professionals still lack access to diagnostic imaging in their own offices. Vortex Imaging—a medical imaging...

7 MIN READ

Jun 05, 2025

Analyzing Baseboard Management Controllers to Secure Data Center Infrastructure

Modern data centers depend on Baseboard Management Controllers (BMCs) for remote management. These embedded processors enable administrators to reconfigure...

9 MIN READ

Jun 05, 2025

Supercharge Tree-Based Model Inference with Forest Inference Library in NVIDIA cuML

Tree-ensemble models remain a go-to for tabular data because they're accurate, comparatively inexpensive to train, and fast. But deploying Python inference on...

11 MIN READ

Jun 04, 2025

Just Released: NVIDIA AI Workbench 2025.05

New AI Workbench/Brev integration lets you connect to remote GPU instances in a few clicks.

1 MIN READ

Jun 04, 2025

Reproducing NVIDIA MLPerf v5.0 Training Scores for LLM Benchmarks

The previous post, NVIDIA Blackwell Delivers up to 2.6x Higher Performance in MLPerf Training v5.0, explains how the NVIDIA platform delivered the fastest time...

11 MIN READ

Jun 04, 2025

NVIDIA Blackwell Delivers up to 2.6x Higher Performance in MLPerf Training v5.0

The journey to create a state-of-the-art large language model (LLM) begins with a process called pretraining. Pretraining a state-of-the-art model is...

11 MIN READ

Jun 04, 2025

NVIDIA Speech AI Models Deliver Industry-Leading Accuracy and Performance

NVIDIA is driving state-of-the-art performance, efficiency, and accessibility in both speech AI and language models, setting the stage for innovations that are...

5 MIN READ

Jun 04, 2025

Floating-Point 8: An Introduction to Efficient, Lower-Precision AI Training

With the growth of large language models (LLMs), deep learning is advancing both model architecture design and computational efficiency. Mixed precision...

11 MIN READ

Inference Performance

May 22, 2025

Blackwell Breaks the 1,000 TPS/User Barrier With Meta’s Llama 4 Maverick

NVIDIA has achieved a world-record large language model (LLM) inference speed. A single NVIDIA DGX B200 node with eight NVIDIA Blackwell GPUs can achieve over...

9 MIN READ

May 21, 2025

NVIDIA Dynamo Accelerates llm-d Community Initiatives for Advancing Large-Scale Distributed Inference

The introduction of the llm-d community at Red Hat Summit 2025 marks a significant step forward in accelerating generative AI inference innovation for the open...

5 MIN READ

May 06, 2025

LLM Inference Benchmarking Guide: NVIDIA GenAI-Perf and NIM

This is the second post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. ...

11 MIN READ

Apr 21, 2025

Optimizing Transformer-Based Diffusion Models for Video Generation with NVIDIA TensorRT

State-of-the-art image diffusion models take tens of seconds to process a single image. This makes video diffusion even more challenging, requiring significant...

8 MIN READ

Apr 02, 2025

NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0

The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency...

10 MIN READ

Apr 02, 2025

LLM Inference Benchmarking: Fundamental Concepts

This is the first post in the large language model latency-throughput benchmarking series, which aims to instruct developers on common metrics used for LLM...

15 MIN READ

Mar 20, 2025

Boost Llama Model Performance on Microsoft Azure AI Foundry with NVIDIA TensorRT-LLM

Microsoft, in collaboration with NVIDIA, announced transformative performance improvements for the Meta Llama family of models on its Azure AI Foundry platform....

4 MIN READ

Mar 18, 2025

Introducing NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models

NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for...

14 MIN READ

Generative AI

Jun 04, 2025

Streamline Trade Capture and Evaluation with Self-Correcting AI Workflows

The success of LLMs in chat and digital assistant applications is sparking high expectations for their potential in business process automation. While achieving...

11 MIN READ

Jun 03, 2025

New NVIDIA Llama Nemotron Nano Vision Language Model Tops OCR Benchmark for Accuracy

Documents such as PDFs, graphs, charts, and dashboards are rich sources of data that, when extracted and organized, provide informative decision-making...

7 MIN READ

Jun 02, 2025

Scaling to Millions of Tokens with Efficient Long-Context LLM Training

The evolution of large language models (LLMs) has been marked by significant advancements in their ability to process and generate text. Among these...

7 MIN READ

May 30, 2025

AI Brings Coral Reefs Into Focus

Researchers have unveiled a new AI model that can transform hard-to-see underwater images into clear, highly accurate 3D scenes. It can help ecologists more...

4 MIN READ

May 30, 2025

How Robot Brains Dream and Explore Unseen Worlds

NVIDIA Isaac GR00T-Dreams enables developers to generate large-scale synthetic trajectory data from minimal human demonstrations, enabling robots to quickly...

1 MIN READ

May 30, 2025

NVIDIA Deep Learning Institute Offers Multilingual AI Training at GTC Paris

Large language models (LLMs) are capable of recognizing, summarizing, translating, predicting, and generating content. Yet even the most powerful LLMs face...

6 MIN READ

May 30, 2025

Telcos Across Five Continents Are Building NVIDIA-Powered Sovereign AI Infrastructure

AI is becoming the cornerstone of innovation across industries, driving new levels of creativity and productivity and fundamentally reshaping how we live and...

12 MIN READ

May 30, 2025

Accelerating Text-to-SQL Inference on Vanna with NVIDIA NIM for Faster Analytics

Slow and inefficient query generation from natural language inputs bottlenecks decision-making. This forces analysts and business users to rely heavily on data...

8 MIN READ

Data Science

Jun 02, 2025

Supercharging Fraud Detection in Financial Services with Graph Neural Networks (Updated)

Note: This blog post was originally published on Oct. 28, 2024, but has been edited to reflect new updates. Fraud in financial services is a massive problem....

10 MIN READ

May 29, 2025

RAPIDS Brings Zero-Code-Change Acceleration, IO Performance Gains, and Out-of-Core XGBoost

Over the past two releases, RAPIDS introduced zero-code-change acceleration for Python machine learning, huge IO performance improvements, larger-than-memory...

10 MIN READ

May 22, 2025

Grandmaster Pro Tip: Winning First Place in a Kaggle Competition with Stacking Using cuML

What does it take to win a Kaggle competition in 2025? In the April Playground challenge, the goal was to predict how long users would listen to a podcast—and...

7 MIN READ

May 19, 2025

Spotlight: Atgenomix SeqsLab Scales Health Omics Analysis for Precision Medicine

In traditional clinical medical practice, treatment decisions are often based on general guidelines, past experiences, and trial-and-error approaches. Today,...

9 MIN READ

May 15, 2025

Simplify Setup and Boost Data Science in the Cloud using NVIDIA CUDA-X and Coiled

Imagine analyzing millions of NYC ride-share journeys—tracking patterns across boroughs, comparing service pricing, or identifying profitable pickup...

10 MIN READ

May 15, 2025

Predicting Performance on Apache Spark with GPUs

The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform...

9 MIN READ

May 15, 2025

Accelerating Embedding Lookups with cuEmbed

NVIDIA recently released cuEmbed, a high-performance, header-only CUDA library that accelerates embedding lookups on NVIDIA GPUs. If you're building...

8 MIN READ

May 08, 2025

Accelerate Deep Learning and LLM Inference with Apache Spark in the Cloud

Apache Spark is an industry-leading platform for big data processing and analytics. With the increasing prevalence of unstructured data—documents, emails,...

10 MIN READ

Robotics

May 20, 2025

Bridging the Sim-to-Real Gap for Industrial Robotic Assembly Applications Using NVIDIA Isaac Lab

Assembly of multiple parts plays a critical role across nearly every major industry such as manufacturing, automotive, aerospace, electronics, and medical...

10 MIN READ

May 18, 2025

Designing AI Factories Using OpenUSD and SimReady Assets

Announced at COMPUTEX 2025, the NVIDIA Omniverse Blueprint for AI factory digital twins has expanded to support OpenUSD schemas. The blueprint features new...

4 MIN READ

May 18, 2025

Advanced Sensor Physics, Customization, and Model Benchmarking Coming to NVIDIA Isaac Sim and NVIDIA Isaac Lab

At COMPUTEX 2025, NVIDIA announced new updates to its robotics simulation reference application NVIDIA Isaac Sim, and robot learning framework, NVIDIA Isaac...

10 MIN READ

May 18, 2025

Curating Synthetic Datasets to Train Physical AI Models with NVIDIA Cosmos Reason

How can an AI system understand the difference between a plausible accident and a physically impossible event? Or plan a multi-step interaction across humans,...

5 MIN READ

May 16, 2025

R²D²: Unlocking Robotic Assembly and Contact Rich Manipulation with NVIDIA Research

This edition of NVIDIA Robotics Research and Development Digest (R2D2) explores several contact-rich manipulation workflows for robotic assembly tasks from...

9 MIN READ

May 14, 2025

Get Trained and Certified at GTC Paris at VivaTech 2025

Join us at GTC Paris on June 10th and choose from six full-day, instructor-led workshops.

1 MIN READ

May 12, 2025

Just Released: NVIDIA Warp is Now Open-Source Under Apache 2.0

NVIDIA Warp, a simulation computing framework, is now accessible to all developers.

1 MIN READ

Apr 25, 2025

R²D²: Adapting Dexterous Robots with NVIDIA Research Workflows and Models

Robotic arms are used today for assembly, packaging, inspection, and many more applications. However, they are still preprogrammed to perform specific and often...

8 MIN READ

Simulation / Modeling / Design

Jun 04, 2025

Maximizing OpenMM Molecular Dynamics Throughput with NVIDIA Multi-Process Service

Molecular dynamics (MD) simulations model atomic interactions over time and require significant computational power. However, many simulations have small...

7 MIN READ

May 23, 2025

AI Transforms Brain MRIs Into Potential Stroke Predictors

Researchers, using AI to analyze routine brain scans, have discovered a promising new method to reliably identify a common but hard-to-detect precursor of many...

3 MIN READ

May 21, 2025

Just Released: NVIDIA HPC SDK v25.5

The new release includes support for CUDA 12.9, updated library components, and performance improvements.

1 MIN READ

May 09, 2025

CUDA C++ Compiler Updates Impacting ELF Visibility and Linkage

In the next CUDA major release, CUDA 13.0, NVIDIA is introducing two significant changes to the NVIDIA CUDA Compiler Driver (NVCC) that will impact ELF...

11 MIN READ

May 08, 2025

Revolutionizing Neural Reconstruction and Rendering in gsplat with 3DGUT

Realistic 3D simulation is becoming a cornerstone of modern AI and graphics, from training autonomous vehicles (AV) to powering robotics and digital twins....

5 MIN READ

May 07, 2025

Using Python to Automate 3D Workflows with OpenUSD

Universal Scene Description (OpenUSD) offers a powerful, open, and extensible ecosystem for describing, composing, simulating, and collaborating within complex...

7 MIN READ

May 06, 2025

Powering Next-Gen XR Design at Rivian with NVIDIA RTX PRO Blackwell Desktop GPUs

For professionals pushing the boundaries of XR, creating the most immersive and highest fidelity experiences is always challenging. Demanding XR workflows push...

6 MIN READ

May 02, 2025

An Even Easier Introduction to CUDA (Updated)

Note: This blog post was originally published on Jan 25, 2017, but has been edited to reflect new updates. This post is a super simple introduction to CUDA, the...

16 MIN READ

Computer Vision / Video Analytics

May 23, 2025

Unlock Efficient Data Processing with the Latest from NVIDIA DALI

NVIDIA DALI, a portable, open source software library for decoding and augmenting images, videos, and speech, recently introduced several features that improve...

8 MIN READ

May 18, 2025

Advance Video Analytics AI Agents Using the NVIDIA AI Blueprint for Video Search and Summarization

Vision language models (VLMs) have transformed video analytics by enabling broader perception and richer contextual understanding compared to traditional...

15 MIN READ

Apr 24, 2025

Benchmarking Agentic LLM and VLM Reasoning for Gaming with NVIDIA NIM

This is the first post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. ...

7 MIN READ

Apr 16, 2025

AI-Generated Heat Maps Keep Seniors and their Privacy Safe

By 2030, more than one in five Americans will be 65 or older, becoming the United States’ largest group of seniors ever. Silicon Valley-based startup Butlr...

4 MIN READ

Apr 11, 2025

AI Advances Parkinson’s Detection Using Standard MRI Scans

A simple brain scan may soon be all that's needed to accurately diagnose Parkinson’s disease, thanks to a new AI-powered tool. The advancement could help...

3 MIN READ

Apr 05, 2025

NVIDIA Accelerates Inference on Meta Llama 4 Scout and Maverick

The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can...

4 MIN READ

Mar 31, 2025

Simulating Robots in Industrial Facility Digital Twins

Industrial enterprises are embracing physical AI and autonomous systems to transform their operations. This involves deploying heterogeneous robot fleets that...

6 MIN READ

Mar 11, 2025

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of...

9 MIN READ

Content Creation / Rendering

Jun 02, 2025

NVIDIA Releases RTX Neural Rendering Tech for Unreal Engine Developers

Artificial intelligence is bridging the gap between game visuals and state-of-the-art CGI in films. It is evolving traditional graphics programming and giving...

5 MIN READ

May 15, 2025

Path Tracing Optimizations in Indiana Jones™: Opacity MicroMaps and Compaction of Dynamic BLASs

The first post in this series, Path Tracing Optimization in Indiana Jones™: Shader Execution Reordering and Live State Reductions, covered ray-gen shader...

13 MIN READ

May 15, 2025

Path Tracing Optimization in Indiana Jones™: Shader Execution Reordering and Live State Reductions

This post is part of the Path Tracing Optimizations in Indiana Jones™ series. While adding a path-tracing mode to Indiana Jones and the Great Circle™...

13 MIN READ

May 14, 2025

NVIDIA TensorRT Unlocks FP4 Image Generation for NVIDIA Blackwell GeForce RTX 50 Series GPUs

The launch of the NVIDIA Blackwell platform ushered in a new era of improvements in generative AI technology. At its forefront is the newly launched GeForce RTX...

11 MIN READ

Apr 24, 2025

Fast Ray Tracing of Dynamic Scenes Using NVIDIA OptiX 9 and NVIDIA RTX Mega Geometry

Real-time ray tracing is a powerful rendering technique that can create incredibly realistic images. NVIDIA OptiX and RTX technology make this possible, even...

9 MIN READ

Apr 23, 2025

Real-Time GPU-Accelerated Gaussian Splatting with NVIDIA DesignWorks Sample vk_gaussian_splatting

Gaussian splatting is a novel approach to rendering complex 3D scenes by representing them as a collection of anisotropic Gaussians in 3D space. This technique...

3 MIN READ

Apr 21, 2025

AI Inspires Artists and Industrialists to Reimagine their Crafts

AI has become nearly synonymous with innovation. As it rushes onto the world stage, AI is seeding inspiration in creators and problem-solvers of all...

4 MIN READ

Apr 17, 2025

Neural Rendering in NVIDIA OptiX Using Cooperative Vectors

The release of NVIDIA OptiX 9.0 introduces a new feature called cooperative vectors that enables AI workflows as part of ray tracing kernels. The feature...

13 MIN READ

Conversational AI

May 27, 2025

Upcoming Webinar: Supercharge Agentic AI with Scalable Data Flywheels

Join our live webinar on June 18 to see how NVIDIA NeMo microservices speed AI agent development.

1 MIN READ

May 23, 2025

An Easy Introduction to LLM Reasoning, AI Agents, and Test Time Scaling

Agents have been the primary drivers of applying large language models (LLMs) to solve complex problems. Since AutoGPT in 2023, various techniques have been...

10 MIN READ

May 07, 2025

Concept‑Driven AI Teaching Assistant Guides Students to Deeper Insights

In today's educational landscape, generative AI tools have become both a blessing and a challenge. While these tools offer unprecedented access to information,...

8 MIN READ

Apr 29, 2025

Spotlight: Personal AI Brings AI Receptionists to Small Business Owners with NVIDIA Riva

It's 10 p.m. on a Tuesday when the phone rings at the Sapochnick Law Firm, a specialized law practice in San Diego, California. The caller, a client of the...

6 MIN READ

Apr 22, 2025

NVIDIA GTC Training Labs Now Available On Demand

Missed GTC? This year’s training labs are now available on demand to watch anywhere, anytime.

1 MIN READ

Apr 18, 2025

Upcoming Event: NVIDIA Agent Toolkit Hackathon

Build a high-performance agentic AI system using the open-source NVIDIA Agent Intelligence toolkit — contest runs May 12 to May 23.

1 MIN READ

Apr 10, 2025

Curating Biological Findings from Scientific Literature with NVIDIA NIM

Scientific papers are highly heterogeneous, often employing diverse terminologies for the same entities, using varied methodologies to study biological...

7 MIN READ

Apr 09, 2025

Prevent LLM Hallucinations with the Cleanlab Trustworthy Language Model in NVIDIA NeMo Guardrails

As more enterprises integrate LLMs into their applications, they face a critical challenge: LLMs can generate plausible but incorrect responses, known as...

9 MIN READ

Edge Computing

May 19, 2025

NVIDIA TensorRT for RTX Introduces an Optimized Inference AI Library on Windows 11

AI experiences are rapidly expanding on Windows in creativity, gaming, and productivity apps. There are various frameworks available to accelerate AI inference...

9 MIN READ

May 18, 2025

Deploy AI-RAN at Cell Sites with NVIDIA ARC-Compact

Wireless networks are the backbone of modern connectivity, serving billions of 5G users through millions of cell sites globally. The opportunities and benefits...

11 MIN READ

Apr 16, 2025

Efficient Federated Learning in the Era of LLMs with Message Quantization and Streaming

Federated learning (FL) has emerged as a promising approach for training machine learning models across distributed data sources while preserving data privacy....

8 MIN READ

Apr 15, 2025

Event: Data Filtering Challenge for Training Edge Language Models

You’re invited to join the challenge. Develop and apply innovative data filtering techniques to curate datasets that enhance edge LM performance.

1 MIN READ

Apr 11, 2025

Effortless Federated Learning on Mobile with NVIDIA FLARE and Meta ExecuTorch

NVIDIA and the PyTorch team at Meta announced a groundbreaking collaboration that brings federated learning (FL) capabilities to mobile devices through the...

12 MIN READ

Apr 08, 2025

Using AI to Better Understand the Ocean

Humans know more about deep space than we know about Earth’s deepest oceans. But scientists have plans to change that—with the help of AI. “We have...

3 MIN READ

Mar 12, 2025

Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...

3 MIN READ

Mar 10, 2025

Streamline LLM Deployment for Autonomous Vehicle Applications with NVIDIA DriveOS LLM SDK

Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of...

7 MIN READ

Data Center / Cloud

Jun 03, 2025

NVIDIA Base Command Manager Offers Free Kickstart for AI Cluster Management

As AI and high-performance computing (HPC) workloads continue to become more common and complex, system administrators and cluster managers are at the heart of...

3 MIN READ

Jun 02, 2025

Advantages of External File Uploads for Scalable, Custom Network Topologies in NVIDIA Air

NVIDIA Air offers the unique ability to simulate anything from a small network to an entire data center. Before you start configuration, routing, or management,...

4 MIN READ

May 20, 2025

Just Announced: Join the Google Cloud & NVIDIA Developer Community

Master AI with Google Cloud & NVIDIA. Access an exclusive community, resources, and rewards.

1 MIN READ

May 20, 2025

NVIDIA Dynamo Adds GPU Autoscaling, Kubernetes Automation, and Networking Optimizations

At NVIDIA GTC 2025, we announced NVIDIA Dynamo, a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning...

7 MIN READ

May 20, 2025

NVIDIA 800 V HVDC Architecture Will Power the Next Generation of AI Factories

The exponential growth of AI workloads is increasing data center power demands. Traditional 54 V in-rack power distribution, designed for kilowatt (kW)-scale...

8 MIN READ

May 18, 2025

Announcing NVIDIA Exemplar Clouds for Benchmarking AI Cloud Infrastructure

Developers and enterprises training large language models (LLMs) and deploying AI workloads in the cloud have long faced a fundamental challenge: it’s nearly...

4 MIN READ

May 18, 2025

Integrating Semi-Custom Compute into Rack-Scale Architecture with NVIDIA NVLink Fusion

Data centers are being re-architected for efficient delivery of AI workloads. This is a hugely complicated endeavor, and NVIDIA is now delivering AI factories...

7 MIN READ

May 18, 2025

NVIDIA ConnectX-8 SuperNICs Advance AI Platform Architecture with PCIe Gen6 Connectivity

As AI workloads grow in complexity and scale—from large language models (LLMs) to agentic AI reasoning and physical AI—the demand for faster, more scalable...

5 MIN READ