What Is an NPU? Simply Explained
AI Laptops · 13 min read · March 26, 2026 · By AIGadgetExpert Team

NPUs are appearing in every new laptop and phone. Here is what Neural Processing Units actually do, why they matter, and how they make your devices smarter.

Quick Answer: An NPU (Neural Processing Unit) is a specialized processor chip designed specifically for AI and machine learning tasks, found in modern phones, laptops, and tablets.

Every new laptop and phone in 2026 mentions an NPU in its spec sheet. Intel, Qualcomm, Apple, AMD, Google, and MediaTek all include them now. Microsoft made a specific NPU performance threshold a hard requirement for Copilot+ PC certification. Despite all this, most people buying devices have no idea what an NPU is, what it does, or why the performance number on the box matters.

The short answer: an NPU is a dedicated processor for AI tasks. It runs on-device AI - photo enhancement, voice transcription, real-time translation, background noise removal - using a fraction of the power that doing the same work on a CPU or GPU would require. Here is the full explanation.

What NPU Stands For

NPU stands for Neural Processing Unit. It is a specialized chip designed to run the mathematics underlying artificial intelligence and machine learning - specifically a type of calculation called matrix multiplication - as efficiently as possible.

The "neural" in Neural Processing Unit refers to neural networks, the AI architecture that powers everything from voice recognition to photo enhancement. NPUs are built to run these networks fast, with very low power draw.

Different companies use different names for the same concept:

  • Apple Neural Engine - Apple's name for the NPU inside A-series (iPhone) and M-series (Mac, iPad) chips

  • Hexagon NPU - Qualcomm's NPU in Snapdragon mobile and laptop chips

  • Tensor Processing Unit (TPU) - Google's term for the dedicated AI engine inside its Tensor G-series chips

  • AI Boost - Intel's NPU tile inside Core Ultra processors

  • Ryzen AI - AMD's NPU branding for Ryzen laptop chips

How an NPU Works (Without the Jargon)

Your device has three main processors working together:

  • The CPU (Central Processing Unit) handles general tasks: running apps, managing files, executing code

  • The GPU (Graphics Processing Unit) handles visual workloads: rendering games, video playback, 3D

  • The NPU handles AI math: recognizing faces, transcribing speech, generating images, translating language

AI workloads are dominated by a specific type of calculation: multiplying large matrices of numbers together, millions of times per second. A CPU can do this, but it is designed for sequential logic, not parallel math. A GPU is better at parallel math but consumes significant power. An NPU is purpose-built for exactly this one kind of calculation, making it 10 to 20 times more power-efficient than a CPU for the same AI task.
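That calculation is simple to sketch. The snippet below (illustrative only, plain Python rather than any NPU framework) shows the naive matrix multiply at the heart of a neural-network layer. The key point is that every output cell is an independent multiply-accumulate chain, which is exactly the work NPU hardware runs in parallel:

```python
# Illustrative only: the multiply-accumulate pattern that dominates AI workloads.
# A neural-network layer is essentially output = weights @ inputs, repeated
# millions of times per second during inference.

def matmul(a, b):
    """Naive matrix multiply: a is m x k, b is k x n, result is m x n."""
    m, k, n = len(a), len(b), len(b[0])
    result = [[0.0] * n for _ in range(m)]
    for i in range(m):          # every (i, j) cell below is an independent
        for j in range(n):      # dot product, so all m * n of them could
            for p in range(k):  # run at the same time on parallel hardware
                result[i][j] += a[i][p] * b[p][j]
    return result

weights = [[1.0, 2.0], [3.0, 4.0]]
inputs = [[5.0], [6.0]]
print(matmul(weights, inputs))  # [[17.0], [39.0]]
```

A CPU steps through those loops largely one multiply at a time; an NPU is an array of multiply-accumulate units that computes many output cells simultaneously, which is where the order-of-magnitude efficiency gap comes from.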

| Processor | Designed For | AI Efficiency | Power Consumption | Can Run AI? |
| --- | --- | --- | --- | --- |
| CPU | General logic, sequential tasks | Low | Medium | Yes, slowly |
| GPU | Parallel graphics and math | Medium-High | High | Yes, but power-hungry |
| NPU | AI matrix math only | Highest | Very Low | Yes - optimally |

What TOPS Means and Why It Matters

NPU performance is measured in TOPS: Tera Operations Per Second. One TOPS means the NPU can perform one trillion mathematical operations per second. Higher TOPS means faster, more capable on-device AI.
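A rough back-of-envelope calculation shows what the rating means in practice. The model size, ops-per-parameter figure, and utilization factor below are illustrative assumptions, not measured specs for any real chip or model:

```python
# Back-of-envelope arithmetic for what a TOPS rating means in practice.
# All model figures here are illustrative assumptions, not specs.

TRILLION = 1e12

def min_inference_time_ms(model_ops, tops, utilization=0.5):
    """Lower-bound time for one inference step.
    model_ops: multiply-accumulate operations required per step
    tops: NPU rating in tera-operations per second
    utilization: fraction of peak throughput actually sustained
    (real workloads rarely hit the headline number)
    """
    ops_per_second = tops * TRILLION * utilization
    return model_ops / ops_per_second * 1000

# A hypothetical 2-billion-parameter model at ~2 ops per parameter per token:
ops_per_token = 2e9 * 2
print(round(min_inference_time_ms(ops_per_token, tops=40), 3))
```

Even at only 50% sustained utilization, a 40 TOPS NPU gets through those 4 trillion operations in a fraction of a millisecond per step, which is why the threshold is treated as the floor for running useful language and vision models locally.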

TOPS matters for two practical reasons:

  1. Copilot+ PC certification requires 40+ TOPS - Microsoft established this threshold to ensure a laptop's NPU can run Windows AI features like Windows Recall, Cocreator in Paint, and Live Captions translation. A laptop with a 30 TOPS NPU cannot run these features at all.

  2. More TOPS means more complex AI models can run locally - A 50 TOPS NPU can run larger language models and more sophisticated image processing on-device than a 35 TOPS chip, without sending data to the cloud.

Current NPU Performance by Chip (April 2026)

| Chip | NPU Performance | Found In | Category |
| --- | --- | --- | --- |
| Apple A19 | ~38 TOPS (est.) | iPhone 17, iPhone 17 Plus, iPhone 17 Air | Mobile |
| Apple A19 Pro | ~40 TOPS (est.) | iPhone 17 Pro, iPhone 17 Pro Max | Mobile |
| Apple M4 | 38 TOPS | MacBook Air M4, iPad Pro M5 | Laptop/Tablet |
| Apple M4 Pro | 38 TOPS | MacBook Pro 14"/16" M4 Pro | Laptop |
| Qualcomm Snapdragon 8 Elite Gen 5 | ~50 TOPS (est.) | Samsung Galaxy S26 series | Mobile |
| Qualcomm Snapdragon X2 Elite | 45 TOPS | Surface Pro 11, Dell XPS 13, Lenovo Yoga | Laptop (Copilot+) |
| Qualcomm Snapdragon X2 Plus | 45 TOPS | Mid-range Copilot+ laptops | Laptop (Copilot+) |
| Google Tensor G5 | ~45 TOPS TPU (est.) | Pixel 10, Pixel 10 Pro | Mobile |
| Intel Core Ultra 200V | Up to 48 TOPS | Dell XPS 13, Asus Zenbook, Samsung Galaxy Book5 | Laptop (Copilot+) |
| AMD Ryzen AI 9 HX 370 | 50 TOPS | ASUS ProArt, Lenovo ThinkPad X1 Extreme | Laptop (Copilot+) |
| AMD Ryzen AI 300 series | 50 TOPS | ASUS ROG Zephyrus, various OEMs | Laptop (Copilot+) |

Note on Apple's approach: Apple does not officially publish TOPS figures for the Neural Engine. The A19 and A19 Pro estimates are derived from third-party benchmarks and extrapolation from the A18 Pro's performance data (35 TOPS). Both chips are built on TSMC's 3nm process, the same node used for Qualcomm's Snapdragon 8 Elite Gen 5. Apple integrates its Neural Engine more tightly with the CPU and GPU than competitors, so raw TOPS comparisons tend to understate Apple's real-world AI performance on Apple-optimized tasks.

Note on Tensor G5: Google's Tensor G5 (Pixel 10) features a TPU that is 60% larger than the Tensor G4's TPU in the Pixel 9. Google describes this as significantly more capable for on-device Gemini Nano inference. The ~45 TOPS figure is an estimate from independent analysis; Google does not publish official TOPS figures for Tensor chips.

What NPUs Actually Do on Your Devices

On Your Phone

  • Computational photography - Night mode, Smart HDR, Portrait mode depth mapping, and Live Photos processing all run on the NPU. Every photo you take on an iPhone 17, Pixel 10, or Galaxy S26 is processed by AI before it saves.

  • Real-time speech transcription - Live Captions on Android and live voicemail transcription on iPhone run on the NPU. No cloud upload required.

  • Face ID and biometric unlock - Facial recognition processing happens entirely on the NPU, in a secure enclave. Your face geometry never leaves your device.

  • On-device translation - iPhone 17 and Pixel 10 can both translate languages locally without an internet connection, enabled by NPU-resident language models. Samsung Galaxy S26's Live Translate runs entirely on the Snapdragon 8 Elite Gen 5's Hexagon NPU - call translation with zero cloud dependency.

  • Background noise removal - When you make a call in a noisy environment, the NPU filters out ambient sound in real time. This runs locally, not on a server.

  • On-device AI summaries - Apple Intelligence's Writing Tools, notification summaries, and email prioritization on iPhone 17 run on the A19/A19 Pro Neural Engine. Gemini Nano on Pixel 10 handles local inference for similar tasks via Tensor G5's TPU.

On Your Laptop

  • Background removal and blur in video calls - Zoom, Teams, and Google Meet offload background separation to the NPU when available. This frees the CPU for other tasks and extends battery life.

  • Eye contact correction - Apple's Center Stage keeps you framed in the shot, and Microsoft's Eye Contact feature adjusts your gaze so you appear to look directly at the camera even when you are reading notes. Both run continuously on the NPU.

  • Windows Recall - Microsoft's feature that takes periodic screenshots of your activity and makes them searchable with natural language requires the NPU to run the vision-language model locally. This is why 40+ TOPS is mandatory for Copilot+ PCs.

  • Real-time transcription and translation in calls - Teams Premium's real-time transcription runs on the NPU on Copilot+ PCs, reducing latency and working partially offline.

  • Local image generation - AMD and Intel have demonstrated running Stable Diffusion models entirely on-device on their 50 TOPS NPUs. Generating a 512x512 image takes under 10 seconds on a Ryzen AI 9 without GPU involvement.

  • Smart noise cancellation - Microphone noise suppression in professional communication tools like Krisp runs on the NPU, not the CPU, leaving CPU resources available for other tasks during calls.
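As an illustration of how software decides where work like this runs, here is a simplified backend-selection sketch. The provider names follow ONNX Runtime's naming convention ("QNNExecutionProvider" is its Qualcomm NPU backend), but the priority logic itself is an illustrative assumption, not any vendor's actual API:

```python
# Illustrative sketch: how an app might pick a compute backend for an AI
# effect, preferring the NPU when present. The provider names follow ONNX
# Runtime's convention, but this selection logic is a simplified assumption,
# not a real vendor API.

PREFERENCE = [
    "QNNExecutionProvider",  # Qualcomm Hexagon NPU
    "DmlExecutionProvider",  # DirectML (GPU on Windows)
    "CPUExecutionProvider",  # always-available fallback
]

def pick_backend(available):
    """Return the most efficient available backend, falling back to CPU."""
    for provider in PREFERENCE:
        if provider in available:
            return provider
    return "CPUExecutionProvider"

# On a Copilot+ laptop the NPU backend would be reported as available:
print(pick_backend(["CPUExecutionProvider", "QNNExecutionProvider"]))
# prints "QNNExecutionProvider"
```

This is why the same video-call effect can quietly cost more battery on an older machine: when no NPU backend is reported, the work silently falls back to the CPU or GPU.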

The Three Major Mobile NPU Architectures in 2026

Apple A19 Pro, Qualcomm Snapdragon 8 Elite Gen 5, and Google Tensor G5 represent three distinct approaches to mobile AI processing, all built on TSMC's 3nm node.

Apple's Neural Engine in the A19 Pro is the most vertically integrated: Apple designs the chip, the operating system, and every major application that uses it. Apple Intelligence's Writing Tools, Visual Intelligence, and Siri's personal context features are co-optimized with the specific Neural Engine architecture in ways that produce performance advantages the raw TOPS number does not capture.

Qualcomm's Hexagon NPU in the Snapdragon 8 Elite Gen 5 is the most widely deployed mobile AI chip: it powers Samsung Galaxy S26, OnePlus, and other Android flagships. The Hexagon NPU handles Samsung Galaxy AI's on-device Live Translate, Circle to Search preprocessing, and image enhancement features across the most diverse range of Android OEM implementations.

Google's Tensor G5 TPU takes a third approach: Google designs the chip specifically for Gemini Nano inference, meaning the TPU is purpose-built for the exact model architecture Google deploys on-device. The 60% TPU size increase from Tensor G4 to G5 directly enables faster local Gemini Nano responses on Pixel 10 without requiring cloud roundtrips for the tasks Gemini Nano handles.

Why NPUs Matter for Privacy

This is the NPU benefit that gets the least attention in marketing materials, but it is arguably the most important.

Without an NPU, AI features send your data to servers for processing. A voice assistant without an NPU uploads your audio to a data center, processes it remotely, and returns a result. Your voice is transmitted, potentially stored, and processed by systems outside your control.

With an NPU powerful enough to handle the same task locally, nothing leaves your device. Your audio is transcribed by a model running on your phone's Neural Engine. Your photos are enhanced by a model that never sees a server. Your messages are summarized by a process that runs in your pocket.

Apple has built its entire Apple Intelligence marketing around this distinction: most Apple Intelligence features run entirely on the A19/A19 Pro Neural Engine. Samsung Galaxy S26's on-device Live Translate means call content never reaches Samsung's servers. Google Tensor G5's expanded TPU handles more Gemini Nano inference locally than prior Tensor chips. The industry-wide pattern is consistent: NPU performance is the limiting factor for how much AI can stay private, and every major chip manufacturer is expanding NPU capacity accordingly.

Copilot+ PC Requirements Explained

Microsoft introduced the Copilot+ PC certification in mid-2024 with a strict hardware requirement: the device must have an NPU delivering 40 TOPS or more. This threshold gates access to several Windows AI features:

  • Windows Recall - Semantic search of your screen history

  • Cocreator in Paint - Real-time AI image generation alongside drawing

  • Live Captions with translation - Real-time translation of any audio playing on your PC

  • Auto Super Resolution - AI upscaling of games and video using NPU instead of GPU

  • Windows Studio Effects - Background blur, eye contact, voice focus in video calls

Laptops with Intel Core Ultra 200V (up to 48 TOPS), AMD Ryzen AI 300 (50 TOPS), and Qualcomm Snapdragon X2 (45 TOPS) all qualify. Most laptop chips from before mid-2024 do not meet the threshold. The A19 Pro's estimated ~40 TOPS means Apple silicon is now in the same tier as Copilot+ certified chips for raw NPU performance, though Apple uses its own framework for on-device AI rather than Windows-specific features.

Do You Actually Need a High TOPS NPU?

If you are buying a phone or laptop in 2026, you will get an NPU regardless - every current mainstream chip includes one. The question is whether a higher TOPS number justifies spending more.

High TOPS matters if you:

  • Use Copilot+ features on Windows (requires 40+)

  • Run local AI models for image generation or document summarization

  • Use video call software heavily and want low-latency background effects

  • Want offline AI translation, transcription, or note-taking at speed - including Samsung Galaxy S26's on-device Live Translate

  • Want the fastest Gemini Nano responses on a Pixel phone (Tensor G5 vs G4 is a meaningful gap)

TOPS is less important if you:

  • Mostly use cloud-based AI tools (ChatGPT, Gemini via browser, Claude) - those run on servers regardless of your NPU

  • Do basic computing: web browsing, documents, video streaming

  • Use an iPhone with Apple Intelligence - Apple's Neural Engine integration is efficient enough that A19/A19 Pro handles the current feature set well, with Apple's software co-optimization closing the gap versus higher raw TOPS competitors

Frequently Asked Questions

Is an NPU the same as a GPU?

No. A GPU renders graphics and handles parallel visual computation. An NPU handles AI matrix math. They are distinct chips serving different purposes, though GPUs can run AI tasks less efficiently. Modern chips - Apple A19 Pro, Snapdragon 8 Elite Gen 5, Tensor G5 - include all three: CPU cores, GPU cores, and an NPU on the same die.

Can I add an NPU to my existing laptop?

No. The NPU is built into the processor (SoC) and cannot be added, upgraded, or replaced separately. If you want a more powerful NPU, you need a new device with a newer chip. There is no PCIe or USB NPU expansion available for consumer devices.

Does using the NPU drain battery faster?

The opposite. NPUs are designed specifically to be power-efficient. Running an AI task on the NPU uses dramatically less power than the same task on the CPU or GPU. Background removal in a 30-minute video call might add 2-3% battery drain via the NPU versus 8-10% via the CPU. NPUs exist precisely because battery life on mobile devices matters - and this is why Tensor G5's larger TPU enables more Gemini Nano inference without a corresponding battery penalty.

Does my older phone or laptop have an NPU?

Most flagship phones from 2018 onwards include some form of NPU. The iPhone 8 had the first Apple Neural Engine. Qualcomm Snapdragon 855 (2019) and later include Hexagon NPUs. Google Tensor chips (Pixel 6, 2021) include TPUs. Laptops are newer to this - most mainstream laptops had no dedicated NPU before 2023. Check your chip model against the TOPS table above to see where you stand.

Why does Apple not publish TOPS numbers?

Apple does not officially disclose TOPS figures, partly because raw TOPS comparisons favor chips with higher headline numbers. Apple argues - with supporting evidence from third-party benchmarks on Apple-optimized tasks - that its Neural Engine architecture and tight integration with the CPU and GPU produce better real-world AI performance than raw TOPS suggest. The A19 and A19 Pro (TSMC 3nm) estimates of ~38-40 TOPS come from independent benchmarks, not Apple. For Apple Intelligence tasks specifically, the co-optimization advantage over higher-TOPS chips from AMD and Qualcomm is consistent across independent testing.

How does Tensor G5's 60% larger TPU translate to real-world performance?

Google's Tensor G5 TPU improvement enables Pixel 10 to run larger Gemini Nano model variants on-device, handle more Gemini Nano queries locally before escalating to the cloud, and process on-device tasks with lower latency than Tensor G4. Practically, this means Gemini responses for local queries (photo analysis, summarization, offline translation) are faster on Pixel 10 than on Pixel 9, and more tasks that previously required a network roundtrip now complete entirely on-device - with the associated privacy benefit.