[Industry News] Intel open-sources its NPU Acceleration Library
流浪的加菲
deepin
2024-03-04 10:00
Author

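Intel has open-sourced its NPU Acceleration Library (intel-npu-acceleration-library) under the Apache-2.0 license. It is a Python library that uses the Intel Neural Processing Unit (NPU) to run high-speed computations on compatible hardware, with the aim of making applications more efficient. In the project's own words: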

The Intel® NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.

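The library is under active development, and the project team is working on a set of key features intended to significantly improve its performance. Already implemented are 8-bit quantization, Float16 support, torch.compile support, and static shape inference. The full roadmap, covering both those items and the features still planned, is: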

  • 8-bit quantization
  • 4-bit Quantization and GPTQ
  • NPU-Native mixed precision inference
  • Float16 support
  • BFloat16 (Brain Floating Point Format)
  • torch.compile support
  • LLM MLP horizontal fusion implementation
  • Static shape inference
  • MHA NPU inference
  • NPU/GPU hetero compute
  • Paper
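
For anyone wondering what the "8-bit quantization" item means in practice, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization. This only illustrates the general technique; it is not taken from the library, and the function names quantize_int8 and dequantize_int8 are made up for this post.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization: float array -> (int8 array, scale)."""
    max_abs = float(np.max(np.abs(weights)))
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximate float32 tensor from the int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.uniform(-1, 1, (4, 8)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# The round-trip error is bounded by about half a quantization step
max_error = float(np.max(np.abs(w - w_hat)))
print(q.dtype, max_error <= scale / 2 + 1e-6)
```

The point is that weights are stored in a quarter of the space of float32 (half that of float16) at the cost of a small, bounded rounding error. How the library itself quantizes is not described in this post.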

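The Python code samples on the repository page also show how to run a single matrix multiplication on the NPU, compile a model for the NPU, and even run a Tiny-Llama model on the NPU.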

from intel_npu_acceleration_library.backend import MatMul
import numpy as np

inC, outC, batch = ... # Define your own values

# Create both inputs
X1 = np.random.uniform(-1, 1, (batch, inC)).astype(np.float16)
X2 = np.random.uniform(-1, 1, (outC, inC)).astype(np.float16)

# Build the matmul operation for these dimensions (profiling disabled)
mm = MatMul(inC, outC, batch, profile=False)

# Run it on the two inputs
result = mm.run(X1, X2)
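
If you don't have NPU hardware at hand, the shapes in the sample can still be checked with plain NumPy. The sketch below assumes, based only on the input shapes (batch, inC) and (outC, inC), that the operation contracts over inC, i.e. computes X1 times the transpose of X2. That is my reading, not something the post confirms, and the values 128, 64 and 32 are arbitrary placeholders.

```python
import numpy as np

# Arbitrary illustrative sizes; the original sample leaves these to the reader
inC, outC, batch = 128, 64, 32

rng = np.random.default_rng(0)
X1 = rng.uniform(-1, 1, (batch, inC)).astype(np.float16)
X2 = rng.uniform(-1, 1, (outC, inC)).astype(np.float16)

# Assumed semantics: contract over inC, so X2 is transposed.
# Accumulate in float32 to keep the CPU reference accurate.
reference = np.matmul(X1.astype(np.float32), X2.T.astype(np.float32))

print(reference.shape)  # (32, 64), i.e. (batch, outC)
```

With the library installed, the NPU result could then be compared against reference using np.allclose with a loose tolerance to allow for float16 rounding.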

Some useful links

  • Intel AI PC (link)
  • Intel Core Ultra Processor line (link)
  • AI Acceleration and NPU explained (video)
All Replies
neko
deepin
Ecological co-builder
2024-03-04 17:28
#1

Looks a lot more convenient than AMD's IPU.
