The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (PDF).

Patent No.:

US 12045307 B1

Date of Patent:

Jul. 23, 2024

Filed:

Oct. 30, 2020

Fine-grained per-vector scaling for neural network quantization

Applicant:

Nvidia Corporation, Santa Clara, CA (US);

Inventors:

Brucek Kurdo Khailany, Austin, TX (US);

Steve Haihang Dai, Union City, CA (US);

Rangharajan Venkatesan, San Jose, CA (US);

Haoxing Ren, Austin, TX (US);

Assignee:

NVIDIA Corporation, Santa Clara, CA (US);

Attorney:

Leydig, Voit & Mayer, Ltd.

Primary Examiner:

Matthew D Sandifer

Int. Cl.

CPC ...

G06F 17/16 (2006.01); G06F 5/01 (2006.01); G06F 7/544 (2006.01);

U.S. Cl.

CPC ...

G06F 17/16 (2013.01); G06F 5/01 (2013.01); G06F 7/5443 (2013.01);

Abstract

Today neural networks are used to enable autonomous vehicles and improve the quality of speech recognition, real-time language translation, and online search optimizations. However, operation of the neural networks for these applications consumes energy. Quantization of parameters used by the neural networks reduces the amount of memory needed to store the parameters while also reducing the power consumed during operation of the neural network. Matrix operations performed by the neural networks require many multiplication calculations, so reducing the number of bits that are multiplied reduces the energy that is consumed. Quantizing smaller sets of the parameters using a shared scale factor improves accuracy compared with quantizing larger sets of the parameters. Accuracy of the calculations may be maintained by quantizing and scaling the parameters using fine-grained per-vector scale factors. A vector includes one or more elements within a single dimension of a multi-dimensional matrix.
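The difference between coarse and fine-grained scale factors described in the abstract can be illustrated with a small sketch. It is not the method claimed in the patent: it assumes symmetric max-abs scaling, 4-bit signed integers, and a vector size of 4, and all function names (`scale_for`, `quantize_per_vector`, and so on) are hypothetical. One row of a matrix is quantized once with a single scale factor shared by the whole row and once with a separate scale factor per 4-element vector.

```python
def scale_for(values, bits):
    """Max-abs scale factor (an assumed scheme): maps the largest magnitude to qmax."""
    qmax = 2 ** (bits - 1) - 1
    amax = max(abs(v) for v in values)
    return amax / qmax if amax > 0 else 1.0


def quantize(values, scale, bits):
    """Round each value to a signed integer and clamp it to [-qmax, qmax]."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def quantize_per_vector(row, vector_size, bits=4):
    """Split a row into vectors; return one (scale, integers) pair per vector."""
    chunks = []
    for start in range(0, len(row), vector_size):
        vec = row[start:start + vector_size]
        scale = scale_for(vec, bits)
        chunks.append((scale, quantize(vec, scale, bits)))
    return chunks


def dequantize(chunks):
    """Reconstruct approximate real values from (scale, integers) pairs."""
    return [scale * q for scale, ints in chunks for q in ints]


def mean_abs_error(original, approx):
    return sum(abs(a - b) for a, b in zip(original, approx)) / len(original)


# Illustrative data: four small-magnitude elements followed by four large ones.
row = [0.01, -0.02, 0.03, 0.015, 4.0, -3.5, 2.0, 1.0]

# Coarse: one scale factor shared by the entire row.
coarse_chunks = quantize_per_vector(row, vector_size=len(row))
# Fine-grained: one scale factor per 4-element vector.
fine_chunks = quantize_per_vector(row, vector_size=4)

coarse_error = mean_abs_error(row, dequantize(coarse_chunks))
fine_error = mean_abs_error(row, dequantize(fine_chunks))

print("coarse (per-row) mean abs error:", coarse_error)
print("fine (per-vector) mean abs error:", fine_error)
```

With the shared scale factor, the large elements set the scale and the small elements all round to zero. With per-vector scale factors the small elements get their own scale, so the reconstruction error for this row is lower, at the cost of storing one extra scale factor.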