The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jan. 14, 2025

Filed:
May 24, 2023

Applicant:

SambaNova Systems, Inc., Palo Alto, CA (US);

Inventor:

Maulik Desai, Cedar Park, TX (US);

Assignee:

SambaNova Systems, Inc., Palo Alto, CA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 15/80 (2006.01); G06F 15/78 (2006.01); G06F 15/82 (2006.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01);
U.S. Cl.
CPC ...
G06F 15/825 (2013.01); G06F 15/7867 (2013.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01);
Abstract

As general matrix multiply (GEMM) bottlenecks are ameliorated by tensor parallelism that is distributed to several processors, layer normalization (LN) surfaces as a latent bottleneck as it is not amenable to distribution. LN performance is linear to embedding size, which is extremely large in some AI models. Moreover, aggressive tiling prevents the use of internal pipelining. The disclosed implementation addresses this issue, composing LN from simpler operations and this composition is amenable to pipelining, facilitating efficient implementation of large AI models (e.g., GPTs). In both forward and backward propagation, the pipeline is stretched longer with improved balance across stages. This strategy improves throughput for larger batch-sizes as the workload benefits from pipelining operations for better performance. Furthermore, avoiding stochastic rounding further improves performance. In addition, LayerNorm checkpoints facilitate efficient computation of gradients during backward propagation.
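
To make the idea of "composing LN from simpler operations" concrete, here is a minimal sketch in plain Python. It is not the patented implementation: the abstract does not say which operations are used, how stages are divided, or what a LayerNorm checkpoint contains. The stage split, the function names `layer_norm_forward` and `layer_norm_backward`, and the choice to checkpoint the normalized values and the reciprocal standard deviation are all assumptions made for illustration. The code runs sequentially; the stage comments only mark where a hardware pipeline could plausibly be cut.

```python
import math


def layer_norm_forward(x, gamma, beta, eps=1e-5):
    """LayerNorm over one embedding vector as a chain of simple stages.

    Returns the output and a checkpoint (x_hat, rstd) for the backward pass.
    The stage boundaries and checkpoint contents are illustrative assumptions.
    """
    n = len(x)
    mean = sum(x) / n                          # stage 1: mean reduction
    centered = [v - mean for v in x]           # stage 2: subtract the mean
    var = sum(c * c for c in centered) / n     # stage 3: square, then mean reduction
    rstd = 1.0 / math.sqrt(var + eps)          # stage 4: reciprocal square root
    x_hat = [c * rstd for c in centered]       # stage 5: normalize
    y = [g * h + b for g, h, b in zip(gamma, x_hat, beta)]  # stage 6: scale and shift
    return y, (x_hat, rstd)


def layer_norm_backward(dy, gamma, checkpoint):
    """Gradients of LayerNorm, reusing the forward checkpoint.

    Because x_hat and rstd were saved, the mean and variance are not recomputed.
    """
    x_hat, rstd = checkpoint
    n = len(dy)
    dgamma = [d * h for d, h in zip(dy, x_hat)]   # gradient w.r.t. the scale
    dbeta = list(dy)                              # gradient w.r.t. the shift
    dx_hat = [d * g for d, g in zip(dy, gamma)]   # gradient w.r.t. normalized values
    m1 = sum(dx_hat) / n                          # reduction: mean of dx_hat
    m2 = sum(d * h for d, h in zip(dx_hat, x_hat)) / n  # reduction: mean of dx_hat * x_hat
    dx = [rstd * (d - m1 - h * m2) for d, h in zip(dx_hat, x_hat)]
    return dx, dgamma, dbeta


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 7.0]
    gamma = [1.0, 1.0, 1.0, 1.0]
    beta = [0.0, 0.0, 0.0, 0.0]
    y, ckpt = layer_norm_forward(x, gamma, beta)
    dx, dgamma, dbeta = layer_norm_backward([0.3, -0.1, 0.2, 0.5], gamma, ckpt)
    print(y)
    print(dx)
```

In this sketch each stage is either an elementwise map or a single reduction over the embedding dimension, which is the kind of structure that can be streamed from one stage to the next. The two reductions in the backward pass depend only on the checkpointed values and the incoming gradient.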

