The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Mar. 05, 2024

Filed:

Sep. 19, 2022

Applicant:

FriendliAI Inc., Seoul, KR;

Inventors:

Gyeongin Yu, Seoul, KR;

Geon-Woo Kim, Seoul, KR;

Joo Seong Jeong, Seoul, KR;

Soojeong Kim, Seoul, KR;

Byung-Gon Chun, Seoul, KR;

Assignee:

FRIENDLIAI INC., Seoul, KR;

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 16/2455 (2019.01); G06F 40/284 (2020.01); G06N 20/10 (2019.01);
U.S. Cl.
CPC ...
G06N 20/10 (2019.01); G06F 16/2455 (2019.01); G06F 40/284 (2020.01);
Abstract

An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
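
The idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes a single attention head, no masking, no cached internal state, and plain Python lists instead of accelerator tensors. The names `matmul`, `attention`, `selective_batch_layer`, and `per_request_layer` are hypothetical and chosen only for illustration. Token-wise projections are batched by concatenating the tokens of all requests into one matrix with no padding, while attention is computed separately for each request.

```python
import math

def matmul(x, w):
    """Multiply a (tokens x d_in) matrix by a (d_in x d_out) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Single-head scaled dot-product attention for ONE request (no mask)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        out.append([sum(w * v_row[c] for w, v_row in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

def selective_batch_layer(requests, wq, wk, wv):
    """Batch the projections across requests; run attention per request.

    `requests` is a list of (length_i x d) token matrices whose lengths differ.
    """
    lengths = [len(r) for r in requests]
    # Batched step: stack every token of every request into one matrix.
    # No padding is needed because the projections act on each token alone.
    flat = [token for r in requests for token in r]
    q_all, k_all, v_all = matmul(flat, wq), matmul(flat, wk), matmul(flat, wv)

    # Individual step: split the rows back by request and attend per request,
    # so tokens of one request never attend to tokens of another.
    outputs, start = [], 0
    for n in lengths:
        end = start + n
        outputs.append(attention(q_all[start:end], k_all[start:end], v_all[start:end]))
        start = end
    return outputs

def per_request_layer(request, wq, wk, wv):
    """Reference path: process a single request with no batching at all."""
    return attention(matmul(request, wq), matmul(request, wk), matmul(request, wv))
```

Under these assumptions, calling `selective_batch_layer` on requests of lengths 1, 3, and 2 gives the same per-request outputs as calling `per_request_layer` on each request separately, without padding any request to a common length.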

