The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jun. 04, 2024
Filed:
Nov. 17, 2021
Applicant:
Kwai Inc., Palo Alto, CA (US);
Inventors:
Zhendong Wang, Plano, TX (US);
Yongxiong Ren, San Jose, CA (US);
Yang Liu, San Jose, CA (US);
Lingzhi Liu, San Jose, CA (US);
Assignee:
BEIJING TRANSTREAMS TECHNOLOGY CO. LTD., Beijing, CN;
Abstract
A method and an apparatus for length-aware local tiling in a sparse attention module in a transformer in heterogeneous devices are provided. In the method, a heterogeneous device including one or more GPUs: divides a transformed sparsity mask into a plurality of first tiles and obtains one or more effective first tiles from the plurality of first tiles, where each effective first tile includes at least one non-zero element; loads the one or more effective first tiles into a shared memory in the one or more GPUs and loads a plurality of elements in a first matrix corresponding to the one or more effective first tiles into the shared memory; and performs multiplication by a first sampled dense-dense matrix multiplication (SDDMM) kernel in the sparse attention module in the transformer by fetching the one or more effective first tiles and the plurality of elements from the shared memory.
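The tiling idea in the abstract can be illustrated with a small CPU-only sketch. This is not the patented GPU kernel: it does not model shared memory or the mask transformation, and the function names (`effective_tiles`, `tiled_sddmm`), the square tile size, and the choice of `q` and `k` as the dense operands are all assumptions made for illustration. It only shows how skipping tiles with no non-zero mask element restricts an SDDMM to the "effective" tiles.

```python
# Illustrative CPU sketch; names, tile shape, and operands are assumptions,
# not taken from the patent.

def effective_tiles(mask, tile):
    """Return (row0, col0) origins of tiles containing at least one non-zero."""
    rows, cols = len(mask), len(mask[0])
    origins = []
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            if any(mask[r][c] != 0
                   for r in range(r0, min(r0 + tile, rows))
                   for c in range(c0, min(c0 + tile, cols))):
                origins.append((r0, c0))
    return origins


def tiled_sddmm(mask, q, k, tile):
    """Compute mask * (q @ k^T) element-wise, visiting only effective tiles."""
    rows, cols = len(mask), len(mask[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r0, c0 in effective_tiles(mask, tile):
        for r in range(r0, min(r0 + tile, rows)):
            for c in range(c0, min(c0 + tile, cols)):
                if mask[r][c] != 0:
                    # Dot product of row r of q with row c of k.
                    dot = sum(a * b for a, b in zip(q[r], k[c]))
                    out[r][c] = mask[r][c] * dot
    return out


# Hypothetical 4x4 sparsity mask and 4x2 dense operands.
mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
q = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

tiles = effective_tiles(mask, 2)
result = tiled_sddmm(mask, q, k, 2)
```

With a tile size of 2, only two of the four tiles in this mask contain a non-zero element, so the other two are never visited. On a GPU, the corresponding saving would be in what gets loaded into shared memory and multiplied.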