The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Sep. 15, 2020
Filed:
Sep. 29, 2018
Applicant:
Intel Corporation, Santa Clara, CA (US);
Inventors:
Jonathan Pearce, Hillsboro, OR (US);
David Sheffield, Portland, OR (US);
Srikanth Srinivasan, Portland, OR (US);
Jeffrey Cook, Portland, OR (US);
Deborah Marr, Portland, OR (US);
Abhijit Davare, Hillsboro, OR (US);
Asit Mishra, Hillsboro, OR (US);
Steven Burns, Portland, OR (US);
Desmond Kirkpatrick, Portland, OR (US);
Andrey Ayupov, Santa Clara, CA (US);
Anton Alexandrovich Sorokin, Portland, OR (US);
Eriko Nurvitadhi, Hillsboro, OR (US);
Assignee:
Intel Corporation, Santa Clara, CA (US);
Abstract
An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule a plurality of matrix operations responsive to a tensor matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, each lane comprising: first, second, and third tile registers to store blocks of a first matrix (A), second matrix (B), and third matrix (C), respectively; at least one tensor arithmetic logic unit (TALU) to multiply a block of elements of the first matrix with a block of elements of the second matrix to generate a product and to accumulate the product with a block of elements of the third matrix, wherein each lane is to multiply one or more different blocks of the first and second matrix and to accumulate the resulting one or more products with one or more different blocks of the third matrix; and broadcast circuitry to broadcast one or more invariant matrix blocks to different tile registers within a lane and/or different tile registers across different lanes.
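The data flow in the abstract can be loosely modeled in software. The sketch below is only an analogy, not the claimed hardware. It assumes square matrices whose size is a multiple of the tile size, and every name in it (`get_block`, `talu_mac`, `tensor_matmul`, `num_lanes`) is hypothetical. Each "lane" multiplies a different block of A by a shared block of B and accumulates into a different block of C. The shared B block stands in for the invariant block that the broadcast circuitry would deliver to several tile registers.

```python
# Software analogy of the abstract's data flow. All names are hypothetical
# and chosen for illustration; none come from the patent itself.

def get_block(M, r, c, t):
    """Copy the t x t block of M whose top-left corner is (r, c)."""
    return [row[c:c + t] for row in M[r:r + t]]


def talu_mac(a_blk, b_blk, c_blk):
    """Model one TALU step: c_blk += a_blk @ b_blk, updating c_blk in place."""
    t = len(a_blk)
    for i in range(t):
        for j in range(t):
            c_blk[i][j] += sum(a_blk[i][k] * b_blk[k][j] for k in range(t))


def tensor_matmul(A, B, C, t, num_lanes):
    """Return C + A @ B for n x n matrices, assuming n is a multiple of t."""
    n = len(A)
    out = [row[:] for row in C]          # accumulate into a copy of C
    row_starts = list(range(0, n, t))    # top row of each block row

    for j in range(0, n, t):             # block column of B and C
        for k in range(0, n, t):         # shared (inner) block index
            # This B block is the same for every lane in the group, so it is
            # fetched once and reused: the stand-in for "broadcast".
            b_blk = get_block(B, k, j, t)

            # Hand out block rows in groups of num_lanes. In hardware the
            # lanes of a group would run in parallel; here they run in turn.
            for start in range(0, len(row_starts), num_lanes):
                for i in row_starts[start:start + num_lanes]:  # one per lane
                    a_blk = get_block(A, i, k, t)    # "first tile register"
                    c_blk = get_block(out, i, j, t)  # "third tile register"
                    talu_mac(a_blk, b_blk, c_blk)
                    for r in range(t):               # write the C tile back
                        out[i + r][j:j + t] = c_blk[r]
    return out


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 0], [0, 1]]
    print(tensor_matmul(A, B, C, t=1, num_lanes=2))
```

Sharing the B block across lanes is one plausible reading of "invariant matrix blocks"; the abstract also allows broadcasting to different tile registers within a single lane, which this sketch does not model.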