The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The badge also contains a link to the full patent document in Adobe Acrobat (PDF) format, which can be used to download or print the patent.

Date of Patent: Jul. 15, 2025
Filed: May 23, 2023
Applicant: Intel Corporation, Santa Clara, CA (US)
Inventors:
Abhishek R. Appu, El Dorado Hills, CA (US);
Prasoonkumar Surti, Folsom, CA (US);
Jill Boyce, Portland, OR (US);
Subramaniam Maiyuran, Gold River, CA (US);
Michael Apodaca, Folsom, CA (US);
Adam T. Lake, Portland, OR (US);
James Holland, Folsom, CA (US);
Vasanth Ranganathan, El Dorado Hills, CA (US);
Altug Koker, El Dorado Hills, CA (US);
Lidong Xu, Beijing (CN);
Nikos Kaburlasos, Folsom, CA (US)
Assignee: Intel Corporation, Santa Clara, CA (US)
Attorney:
Primary Examiner:
Int. Cl.: G06T 9/00 (2006.01); G06N 3/045 (2023.01); G06T 15/00 (2011.01)
U.S. Cl. CPC: G06T 9/002 (2013.01); G06N 3/045 (2023.01); G06T 9/007 (2013.01); G06T 9/008 (2013.01); G06T 15/005 (2013.01)
Abstract

Embodiments described herein provide an instruction and associated logic that enable a processing resource including a tensor accelerator to perform optimized computation of sparse submatrix operations. One embodiment provides a parallel processor comprising a processing cluster coupled with a cache memory. The processing cluster includes a plurality of multiprocessors coupled with a data interconnect, where a multiprocessor of the plurality includes a tensor core configured to: load tensor data and associated metadata from the cache memory, the metadata indicating a first numerical transform applied to the tensor data; perform an inverse of the first numerical transform; perform a tensor operation on the tensor data after the inverse transform is applied; and write the output of the tensor operation to a memory coupled with the processing cluster.
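The abstract describes a hardware pipeline: a tensor core loads tensor data together with metadata identifying a numerical transform that was applied to the data, undoes that transform, runs the tensor operation, and writes the result to memory. As a rough software analogy only, the NumPy sketch below steps through that same sequence; the transform tags, function names, scale factor, and the choice of a matrix multiply as the tensor operation are illustrative assumptions, not details taken from the patent.

```python
# Minimal software sketch of the data flow the abstract describes for a tensor
# core: load tensor data plus metadata naming a numerical transform that was
# applied, undo that transform, run the tensor operation, and write the result.
# All names and tags here are hypothetical; the patent describes hardware, not
# this API.
import numpy as np

# Hypothetical metadata tags for transforms that may have been applied upstream.
TRANSFORM_NONE = 0
TRANSFORM_SCALE = 1   # data was divided by a scale factor before being stored


def inverse_transform(data: np.ndarray, transform_id: int, scale: float) -> np.ndarray:
    """Undo the numerical transform indicated by the metadata."""
    if transform_id == TRANSFORM_SCALE:
        return data * scale   # inverse of dividing by `scale`
    return data               # TRANSFORM_NONE: nothing to undo


def tensor_core_op(a: np.ndarray, b: np.ndarray,
                   meta_a: tuple[int, float], meta_b: tuple[int, float]) -> np.ndarray:
    """Apply the inverse transform to each operand, then perform the tensor operation."""
    a = inverse_transform(a, *meta_a)
    b = inverse_transform(b, *meta_b)
    return a @ b              # the tensor operation (a matrix multiply here)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((4, 8)) / 2.0   # stored in scaled form
    b = rng.standard_normal((8, 4))
    out = tensor_core_op(a, b, meta_a=(TRANSFORM_SCALE, 2.0), meta_b=(TRANSFORM_NONE, 1.0))
    print(out.shape)                        # printing stands in for the write to memory
```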

