The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent: Apr. 29, 2025

Filed: Dec. 28, 2023

Applicant: Intel Corporation, Santa Clara, CA (US);

Inventors:
Dipankar Das, Pune, IN;
Naveen K. Mellempudi, Bangalore, IN;
Mrinmay Dutta, Bangalore, IN;
Arun Kumar, Bangalore, IN;
Dheevatsa Mudigere, Bangalore, IN;
Abhisek Kundu, Bangalore, IN;

Assignee: Intel Corporation, Santa Clara, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 9/30 (2017.12); G06F 7/483 (2005.12); G06F 7/544 (2005.12); G06F 9/38 (2017.12); G06N 3/063 (2022.12);
U.S. Cl.
CPC ...
G06F 9/30014 (2012.12); G06F 7/483 (2012.12); G06F 7/5443 (2012.12); G06F 9/30036 (2012.12); G06F 9/30038 (2023.07); G06F 9/30145 (2012.12); G06F 9/3802 (2012.12); G06F 9/382 (2012.12); G06F 9/384 (2012.12); G06F 9/3887 (2012.12); G06F 9/3888 (2023.07); G06N 3/063 (2012.12); G06F 9/30065 (2012.12); G06F 2207/382 (2012.12);
Abstract

Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
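The per-lane behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function names (`unpack_elements`, `asymmetric_fma_lane`) are hypothetical, elements are treated as unsigned, the narrow second-source elements are assumed to be packed lowest bits first, and the accumulator is an unbounded Python integer, so overflow or saturation behavior is not modeled.

```python
def unpack_elements(packed: int, elem_bits: int, lane_bits: int) -> list[int]:
    """Split a lane-sized integer into unsigned fields, lowest bits first.

    The packing order is an assumption made for this sketch.
    """
    count = lane_bits // elem_bits          # how many elements fit in one lane
    mask = (1 << elem_bits) - 1
    return [(packed >> (i * elem_bits)) & mask for i in range(count)]


def asymmetric_fma_lane(acc: int, src1_elems: list[int], src2_packed: int,
                        lane_bits: int = 32, src1_bits: int = 8,
                        src2_bits: int = 2) -> int:
    """Model one SIMD lane: acc += sum(src1[i] * src2[i]).

    Width choices follow the abstract: lane width 16/32/64 bits,
    first-source width 4/8 bits, second-source width 1/2/4 bits.
    Unsigned arithmetic and an unbounded accumulator are assumptions.
    """
    if lane_bits not in (16, 32, 64):
        raise ValueError("lane width must be 16, 32, or 64 bits")
    if src1_bits not in (4, 8):
        raise ValueError("first source width must be 4 or 8 bits")
    if src2_bits not in (1, 2, 4):
        raise ValueError("second source width must be 1, 2, or 4 bits")

    # Process as many second-source elements as fit into the lane.
    src2_elems = unpack_elements(src2_packed, src2_bits, lane_bits)
    if len(src1_elems) != len(src2_elems):
        raise ValueError("need one first-source element per second-source element")

    for a, b in zip(src1_elems, src2_elems):
        if not 0 <= a < (1 << src1_bits):
            raise ValueError("first-source element out of range")
        acc += a * b                        # multiply, then accumulate
    return acc


if __name__ == "__main__":
    # 16-bit lane holding four 4-bit elements: 0x4321 unpacks to [1, 2, 3, 4].
    result = asymmetric_fma_lane(5, [10, 20, 30, 40], 0x4321,
                                 lane_bits=16, src1_bits=8, src2_bits=4)
    print(result)
```

In this model a 16-bit lane with 4-bit second-source elements consumes four 8-bit first-source elements per lane, which is the asymmetry the abstract refers to: the first source supplies more bits per lane than the second.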

