The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in PDF format).

Date of Patent: Jan. 21, 2025

Filed: Aug. 09, 2023

Applicant: Deep Vision Inc., Los Altos, CA (US);

Inventors:
Wajahat Qadeer, Campbell, CA (US);
Rehan Hameed, Palo Alto, CA (US);
Satyanarayana Raju Uppalapati, Hyderabad, IN;
Abhilash Bharath Ghanore, Hyderabad, IN;
Kasanagottu Sai Ram, Hyderabad, IN;

Assignee: Deep Vision Inc., Los Altos, CA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/0495 (2023.01); G06F 17/18 (2006.01); G06N 3/04 (2023.01); G06N 3/048 (2023.01); G06N 3/045 (2023.01);
U.S. Cl.
CPC ...
G06N 3/0495 (2023.01); G06F 17/18 (2013.01); G06N 3/04 (2013.01); G06N 3/048 (2023.01); G06N 3/045 (2023.01);
Abstract

A method includes, for each floating-point layer in a set of floating-point layers: calculating a set of input activations and a set of output activations of the floating-point layer; converting the floating-point layer to a low-bit-width layer; calculating a set of low-bit-width output activations based on the set of input activations; and calculating a per-layer deviation statistic of the low-bit-width layer. The method also includes ordering the set of low-bit-width layers based on the per-layer deviation statistic of each low-bit-width layer. The method additionally includes, while a loss-of-accuracy threshold exceeds the accuracy of the quantized network: converting a floating-point layer represented by the low-bit-width layer to a high-bit-width layer; replacing the low-bit-width layer with the high-bit-width layer in the quantized network; updating the accuracy of the quantized network; and, in response to the accuracy of the quantized network exceeding the loss-of-accuracy threshold, returning the quantized network.
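
The procedure in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not fix the deviation statistic, the bit widths, the ordering direction, or the accuracy metric, so everything below is an assumption chosen for illustration. The sketch assumes plain linear layers with ReLU between them, symmetric uniform quantization, mean squared error as the per-layer deviation statistic, 4-bit and 8-bit as the low and high widths, and promotion of the most deviating layers first. The names `quantize`, `layer_forward`, `mse`, `mixed_precision_quantize`, and the caller-supplied `evaluate` function are all hypothetical.

```python
def quantize(weights, bits):
    """Simulated symmetric uniform quantization of a weight matrix (list of rows)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for row in weights for v in row)
    if max_abs == 0:
        return [row[:] for row in weights]
    scale = max_abs / qmax
    return [[round(v / scale) * scale for v in row] for row in weights]


def layer_forward(weights, batch, relu):
    """Apply one linear layer (in_dim x out_dim) to a batch of input vectors."""
    out = []
    for x in batch:
        y = [sum(xi * weights[i][j] for i, xi in enumerate(x))
             for j in range(len(weights[0]))]
        out.append([max(v, 0.0) for v in y] if relu else y)
    return out


def mse(a, b):
    """Mean squared error between two batches of vectors (assumed statistic)."""
    diffs = [(u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def mixed_precision_quantize(float_layers, calib_batch, evaluate, max_loss,
                             low_bits=4, high_bits=8):
    n = len(float_layers)
    quantized, deviations = [], []
    acts = calib_batch
    # Step 1: per layer, record float activations, quantize to low bit width,
    # and measure how far the low-bit-width outputs deviate on the same inputs.
    for idx, w in enumerate(float_layers):
        relu = idx < n - 1
        out_fp = layer_forward(w, acts, relu)
        w_low = quantize(w, low_bits)
        out_low = layer_forward(w_low, acts, relu)
        deviations.append(mse(out_fp, out_low))
        quantized.append(w_low)
        acts = out_fp  # the next layer sees the floating-point activations
    # Step 2: order layers by deviation, worst first (assumed direction).
    order = sorted(range(n), key=lambda i: deviations[i], reverse=True)
    # Step 3: promote layers to the high bit width until the accuracy loss
    # relative to the floating-point network is within max_loss.
    baseline = evaluate(float_layers)
    promoted = []
    for idx in order:
        if baseline - evaluate(quantized) <= max_loss:
            break
        quantized[idx] = quantize(float_layers[idx], high_bits)
        promoted.append(idx)
    return quantized, promoted, deviations
```

Here `evaluate` takes a list of weight matrices and returns a score where higher is better, and `max_loss` is the largest acceptable drop from the floating-point score. If every layer has been promoted and the loss is still too large, this sketch simply returns the all-high-bit-width network; a real system would need its own policy for that case.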

