The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jul. 16, 2024
Filed:
Sep. 24, 2019
Applicants:
Baidu USA LLC, Sunnyvale, CA (US);
Baidu.com Times Technology (Beijing) Co., Ltd., Beijing, CN;
Inventors:
Baopu Li, Santa Clara, CA (US);
Yanwen Fan, Beijing, CN;
Zhiyu Cheng, Sunnyvale, CA (US);
Yingze Bao, Mountain View, CA (US);
Assignees:
Baidu USA LLC, Sunnyvale, CA (US);
Baidu.com Times Technology (Beijing) Co., Ltd., Beijing, CN;
Abstract
Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. Presented herein are novel cursor-based adaptive quantization embodiments. In embodiments, a multiple-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit. In embodiments, the cursor-based DAS adaptively searches for a quantization bit for each layer. The DAS process may be accelerated via an alternative approximate optimization process, which is designed for a mixed quantization scheme of a DNN model. In embodiments, a new loss function is used in the search process to simultaneously optimize accuracy and parameter size of the model. In a quantization step, the two integers closest to the cursor may be adopted as the bits to quantize the DNN together, to reduce the quantization noise and avoid the local convergence problem.
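The quantization step described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the two integers closest to the continuous cursor are used together; it does not say how the two quantized results are combined or which quantizer is applied. The sketch below therefore assumes a symmetric uniform quantizer and a linear blend weighted by the cursor's distance to each integer. The function names quantize_uniform and cursor_quantize are hypothetical and do not come from the patent.

```python
import math


def quantize_uniform(weights, bits):
    """Quantize a list of floats to `bits` bits.

    Assumption: symmetric uniform quantization scaled by the largest
    absolute weight. The patent abstract does not specify the quantizer.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for this symmetric scheme")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1      # largest positive integer code
    scale = max_abs / levels          # step size between adjacent codes
    return [round(w / scale) * scale for w in weights]


def cursor_quantize(weights, cursor):
    """Quantize with the two integer bit widths nearest to `cursor`.

    Assumption: the two results are blended linearly, so a cursor of 3.25
    gives 75% of the 3-bit result plus 25% of the 4-bit result.
    """
    lower = math.floor(cursor)
    upper = lower + 1
    frac = cursor - lower             # 0.0 when the cursor is an integer
    q_low = quantize_uniform(weights, lower)
    q_high = quantize_uniform(weights, upper)
    return [(1.0 - frac) * lo + frac * hi for lo, hi in zip(q_low, q_high)]


if __name__ == "__main__":
    layer_weights = [0.5, -1.0, 0.25, 0.8]
    print(cursor_quantize(layer_weights, 3.25))
```

In a search process like the one the abstract describes, the cursor would be a learnable per-layer value. Blending the two neighboring bit widths keeps the output a continuous function of the cursor, which is what makes a gradient-based search over bit widths possible in this sketch.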