The patent badge is an abbreviated version of the USPTO patent document. It links to the full patent document.

The patent badge covers the following: patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document is provided in Adobe Acrobat (PDF) format.

Date of Patent:

May 08, 2018

Filed:

Jun. 16, 2016

Applicant:

Baidu USA LLC, Sunnyvale, CA (US);

Inventors:

Kan Chen, Los Angeles, CA (US);

Jiang Wang, Santa Clara, CA (US);

Wei Xu, Saratoga, CA (US);

Assignee:

Baidu USA LLC, Sunnyvale, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2006.01); G06K 9/66 (2006.01); G06F 17/27 (2006.01); G06K 9/46 (2006.01); G06K 9/62 (2006.01); G06T 1/60 (2006.01); G06N 3/02 (2006.01);
U.S. Cl.
CPC ...
G06K 9/66 (2013.01); G06F 17/2785 (2013.01); G06K 9/46 (2013.01); G06K 9/6256 (2013.01); G06K 9/6267 (2013.01); G06N 3/02 (2013.01); G06T 1/60 (2013.01); G06K 2009/4666 (2013.01);
Abstract

Described herein are systems and methods for generating and using attention-based deep learning architectures for the visual question answering (VQA) task, which automatically generate answers to questions about images (still or video). To generate correct answers, a model's attention must focus on the regions of the image that are relevant to the question, because different questions may ask about the attributes of different image regions. In embodiments, such question-guided attention is learned with a configurable convolutional neural network (ABC-CNN). Embodiments of the ABC-CNN models determine the attention maps by convolving the image feature map with configurable convolutional kernels determined by the question's semantics. In embodiments, the question-guided attention maps focus on the question-related regions and filter out noise in the unrelated regions.
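The mechanism in the abstract, a kernel derived from the question that is convolved with the image feature map to produce an attention map, can be illustrated with a minimal pure-Python sketch. This is not the patented implementation. The function names (`configure_kernel`, `attention_map`, `apply_attention`), the linear projection `proj`, the sigmoid squashing of kernel weights, and the softmax normalization over spatial positions are all assumptions made for illustration. The abstract itself states only that the kernels are determined by the question semantics and convolved with the image feature map.

```python
import math
import random


def configure_kernel(question_vec, proj, ksize, channels):
    """Map a question embedding to a ksize x ksize x channels kernel.

    Assumption: a linear projection followed by a sigmoid. `proj` must
    have ksize * ksize * channels rows, each of len(question_vec).
    """
    flat = [sum(w * q for w, q in zip(row, question_vec)) for row in proj]
    flat = [1.0 / (1.0 + math.exp(-v)) for v in flat]
    return [[[flat[(i * ksize + j) * channels + c] for c in range(channels)]
             for j in range(ksize)] for i in range(ksize)]


def attention_map(feature_map, kernel):
    """Slide the kernel over an H x W x C feature map (zero padding,
    no kernel flip, as is conventional in CNNs), then apply a softmax
    over all spatial positions so the map sums to 1.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    ksize = len(kernel)
    pad = ksize // 2
    scores = []
    for y in range(height):
        row = []
        for x in range(width):
            s = 0.0
            for i in range(ksize):
                for j in range(ksize):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < height and 0 <= xx < width:
                        for c in range(channels):
                            s += feature_map[yy][xx][c] * kernel[i][j][c]
            row.append(s)
        scores.append(row)
    peak = max(max(r) for r in scores)  # subtract max for numerical stability
    exps = [[math.exp(v - peak) for v in r] for r in scores]
    total = sum(sum(r) for r in exps)
    return [[v / total for v in r] for r in exps]


def apply_attention(feature_map, attn):
    """Weight each spatial cell of the feature map by its attention value."""
    return [[[f * a for f in cell] for cell, a in zip(frow, arow)]
            for frow, arow in zip(feature_map, attn)]


# Toy usage with random stand-ins for real image and question features.
random.seed(0)
H, W, C, Q, K = 4, 4, 3, 5, 3
image_features = [[[random.random() for _ in range(C)] for _ in range(W)]
                  for _ in range(H)]
question_embedding = [random.uniform(-1, 1) for _ in range(Q)]
projection = [[random.uniform(-1, 1) for _ in range(Q)]
              for _ in range(K * K * C)]

kernel = configure_kernel(question_embedding, projection, K, C)
attn = attention_map(image_features, kernel)
attended = apply_attention(image_features, attn)
```

In this sketch a different question embedding yields a different kernel, and therefore a different attention map over the same image features, which is the sense in which the attention is question-guided. A real system would learn the projection jointly with the rest of the network rather than sampling it at random.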

