The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Sep. 26, 2023

Filed:
Jan. 28, 2021

Applicant:

Beijing Baidu Netcom Science Technology Co., Ltd., Beijing, CN;

Inventors:

Xiameng Qin, Beijing, CN;

Yulin Li, Beijing, CN;

Qunyi Xie, Beijing, CN;

Ju Huang, Beijing, CN;

Junyu Han, Beijing, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/9032 (2019.01); G06F 16/583 (2019.01); G06F 16/532 (2019.01); G06F 40/279 (2020.01); G06N 3/04 (2023.01); G06N 3/088 (2023.01); G06F 18/213 (2023.01); G06F 18/25 (2023.01); G06V 10/25 (2022.01); G06V 10/764 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01); G06V 10/44 (2022.01);
U.S. Cl.
CPC ...
G06F 16/90332 (2019.01); G06F 16/532 (2019.01); G06F 16/583 (2019.01); G06F 18/213 (2023.01); G06F 18/253 (2023.01); G06F 40/279 (2020.01); G06N 3/04 (2013.01); G06N 3/088 (2013.01); G06V 10/25 (2022.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G06V 10/806 (2022.01); G06V 10/82 (2022.01); G06V 2201/07 (2022.01);
Abstract

The present disclosure provides a method for visual question answering, which relates to the fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
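The steps listed in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the abstract does not say how features are extracted, how nodes are updated, or how fusion is performed. Everything below is an assumption made for illustration, including the function names, the use of box-center offsets as the Edge Feature, the distance-weighted neighbour averaging, the bag-of-words question feature, fusion by concatenation, and the hand-picked, untrained answer weights.

```python
import math

def build_visual_graph(boxes, region_features):
    """Build a toy Visual Graph (hypothetical helper).
    Node Feature: one vector per detected region.
    Edge Feature: offset between box centers (an assumed choice)."""
    nodes = [list(f) for f in region_features]
    centers = [(x + w / 2, y + h / 2) for (x, y, w, h) in boxes]
    edges = {}
    for i, (cxi, cyi) in enumerate(centers):
        for j, (cxj, cyj) in enumerate(centers):
            if i != j:
                edges[(i, j)] = [cxj - cxi, cyj - cyi]
    return nodes, edges

def update_nodes(nodes, edges):
    """Update each Node Feature using the Node and Edge Features:
    add a distance-weighted average of the other nodes (assumed rule)."""
    updated = []
    for i, node in enumerate(nodes):
        weights, messages = [], []
        for j, other in enumerate(nodes):
            if i != j:
                weights.append(math.exp(-math.hypot(*edges[(i, j)])))
                messages.append(other)
        if weights:
            total = sum(weights)
            agg = [sum(w * m[k] for w, m in zip(weights, messages)) / total
                   for k in range(len(node))]
        else:
            agg = [0.0] * len(node)  # a single node has no neighbours
        updated.append([a + b for a, b in zip(node, agg)])
    return updated

def question_feature(question, vocab):
    """Bag-of-words question feature over a small vocabulary (assumed)."""
    tokens = question.lower().strip("?").split()
    return [float(tokens.count(word)) for word in vocab]

def fuse(nodes, q_feat):
    """Fuse graph and question: mean-pool nodes, then concatenate (assumed)."""
    dim = len(nodes[0])
    pooled = [sum(n[k] for n in nodes) / len(nodes) for k in range(dim)]
    return pooled + q_feat

def predict_answer(fused, answers, weights):
    """Score each candidate answer with a linear layer and take the argmax."""
    scores = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return answers[scores.index(max(scores))]

if __name__ == "__main__":
    boxes = [(0, 0, 1, 1), (1, 0, 1, 1)]      # (x, y, width, height)
    feats = [[1.0, 0.0], [0.0, 1.0]]          # made-up region features
    nodes, edges = build_visual_graph(boxes, feats)
    nodes = update_nodes(nodes, edges)
    q = question_feature("What color is the sign?", ["what", "color", "sign"])
    fused = fuse(nodes, q)
    # Untrained, hand-picked weights purely for demonstration.
    weights = [[0.5, 0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.1, 0.1, 0.1]]
    print(predict_answer(fused, ["red", "blue"], weights))
```

In a real system each stage would be a learned component, for example a detector for region features and a trained classifier for answer prediction. The sketch only mirrors the order of operations named in the abstract.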

