The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Jan. 24, 2023

Filed:

Jul. 15, 2020

Applicant:

Salesforce.com, Inc., San Francisco, CA (US);

Inventors:

Yue Wang, Singapore, SG;

Chu Hong Hoi, Singapore, SG;

Shafiq Rayhan Joty, Singapore, SG;

Assignee:

Salesforce.com, Inc., San Francisco, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/30 (2020.01); G06F 21/36 (2013.01); G06F 40/35 (2020.01); G06N 3/08 (2006.01); G06F 40/284 (2020.01); G06K 9/62 (2022.01);
U.S. Cl.
CPC ...
G06F 40/35 (2020.01); G06F 40/284 (2020.01); G06K 9/6217 (2013.01); G06N 3/08 (2013.01);
Abstract

A visual dialogue model receives image input and text input that includes a dialogue history between the model and a current utterance by a human user. The model generates a unified contextualized representation using a transformer encoder network, in which the unified contextualized representation includes a token level encoding of the image input and text input. The model generates an encoded visual dialogue input from the unified contextualized representation using visual dialogue encoding layers. The encoded visual dialogue input includes a position level encoding and a segment type encoding. The model generates an answer prediction from the encoded visual dialogue input using a first self-attention mask associated with discriminative settings or a second self-attention mask associated with generative settings. Dense annotation fine tuning may be performed to increase accuracy of the answer prediction. The model provides the answer prediction as a response to the current utterance of the human user.
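
The abstract names two self-attention masks, one for discriminative settings and one for generative settings, but does not spell out their shapes. The sketch below is one plausible reading, not the patented implementation: it assumes the discriminative mask is fully bidirectional and the generative mask is a sequence-to-sequence mask in which answer tokens see the whole context but only earlier answer tokens. The function names and the `context_len` / `answer_len` parameters are hypothetical and introduced only for illustration.

```python
from typing import List

Mask = List[List[int]]  # mask[query][key] == 1 means "query may attend to key"


def bidirectional_mask(seq_len: int) -> Mask:
    """Assumed discriminative setting: every position attends to every position."""
    return [[1] * seq_len for _ in range(seq_len)]


def seq2seq_mask(context_len: int, answer_len: int) -> Mask:
    """Assumed generative setting.

    Context positions (image + dialogue history + question) attend only to
    the context. Answer positions attend to the whole context and to answer
    positions up to and including themselves (left-to-right decoding).
    """
    total = context_len + answer_len
    mask = [[0] * total for _ in range(total)]
    for query in range(total):
        for key in range(total):
            if key < context_len:
                mask[query][key] = 1  # context is visible to every position
            elif query >= context_len and key <= query:
                mask[query][key] = 1  # answer tokens: no peeking at later tokens
    return mask


if __name__ == "__main__":
    # Three context tokens followed by two answer tokens.
    for row in seq2seq_mask(context_len=3, answer_len=2):
        print(row)
```

Under these assumptions, the same transformer encoder can be reused for both settings, since only the mask passed to the attention layers changes between ranking candidate answers and generating an answer token by token.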

