The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 19, 2024

Filed:

Apr. 07, 2022
Applicant:

Baidu USA, LLC, Sunnyvale, CA (US);

Inventors:

Hongliang Fei, Sunnyvale, CA (US);

Tan Yu, Bellevue, WA (US);

Ping Li, Bellevue, WA (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 18/214 (2023.01); G06F 40/20 (2020.01); G06F 40/51 (2020.01); G06N 5/022 (2023.01);
U.S. Cl.
CPC ...
G06F 18/2148 (2023.01); G06F 40/20 (2020.01); G06F 40/51 (2020.01); G06N 5/022 (2013.01);
Abstract

Current pretrained vision-language models for cross-modal retrieval tasks in English depend on the availability of many annotated image-caption datasets with English text for pretraining. However, the texts are not necessarily in English. Although machine translation (MT) tools may be used to translate text to English, the performance largely relies on the MT's quality and may suffer from high latency in real-world applications. Embodiments herein address these problems by learning cross-lingual cross-modal representations for matching images and their relevant captions in multiple languages. Embodiments seamlessly combine cross-lingual pretraining objectives and cross-modal pretraining objectives in a unified framework to learn image and text representations in a joint embedding space from available English image-caption data, monolingual corpora, and parallel corpora. Embodiments are shown to achieve state-of-the-art performance in retrieval tasks on multimodal multilingual image-caption datasets.
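
To illustrate what retrieval in a joint embedding space means in practice, the following is a minimal sketch and not the patented method. It assumes that some trained encoders have already mapped an image and several captions, in different languages, to vectors of the same dimension. The function names `cosine` and `rank_captions` and all vectors are hypothetical toy values chosen for illustration; the abstract does not specify the similarity measure, so cosine similarity is an assumption here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_captions(image_vec, caption_vecs):
    """Return captions sorted from most to least similar to the image vector."""
    scored = [(cosine(image_vec, vec), caption) for caption, vec in caption_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [caption for _, caption in scored]


# Hypothetical toy embeddings; a real system would obtain these from trained encoders.
image_vec = [0.9, 0.1, 0.0]
caption_vecs = {
    "a dog on the beach": [0.8, 0.2, 0.1],   # English caption
    "ein Hund am Strand": [0.85, 0.15, 0.05],  # German caption, same meaning
    "a bowl of soup": [0.0, 0.2, 0.9],       # unrelated caption
}

ranking = rank_captions(image_vec, caption_vecs)
for caption in ranking:
    print(caption)
```

Because captions in every language share one space with the images, a non-English caption can be scored against an image directly, without first passing through a machine translation step.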

