The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jun. 11, 2024

Filed:
Dec. 23, 2021

Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Xun Luan, Sunnyvale, CA (US);

Aman Gupta, San Jose, CA (US);

Sirjan Kafle, San Diego, CA (US);

Ananth Sankar, Palo Alto, CA (US);

Di Wen, Sunnyvale, CA (US);

Saurabh Kataria, Newark, CA (US);

Ying Xuan, Sunnyvale, CA (US);

Sakshi Verma, Haryana, IN;

Bharat Kumar Jain, Hyderabad, IN;

Xue Xia, Los Angeles, CA (US);

Bhargavkumar Kanubhai Patel, Gujarat, IN;

Vipin Gupta, Bangalore, IN;

Nikita Gupta, Delhi, IN;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/22 (2006.01); G06F 40/40 (2020.01); G06N 3/04 (2023.01); G06V 30/19 (2022.01);
U.S. Cl.
CPC ...
G06F 40/40 (2020.01); G06N 3/04 (2013.01); G06V 30/19147 (2022.01);
Abstract

Described herein are systems and methods for generating an embedding—a learned representation—for an image. The embedding for the image is derived to capture visual aspects, as well as textual aspects, of the image. An encoder-decoder is trained to generate the visual representation of the image. An optical character recognition (OCR) algorithm is used to identify text/words in the image. From these words, an embedding is derived by performing an average pooling operation on pre-trained embeddings that map to the identified words. Finally, the embedding representing the visual aspects of the image is combined with the embedding representing the textual aspects of the image to generate a final embedding for the image.
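The textual half of the pipeline in the abstract (average pooling over pre-trained word embeddings, then combining with the visual embedding) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy `WORD_VECTORS` table, the hard-coded `ocr_words` standing in for real OCR output, the fixed `visual` vector standing in for the encoder-decoder output, and the use of concatenation as the combining step, which the abstract does not specify.

```python
from typing import Dict, List, Sequence


def average_pool(vectors: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Element-wise mean of the vectors; a zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def text_embedding(words: Sequence[str],
                   table: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Look up each recognized word and average-pool the vectors found.

    Words missing from the table are skipped (an assumption; the abstract
    does not say how out-of-vocabulary words are handled).
    """
    known = [table[w.lower()] for w in words if w.lower() in table]
    return average_pool(known, dim)


def combine(visual: Sequence[float], textual: Sequence[float]) -> List[float]:
    """Combine the two embeddings by concatenation (assumed, not from the source)."""
    return list(visual) + list(textual)


# Hypothetical toy data: a 2-dimensional pre-trained word-embedding table.
WORD_VECTORS = {"sale": [1.0, 0.0], "today": [0.0, 1.0]}

# Stand-in for the words an OCR step would return for an image.
ocr_words = ["SALE", "today", "unknownword"]

# Stand-in for the visual embedding produced by a trained encoder.
visual = [0.5, 0.25, 0.125]

final = combine(visual, text_embedding(ocr_words, WORD_VECTORS, dim=2))
print(final)  # [0.5, 0.25, 0.125, 0.5, 0.5]
```

In this sketch the final embedding keeps the visual and textual parts in separate coordinates; other combining choices (for example, a learned projection) would be equally consistent with the abstract's wording.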

