The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in PDF format).

Date of Patent: Aug. 19, 2025

Filed: Nov. 29, 2022

Applicant: Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:
Osaid Rehman Nasir, New Delhi, IN;
Bharat Kumar Jain, Hyderabad, IN;
Smitkumar Narotambhai Marvaniya, Bangalore, IN;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06V 30/24 (2022.01); G06F 40/40 (2020.01); G06V 30/14 (2022.01); G06V 30/19 (2022.01); G06V 30/262 (2022.01);
U.S. Cl.
CPC ...
G06V 30/2528 (2022.01); G06F 40/40 (2020.01); G06V 30/1448 (2022.01); G06V 30/19147 (2022.01); G06V 30/274 (2022.01);
Abstract

Technologies for language agnostic OCR extraction include identifying a word region of an image using optical character recognition, applying a language agnostic machine learning model to the word region, where the language agnostic machine learning model is trained on training data including a set of image-text pairs and a set of multilingual text translation pairs, receiving, from the language agnostic machine learning model, a word region embedding that is associated with the word region, searching a multilingual index for a text embedding that matches the word region embedding, receiving, from the multilingual index, text associated with the text embedding; and outputting at least one of the text or the text embedding to at least one downstream process, application, system, component, or network.
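The lookup step in the abstract (matching a word region embedding against text embeddings in a multilingual index) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how a "match" is scored, so cosine similarity is used as a stand-in; the three-dimensional vectors are made-up values rather than outputs of any trained model; and the names `cosine_similarity`, `search_index`, and `TOY_INDEX` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_index(index, query_embedding):
    """Return (text, text_embedding, score) for the index entry closest to the query.

    `index` is a list of (text, embedding) pairs. Cosine similarity is an
    assumed scoring rule; the abstract does not specify one.
    """
    if not index:
        raise ValueError("index is empty")
    scored = [(cosine_similarity(emb, query_embedding), text, emb) for text, emb in index]
    score, text, emb = max(scored, key=lambda item: item[0])
    return text, emb, score


# Hypothetical multilingual index with made-up embeddings.
TOY_INDEX = [
    ("hello", [0.9, 0.1, 0.0]),
    ("hola", [0.1, 0.9, 0.0]),
    ("bonjour", [0.0, 0.1, 0.9]),
]

# Stand-in for the embedding a trained model would return for one word region.
word_region_embedding = [0.2, 0.8, 0.1]

text, text_embedding, score = search_index(TOY_INDEX, word_region_embedding)
print(text, round(score, 3))
```

A production index would more likely use an approximate nearest-neighbor structure than a linear scan; the scan here only keeps the sketch short and self-contained.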

