The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Aug. 19, 2025
Filed:
Nov. 29, 2022
Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Inventors:
Osaid Rehman Nasir, New Delhi, IN;
Bharat Kumar Jain, Hyderabad, IN;
Smitkumar Narotambhai Marvaniya, Bangalore, IN;
Assignee:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Abstract
Technologies for language agnostic OCR extraction include: identifying a word region of an image using optical character recognition; applying a language agnostic machine learning model to the word region, where the language agnostic machine learning model is trained on training data including a set of image-text pairs and a set of multilingual text translation pairs; receiving, from the language agnostic machine learning model, a word region embedding that is associated with the word region; searching a multilingual index for a text embedding that matches the word region embedding; receiving, from the multilingual index, text associated with the text embedding; and outputting at least one of the text or the text embedding to at least one downstream process, application, system, component, or network.
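
The lookup steps in the abstract (embed a word region, search a multilingual index for a matching text embedding, return the associated text) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name here (`IndexEntry`, `search_index`, `extract_text`, `toy_embed`) is hypothetical. The trained language agnostic model is replaced by a toy placeholder, and cosine similarity with a threshold is an assumed matching rule that the abstract does not specify.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Vector = List[float]
# Hypothetical representation of a word region: rows of pixel intensities.
WordRegion = Sequence[Sequence[float]]


@dataclass
class IndexEntry:
    """One row of the (hypothetical) multilingual index."""
    text: str
    embedding: Vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_index(
    index: Sequence[IndexEntry],
    query: Sequence[float],
    min_similarity: float = 0.8,  # assumed threshold, not from the patent
) -> Optional[IndexEntry]:
    """Return the most similar entry at or above the threshold, else None."""
    best: Optional[IndexEntry] = None
    best_score = min_similarity
    for entry in index:
        score = cosine_similarity(entry.embedding, query)
        if score >= best_score:
            best, best_score = entry, score
    return best


def extract_text(
    region: WordRegion,
    embed: Callable[[WordRegion], Vector],
    index: Sequence[IndexEntry],
) -> Optional[IndexEntry]:
    """Embed one word region and look up its match in the index."""
    return search_index(index, embed(region))


def toy_embed(region: WordRegion) -> Vector:
    """Placeholder for the trained model: the mean intensity of each row."""
    return [sum(row) / len(row) for row in region]


# Toy index with hand-written embeddings (illustrative values only).
index = [
    IndexEntry("hello", [1.0, 0.0, 0.0]),
    IndexEntry("bonjour", [0.0, 1.0, 0.0]),
]
region = [[0.9, 1.0], [0.1, 0.0], [0.0, 0.1]]

match = extract_text(region, toy_embed, index)
print(match.text if match else "no match")  # prints "hello"
```

In a real system the linear scan would likely be replaced by an approximate nearest-neighbor index, and the returned text or embedding would be passed on to the downstream process the abstract mentions.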