The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Feb. 25, 2025
Filed:
Dec. 22, 2020
Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Inventors:
Ji Li, San Jose, CA (US);
Amit Srivastava, San Jose, CA (US);
Assignee:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Abstract
A data processing system for generating training data for a multilingual NLP model implements obtaining a corpus including first and second content items. The first content items are English-language textual content, and the second content items are translations of the first content items in one or more non-English target languages. The system further implements selecting a first content item from the first content items, generating a plurality of candidate labels for the first content item by analyzing the first content item with a plurality of first English-language NLP models, selecting a first label from the plurality of candidate labels, generating first training data by associating the first label with the first content item, generating second training data by associating the first label with a second content item of the second content items, and training a pretrained multilingual NLP model with the first training data and the second training data.
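The data-generation steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the first label is chosen from the candidates, so majority voting is assumed here, and the names `Labeler`, `select_label`, and `build_training_data`, along with the toy keyword labelers, are hypothetical stand-ins for real English-language NLP models. The final step, training the pretrained multilingual model on both sets, is left out.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an English-language labeler maps a text to a label.
Labeler = Callable[[str], str]
Example = Tuple[str, str]  # (text, label)


def select_label(candidates: List[str]) -> str:
    """Pick one label from the candidates.

    Majority vote is an assumption; the abstract only says a label is
    selected. Ties go to the label that was produced first.
    """
    return Counter(candidates).most_common(1)[0][0]


def build_training_data(
    corpus: List[Tuple[str, Dict[str, str]]],
    labelers: List[Labeler],
) -> Tuple[List[Example], List[Example]]:
    """Label each English item, then reuse that label for its translations."""
    english_examples: List[Example] = []
    translated_examples: List[Example] = []
    for english_text, translations in corpus:
        # One candidate label per English-language model.
        candidates = [labeler(english_text) for labeler in labelers]
        label = select_label(candidates)
        # First training data: label paired with the English item.
        english_examples.append((english_text, label))
        # Second training data: the same label paired with each translation.
        for translated_text in translations.values():
            translated_examples.append((translated_text, label))
    return english_examples, translated_examples


# Toy stand-ins for real models (assumptions for illustration only).
labelers: List[Labeler] = [
    lambda text: "positive" if "great" in text.lower() else "neutral",
    lambda text: "positive" if "great" in text.lower() or "good" in text.lower() else "neutral",
    lambda text: "neutral",
]

corpus = [
    (
        "The service was great",
        {"es": "El servicio fue excelente", "fr": "Le service était excellent"},
    ),
]

english_examples, translated_examples = build_training_data(corpus, labelers)
```

Because the translations inherit the label chosen for the English source, no non-English labeling model is needed; both lists could then be passed to whatever fine-tuning routine is used for the multilingual model.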