The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Oct. 03, 2017

Filed:

Sep. 24, 2015

Applicant:

Oracle International Corporation, Redwood Shores, CA (US);

Inventors:

Michael Louis Wick, Medford, MA (US);

Pallika Haridas Kanani, Westford, MA (US);

Adam Craig Pocock, Burlington, MA (US);

Assignee:

ORACLE INTERNATIONAL CORPORATION, Redwood Shores, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/28 (2006.01); G06F 17/27 (2006.01); G10L 25/00 (2013.01);
U.S. Cl.
CPC ...
G06F 17/2818 (2013.01); G06F 17/2735 (2013.01);
Abstract

A natural language processing ('NLP') manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that span a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied for data from a second of the target languages, the first and second languages being different.
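The transfer step described in the abstract (train on a first language, apply to a second, with a shared multilingual embedding as the feature space) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the hand-picked two-dimensional vectors stand in for a trained multilingual embedding, the nearest-centroid classifier stands in for the NLP model, and the function names are hypothetical.

```python
# Hypothetical toy "multilingual embedding": hand-picked vectors standing in
# for a trained embedding in which translations lie close together.
EMBEDDING = {
    # English (first language)
    "good": (1.0, 0.1), "great": (0.9, 0.2),
    "bad": (-1.0, 0.1), "awful": (-0.9, 0.0),
    # Spanish (second language)
    "bueno": (0.95, 0.15), "excelente": (0.9, 0.1),
    "malo": (-0.95, 0.05), "horrible": (-0.9, 0.1),
}


def featurize(text):
    """Average the embeddings of known words; unknown words are skipped."""
    vectors = [EMBEDDING[w] for w in text.lower().split() if w in EMBEDDING]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def train_centroids(labeled_docs):
    """Stand-in 'NLP model': one mean feature vector (centroid) per label."""
    grouped = {}
    for text, label in labeled_docs:
        grouped.setdefault(label, []).append(featurize(text))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, text):
    """Return the label whose centroid is closest to the text's features."""
    x = featurize(text)

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Train on labeled data in the first language only.
english_train = [
    ("good great", "pos"), ("great", "pos"),
    ("bad awful", "neg"), ("awful", "neg"),
]
model = train_centroids(english_train)

# Apply the trained model to data in the second language.
spanish_test = ["bueno excelente", "malo horrible"]
predictions = [predict(model, t) for t in spanish_test]
print(predictions)
```

The model never sees a labeled Spanish example. It can classify the Spanish inputs only because the assumed embedding places translations near each other, which is the property the multilingual embedding in the abstract is trained to provide.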

