The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Feb. 20, 2018

Filed:

Jan. 26, 2016

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Alan Akbik, Berlin, DE;

Laura Chiticariu, San Jose, CA (US);

Marina Danilevsky Hailpern, San Jose, CA (US);

Yunyao Li, San Jose, CA (US);

Huaiyu Zhu, Fremont, CA (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 17/27 (2006.01); G06F 17/30 (2006.01); G06F 17/28 (2006.01);
U.S. Cl.
CPC ...
G06F 17/2854 (2013.01); G06F 17/2785 (2013.01); G06F 17/289 (2013.01); G06F 17/2827 (2013.01);
Abstract

One embodiment provides a method for generating a natural language resource using a parallel corpus, the method including: utilizing at least one processor to execute computer code that performs the steps of: receiving, from a parallel corpus, natural language text in a source language and a corresponding translation of the natural language text in a target language, wherein the natural language text in the source language comprises linguistic annotations; projecting the linguistic annotations from the source language natural language text to the target language natural language text; applying one or more filters to remove at least one projected linguistic annotation from the target language natural language text that results in at least one error; selecting at least one target language natural language text having substantially complete linguistic annotations; training a machine learning model using the selected at least one target language natural language text and annotations; and adding, using the trained machine learning model, linguistic annotations to at least one target language natural language text having incomplete linguistic annotations. Other aspects are described and claimed.
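The steps named in the abstract (project, filter, select, train, complete) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: annotations are simple per-token labels, the word alignment is given as (source index, target index, score) triples, the filter is a hypothetical alignment-score cutoff, "substantially complete" is read as a coverage threshold, and the "machine learning model" is replaced by a trivial most-frequent-label lookup.

```python
from collections import Counter, defaultdict


def project(source_labels, alignment, target_len):
    """Copy each source token's label to its aligned target token.

    alignment: (source_index, target_index, score) triples (assumed format).
    Returns one (label, score) pair or None per target token.
    """
    projected = [None] * target_len
    for src_i, tgt_i, score in alignment:
        projected[tgt_i] = (source_labels[src_i], score)
    return projected


def filter_low_confidence(projected, min_score=0.5):
    """Hypothetical filter: drop labels projected over weak alignment links."""
    return [
        item[0] if item is not None and item[1] >= min_score else None
        for item in projected
    ]


def coverage(labels):
    """Fraction of tokens that carry a label."""
    if not labels:
        return 0.0
    return sum(label is not None for label in labels) / len(labels)


def train(sentences):
    """Stand-in for model training: most frequent label per token."""
    counts = defaultdict(Counter)
    for tokens, labels in sentences:
        for token, label in zip(tokens, labels):
            if label is not None:
                counts[token.lower()][label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def complete(model, tokens, labels):
    """Fill missing labels from the model; unknown tokens stay None."""
    return [
        label if label is not None else model.get(token.lower())
        for token, label in zip(tokens, labels)
    ]


def build_resource(corpus, threshold=0.9):
    """corpus: (source_labels, alignment, target_tokens) per sentence pair."""
    annotated = []
    for source_labels, alignment, target_tokens in corpus:
        projected = project(source_labels, alignment, len(target_tokens))
        annotated.append((target_tokens, filter_low_confidence(projected)))
    # Select sentences whose annotations are (nearly) complete and train on them.
    selected = [s for s in annotated if coverage(s[1]) >= threshold]
    model = train(selected)
    # Use the trained model to complete the remaining sentences.
    return [
        (tokens, labels if coverage(labels) >= threshold
         else complete(model, tokens, labels))
        for tokens, labels in annotated
    ]


corpus = [
    (["PRON", "VERB"], [(0, 0, 0.9), (1, 1, 0.95)], ["ich", "schlafe"]),
    (["PRON", "VERB", "ADV"], [(0, 0, 0.9), (1, 1, 0.2), (2, 2, 0.8)],
     ["ich", "schlafe", "gut"]),
]
resource = build_resource(corpus)
```

In this toy corpus the second sentence loses its label for "schlafe" to the score filter, falls below the coverage threshold, and is then completed by the lookup trained on the first sentence. A real system would use richer annotations and an actual learned model in place of the lookup.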

