The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The badge also links to the full patent document (PDF).

Date of Patent: Sep. 06, 2022

Filed: Jun. 03, 2020

Applicant: Beijing Xiaomi Mobile Software Co., Ltd., Beijing, CN;

Inventors: Jingwei Li, Beijing, CN; Yuhui Sun, Beijing, CN; Xiang Li, Beijing, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/58 (2020.01); G06F 40/51 (2020.01); G06F 40/44 (2020.01); G06F 40/263 (2020.01); G06N 3/08 (2006.01);
U.S. Cl.
CPC ...
G06F 40/58 (2020.01); G06F 40/263 (2020.01); G06F 40/44 (2020.01); G06F 40/51 (2020.01); G06N 3/08 (2013.01);
Abstract

A bilingual corpora screening method includes: acquiring multiple pairs of bilingual corpora, wherein each pair of the bilingual corpora comprises a source corpus and a target corpus; training a machine translation model based on the multiple pairs of bilingual corpora; obtaining a first feature of each pair of bilingual corpora based on the trained machine translation model; training a language model based on the multiple pairs of bilingual corpora; obtaining feature vectors of each pair of bilingual corpora and determining a second feature of each pair of bilingual corpora based on the trained language model; determining a quality value of each pair of bilingual corpora according to the first feature and the second feature of each pair of bilingual corpora; and screening each pair of bilingual corpora according to the quality value of each pair of bilingual corpora.
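The final two steps of the abstract (determining a quality value from the two features, then screening on it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the features are combined or how the screening decision is made, so the weighted sum, the `weight` and `threshold` parameters, and the names `screen_pairs`, `toy_first_feature`, and `toy_second_feature` are hypothetical. In particular, the two toy scorers are simple stand-ins for the trained machine translation model and the trained language model; they are not the models the patent describes.

```python
from typing import Callable, List, Tuple

# A pair of bilingual corpora: (source corpus, target corpus).
Pair = Tuple[str, str]
Scorer = Callable[[Pair], float]


def screen_pairs(
    pairs: List[Pair],
    first_feature: Scorer,
    second_feature: Scorer,
    weight: float = 0.5,      # assumed: relative weight of the first feature
    threshold: float = 0.6,   # assumed: minimum quality value to keep a pair
) -> List[Tuple[Pair, float]]:
    """Score each pair and keep those whose quality value reaches the threshold.

    The weighted sum is an illustrative choice, not the patent's formula.
    """
    kept = []
    for pair in pairs:
        quality = weight * first_feature(pair) + (1.0 - weight) * second_feature(pair)
        if quality >= threshold:
            kept.append((pair, quality))
    return kept


def toy_first_feature(pair: Pair) -> float:
    """Stand-in for a translation-model-based feature: token length ratio in [0, 1]."""
    src_len = len(pair[0].split())
    tgt_len = len(pair[1].split())
    return min(src_len, tgt_len) / max(src_len, tgt_len, 1)


def toy_second_feature(pair: Pair) -> float:
    """Stand-in for a language-model-based feature: penalize untranslated copies."""
    return 0.0 if pair[0].strip() == pair[1].strip() else 1.0


if __name__ == "__main__":
    candidates = [
        ("hello world", "bonjour le monde"),  # plausible translation
        ("hello world", "hello world"),       # source copied into target
        ("a b c d e f", "x"),                 # badly mismatched lengths
    ]
    for pair, quality in screen_pairs(candidates, toy_first_feature, toy_second_feature):
        print(f"{quality:.3f}  {pair}")
```

In a real pipeline the two scorers would be replaced by functions that query the trained models; passing them in as callables keeps the screening step independent of how the features are computed.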

