The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent: Nov. 07, 2017

Filed: Jun. 26, 2016

Applicant: Abbyy Development Llc, Moscow, RU;

Assignee: ABBYY DEVELOPMENT LLC, Moscow, RU;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2006.01); G06F 17/27 (2006.01); G06K 9/32 (2006.01); G06F 17/22 (2006.01); G06K 9/68 (2006.01); G06K 9/18 (2006.01);
U.S. Cl.
CPC ...
G06K 9/00456 (2013.01); G06F 17/2223 (2013.01); G06F 17/275 (2013.01); G06F 17/2775 (2013.01); G06K 9/18 (2013.01); G06K 9/3208 (2013.01); G06K 9/6821 (2013.01); G06K 2209/011 (2013.01);
Abstract

Disclosed are systems, computer-readable mediums, and methods for determining that text contains Chinese, Japanese, or Korean characters. One method includes determining a language hypothesis for each text fragment in a plurality of text fragments identified from connected components in a document image. The method further includes selecting a first subset of text fragments from the plurality of text fragments based on ratings for the language hypothesis of each text fragment in the plurality of text fragments. The method further includes verifying, by a processor, the language hypothesis of one or more text fragments in the first subset of text fragments based on optical character recognition of the one or more text fragments. The method further includes determining, by the processor, that Chinese, Japanese, or Korean (CJK) characters are present in the document image based on the verification of the language hypothesis of each of the one or more text fragments.
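The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `TextFragment` fields, the `"CJK"` / `"non-CJK"` labels, the `subset_size` parameter, the `recognize` callback standing in for an OCR engine, and the Unicode-range test are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List

# Unicode blocks used as a rough CJK test (an assumption of this sketch).
CJK_RANGES = [
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7A3),  # Hangul Syllables
]


@dataclass
class TextFragment:
    fragment_id: int
    hypothesis: str  # "CJK" or "non-CJK" (hypothetical labels)
    rating: float    # hypothetical confidence score, higher is better


def is_cjk_char(ch: str) -> bool:
    """Return True if the character falls in one of the ranges above."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in CJK_RANGES)


def ocr_confirms(fragment: TextFragment,
                 recognize: Callable[[TextFragment], str]) -> bool:
    """Verify a fragment's hypothesis against the text returned by OCR."""
    text = recognize(fragment)
    has_cjk = any(is_cjk_char(ch) for ch in text)
    return has_cjk == (fragment.hypothesis == "CJK")


def document_has_cjk(fragments: List[TextFragment],
                     recognize: Callable[[TextFragment], str],
                     subset_size: int = 3) -> bool:
    """Select the top-rated fragments, verify them, and decide on CJK."""
    # Select a subset of fragments based on hypothesis ratings.
    subset = sorted(fragments, key=lambda f: f.rating, reverse=True)[:subset_size]
    # CJK is reported only if a selected CJK hypothesis survives verification.
    return any(f.hypothesis == "CJK" and ocr_confirms(f, recognize)
               for f in subset)
```

Running OCR only on the highest-rated fragments keeps the expensive recognition step small; how the ratings are computed and how many fragments are verified are left open here because the abstract does not specify them.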

