The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, the date the patent was issued, the date it was filed, the title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Dec. 07, 2021

Filed:

Jun. 27, 2019

Applicant:

Evernote Corporation, Redwood City, CA (US);

Inventors:

Alexander Pashintsev, Cupertino, CA (US);

Boris Gorbatov, Sunnyvale, CA (US);

Eugene Livshitz, San Mateo, CA (US);

Vitaly Glazkov, Moscow, RU;

Assignee:

EVERNOTE CORPORATION, Redwood City, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2006.01); G06K 9/52 (2006.01); G06T 7/60 (2017.01); G06T 3/40 (2006.01);
U.S. Cl.
CPC ...
G06K 9/00456 (2013.01); G06K 9/00463 (2013.01); G06K 9/52 (2013.01); G06T 3/40 (2013.01); G06T 7/60 (2013.01);
Abstract

Methods and systems for training a neural network to distinguish between text documents and image documents are described. A corpus of text and image documents is obtained. A page of a text document is scanned by shifting a text window to a plurality of locations. In accordance with a determination that the text in the window at a respective location meets text line criteria, the text in the window is stored as a respective text snippet. A plurality of image windows are superimposed over at least one page of an image document. In accordance with a determination that the content of a respective image window meets image criteria, content of the image window is stored as a respective image snippet. The respective text snippet and the respective image snippet are provided to a classifier.
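The snippet-collection steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not define the text line criteria or the image criteria, so `meets_text_line_criteria` (a minimum count of non-space characters per line) and `meets_image_criteria` (a minimum pixel contrast) are hypothetical stand-ins. The page representations, window sizes, function names, and the 0/1 labels are likewise assumptions made for the example.

```python
from typing import List, Sequence, Tuple, Union

TextPage = List[str]          # assumption: a text page is a list of lines
ImagePage = List[List[int]]   # assumption: a grayscale page, rows of 0-255 pixels


def meets_text_line_criteria(window: Sequence[str], min_chars: int = 20) -> bool:
    """Hypothetical criterion: every line has at least min_chars non-space characters."""
    return all(len(line.replace(" ", "")) >= min_chars for line in window)


def collect_text_snippets(page: TextPage, window_lines: int = 2, step: int = 1) -> List[str]:
    """Shift a text window down the page and keep the windows that pass the criterion."""
    snippets = []
    for top in range(0, len(page) - window_lines + 1, step):
        window = page[top:top + window_lines]
        if meets_text_line_criteria(window):
            snippets.append("\n".join(window))
    return snippets


def meets_image_criteria(window: ImagePage, min_range: int = 32) -> bool:
    """Hypothetical criterion: the window has enough contrast (max minus min pixel)."""
    pixels = [p for row in window for p in row]
    return max(pixels) - min(pixels) >= min_range


def collect_image_snippets(image: ImagePage, size: int = 2) -> List[ImagePage]:
    """Superimpose a grid of square windows over the page and keep those that pass."""
    snippets = []
    height = len(image)
    width = len(image[0]) if image else 0
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            window = [row[left:left + size] for row in image[top:top + size]]
            if meets_image_criteria(window):
                snippets.append(window)
    return snippets


def build_training_set(
    text_pages: Sequence[TextPage], image_pages: Sequence[ImagePage]
) -> List[Tuple[Union[str, ImagePage], int]]:
    """Pair each snippet with a label (0 = text, 1 = image) for a downstream classifier."""
    examples: List[Tuple[Union[str, ImagePage], int]] = []
    for page in text_pages:
        examples.extend((s, 0) for s in collect_text_snippets(page))
    for image in image_pages:
        examples.extend((s, 1) for s in collect_image_snippets(image))
    return examples


# Toy inputs, invented for illustration only.
sample_text_page = [
    "This is a full line of body text here.",
    "Another sufficiently long line of text.",
    "short",
    "Yet another long enough line of text!!",
]
sample_image_page = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [10, 200, 7, 7],
    [10, 10, 7, 7],
]
training_set = build_training_set([sample_text_page], [sample_image_page])
```

In this toy run, only the first two-line text window and the one image window with visible contrast are kept. A real pipeline would render both kinds of snippet into a common input format before passing them to the classifier, a step the abstract leaves unspecified.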

