The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Sep. 19, 2017

Filed:
Aug. 31, 2015

Applicant:
Ancestry.com Operations Inc., Provo, UT (US);

Inventors:
Jack Reese, Lindon, UT (US);
Michael Murdock, Provo, UT (US);
Shawn Reid, Orem, UT (US);
Laryn Brown, Highland, UT (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2006.01); G06K 9/18 (2006.01); G06K 9/52 (2006.01); G06K 9/62 (2006.01);
U.S. Cl.
CPC ...
G06K 9/00463 (2013.01); G06K 9/00456 (2013.01); G06K 9/00859 (2013.01); G06K 9/18 (2013.01); G06K 9/52 (2013.01); G06K 9/6201 (2013.01); G06K 9/6215 (2013.01); G06K 9/6218 (2013.01);
Abstract

A handwriting recognition system converts word images on documents, such as document images of historical records, into computer searchable text. Word images (snippets) on the document are located, and have multiple word features identified. For each word image, a word feature vector is created representing multiple word features. Based on the similarity of word features (e.g., the distance between feature vectors), similar words are grouped together in clusters, and a centroid that has features most representative of words in the cluster is selected. A digitized text word is selected for each cluster based on review of a centroid in the cluster, and is assigned to all words in that cluster and is used as computer searchable text for those word images where they appear in documents. An analyst may review clusters to permit refinement of the parameters used for grouping words in clusters, including the adjustment of weights and other factors used for determining the distance between feature vectors.
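
The grouping step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a clustering algorithm, so the greedy threshold grouping, the weighted Euclidean distance, the choice of a medoid as the "centroid", and every name here (`weighted_distance`, `cluster_words`, `pick_centroid`, `assign_text`, `threshold`, the sample vectors) are assumptions made for illustration. The `weights` argument stands in for the adjustable weights an analyst might refine.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two word feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def cluster_words(vectors, weights, threshold):
    """Greedy grouping (an assumed strategy): each vector joins the first
    cluster whose first member lies within `threshold`, else starts a new one."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for members in clusters:
            if weighted_distance(vec, vectors[members[0]], weights) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def pick_centroid(members, vectors, weights):
    """Pick the member with the smallest total distance to the rest of the
    cluster (a medoid), used here as the 'most representative' word image."""
    return min(
        members,
        key=lambda i: sum(
            weighted_distance(vectors[i], vectors[j], weights) for j in members
        ),
    )

def assign_text(clusters, vectors, weights, transcribe):
    """Give every word image the text chosen for its cluster's centroid.
    `transcribe` is a hypothetical callback standing in for human review."""
    labels = {}
    for members in clusters:
        text = transcribe(pick_centroid(members, vectors, weights))
        for i in members:
            labels[i] = text
    return labels

# Made-up two-dimensional feature vectors for five word images.
vectors = [(1.0, 1.0), (1.1, 0.9), (5.0, 5.0), (5.2, 4.9), (0.9, 1.0)]
weights = (1.0, 1.0)
clusters = cluster_words(vectors, weights, threshold=1.0)
labels = assign_text(clusters, vectors, weights, lambda idx: f"word-{idx}")
print(clusters)
print(labels)
```

Only one word image per cluster is transcribed in this sketch, and its text is reused for every other member, which is the labor saving the abstract describes. Changing `weights` or `threshold` changes which word images fall into the same cluster.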

