The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Aug. 09, 2022

Filed:

May 11, 2018
Applicant:

Accenture Global Solutions Limited, Dublin, IE;

Inventors:

Fang Hou, Beijing, CN;

Yikai Wu, Beijing, CN;

Xiaopei Cheng, Beijing, CN;

Sifei Ding, Beijing, CN;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/35 (2019.01); G06K 9/62 (2022.01); G06N 20/00 (2019.01); G06F 16/36 (2019.01); G06N 20/10 (2019.01); G06N 20/20 (2019.01); G06V 30/40 (2022.01); G06V 30/148 (2022.01); G06V 30/242 (2022.01); G06N 5/00 (2006.01); G06N 5/04 (2006.01);
U.S. Cl.
CPC ...
G06F 16/353 (2019.01); G06F 16/374 (2019.01); G06K 9/628 (2013.01); G06K 9/6256 (2013.01); G06N 20/00 (2019.01); G06N 20/10 (2019.01); G06N 20/20 (2019.01); G06V 30/153 (2022.01); G06V 30/242 (2022.01); G06V 30/40 (2022.01); G06N 5/003 (2013.01); G06N 5/046 (2013.01);
Abstract

An iterative classifier for unsegmented electronic documents is based on machine learning algorithms. The textual strings in the electronic document are segmented using a composite dictionary that combines a conventional dictionary and an adaptive dictionary developed based on the context and nature of the electronic document. The classifier is built using a corpus of training and testing samples automatically extracted from the electronic document by detecting signatures for a set of pre-established classes for the textual strings. The classifier is further iteratively improved by automatically expanding the corpus of training and testing samples in real-time when textual strings in new electronic documents are processed and classified.
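The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name a segmentation algorithm or a learning model, so greedy forward maximum matching and a simple token-count scorer are stand-ins chosen for brevity. All names (`segment`, `label_by_signature`, `IterativeClassifier`), the toy dictionaries, and the keyword signatures are hypothetical.

```python
from collections import Counter, defaultdict


def segment(text, dictionary):
    """Split an unsegmented string by greedy forward maximum matching.

    Assumption: the abstract does not specify the segmentation method.
    Characters that match no dictionary word are emitted one at a time.
    """
    max_len = max(map(len, dictionary))
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def label_by_signature(text, signatures):
    """Return the class whose signature appears in the string, else None."""
    for label, marker in signatures.items():
        if marker in text:
            return label
    return None


class IterativeClassifier:
    """Toy token-count classifier whose corpus grows as strings are classified."""

    def __init__(self, dictionary):
        self.dictionary = dictionary
        self.token_counts = defaultdict(Counter)  # class label -> token frequencies

    def add_sample(self, text, label):
        self.token_counts[label].update(segment(text, self.dictionary))

    def classify(self, text):
        tokens = segment(text, self.dictionary)
        scores = {
            label: sum(counts[t] for t in tokens)
            for label, counts in self.token_counts.items()
        }
        best = max(scores, key=scores.get, default=None)
        return best if best is not None and scores[best] > 0 else None

    def classify_and_learn(self, text):
        """Classify a new string, then add it to the corpus under that label."""
        label = self.classify(text)
        if label is not None:
            self.add_sample(text, label)
        return label


# Composite dictionary: a conventional word list plus document-specific terms.
conventional = {"fresh", "apple", "juice", "steel", "pipe", "green"}
adaptive = {"rod", "mango"}  # hypothetical domain terms
dictionary = conventional | adaptive

# Hypothetical signatures for two pre-established classes.
signatures = {"food": "apple", "metal": "steel"}

# Build the initial corpus automatically from strings that carry a signature.
document = ["freshapplejuice", "steelpipe", "greenapple", "steelrod"]
clf = IterativeClassifier(dictionary)
for line in document:
    label = label_by_signature(line, signatures)
    if label is not None:
        clf.add_sample(line, label)

# A new string is classified and then absorbed into the corpus.
print(clf.classify_and_learn("freshmango"))
```

In this sketch, "mango" is unknown to the classifier until "freshmango" is classified through its overlap with earlier food samples; once that string has been absorbed, "mango" on its own is recognized too. This mirrors, in a very reduced form, the corpus expansion the abstract describes.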

