The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent: Sep. 17, 2019

Filed: Dec. 06, 2017

Applicant: Druva Technologies Pte. Ltd., Singapore, SG

Inventor: Bhave Adwait, Pune, IN

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/27 (2006.01); G06F 16/33 (2019.01); G06F 16/34 (2019.01);
U.S. Cl.
CPC ...
G06F 16/3344 (2019.01); G06F 16/345 (2019.01); G06F 17/2755 (2013.01); G06F 17/2775 (2013.01); G06F 17/2785 (2013.01);
Abstract

A keyphrase extraction system and method is provided. The keyphrase extraction system includes a memory having computer-readable instructions stored therein. The keyphrase extraction system also includes a processor configured to access a document. The processor is configured to identify a plurality of candidate phrases from the document based upon a part-of-speech tag pattern. Each of the plurality of candidate phrases comprises one or more candidate terms. In addition, the processor is further configured to access an external knowledge base to determine a vocabulary frequency count of the one or more candidate terms. The vocabulary frequency count of the one or more candidate terms corresponds to a count of appearance of the respective candidate term in a plurality of documents accessible by the external knowledge base. Further, the processor is configured to estimate a phrase score for each of the plurality of candidate phrases based upon the vocabulary frequency count of the one or more candidate terms of each of the plurality of candidate phrases. Furthermore, the processor is configured to filter the plurality of candidate phrases based upon the estimated phrase score and pre-determined thresholds to determine one or more key phrases present in the document.
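
The pipeline described in the abstract (candidate phrases from a part-of-speech tag pattern, a vocabulary frequency lookup, a phrase score, and threshold filtering) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the patent: the toy tag lexicon `TOY_POS`, the in-memory dictionary `VOCAB_FREQ` standing in for the external knowledge base, the tag pattern (zero or more adjectives followed by one or more nouns), the inverse-frequency scoring formula, and the single `min_score` threshold (the abstract mentions thresholds in the plural without specifying them).

```python
import math
import re

# Hypothetical toy POS lexicon; a real system would use a trained tagger.
TOY_POS = {
    "the": "DT", "in": "IN", "stores": "VBZ",
    "incremental": "JJ", "cloud": "NN", "backup": "NN",
    "data": "NN", "region": "NN",
}

# Hypothetical stand-in for the external knowledge base:
# term -> number of documents in which the term appears.
VOCAB_FREQ = {
    "data": 900_000, "region": 300_000, "cloud": 50_000,
    "backup": 20_000, "incremental": 5_000,
}
TOTAL_DOCS = 1_000_000  # assumed size of the knowledge base

# One-letter codes so the tag pattern can be written as a regular expression.
TAG_CODE = {"JJ": "J", "NN": "N"}


def tag_tokens(text):
    """Lowercase, tokenize, and tag each token using the toy lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(tok, TOY_POS.get(tok, "X")) for tok in tokens]


def candidate_phrases(tagged):
    """Return token lists whose tags match the assumed pattern JJ* NN+."""
    tag_string = "".join(TAG_CODE.get(pos, "O") for _, pos in tagged)
    return [
        [tok for tok, _ in tagged[m.start():m.end()]]
        for m in re.finditer(r"J*N+", tag_string)
    ]


def phrase_score(terms):
    """Assumed score: mean inverse log frequency, so rarer terms score higher."""
    scores = [math.log(TOTAL_DOCS / (1 + VOCAB_FREQ.get(t, 0))) for t in terms]
    return sum(scores) / len(scores)


def extract_keyphrases(text, min_score):
    """Keep candidate phrases whose score reaches the threshold."""
    candidates = candidate_phrases(tag_tokens(text))
    return [" ".join(terms) for terms in candidates if phrase_score(terms) >= min_score]


if __name__ == "__main__":
    sentence = "The incremental cloud backup stores data in the region."
    print(extract_keyphrases(sentence, min_score=2.0))
```

In this sketch, very common terms such as "data" receive a score near zero and are filtered out, while a phrase built from rarer terms survives. The actual scoring and thresholds claimed in the patent may differ.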

