The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Jan. 19, 2021
Filed:
Jul. 07, 2015
Applicants:
Mark Edward Bowles, San Mateo, CA (US);
Jens Erik Tellefsen, Mountain View, CA (US);
Ranjeet Singh Bhatia, Sunnyvale, CA (US);
Inventors:
Mark Edward Bowles, San Mateo, CA (US);
Jens Erik Tellefsen, Mountain View, CA (US);
Ranjeet Singh Bhatia, Sunnyvale, CA (US);
Assignee:
NetBase Solutions, Inc., Santa Clara, CA (US);
Abstract
To the standard operations of an inverted index database, a new “To” operator is added. The “To” operator treats the standard single-level linear collection of records as being organized into localized clusters. Techniques for hierarchical clusters are presented. During indexing, hierarchical clusters are serialized according to a uniform visitation procedure. Serialization produces bit maps, one for each hierarchical level, that preserve the hierarchical level of each record and its location in the serialization sequence. The “To” operator accepts a list of records, each at the same hierarchical level in a cluster, and a specification of the hierarchical level that all the input records should be converted into. The “To” operator outputs a list of records, representing a conversion of the input records to the specified new level. When searching a Corpus-of-Interest for an Object-of-Interest, techniques are presented for greatly improving the process by which Exclude Terms are identified. Exclude Terms are particularly useful when the lexical units representing an Object-of-Interest are ambiguous. When in the mode of searching for Exclude Terms, the Object-of-Interest is sought, in the Corpus-of-Interest, in a broader context than when the Exclude Terms are utilized as part of an actual query. The Object-of-Interest can match anywhere in a snippet, rather than just in the focus sentence. Using the “To” operator, the focus sentences thus found are converted into role values. Statistical sampling of the role values may be used to reduce the data for the next step of processing. The role values are subjected to frequency and cluster analysis, at the lexical unit level, in order to identify candidate Exclude Terms that a user can select. Frequency and clustering information is presented to the user to aid in the decision process. The search for Exclude Terms can be repeated, using the Exclude Terms located thus far.
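The serialization and level conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the "uniform visitation procedure" is taken to be a depth-first pre-order walk, the three levels are taken to be snippet, sentence, and role value, bit maps are plain Python lists, and the names `serialize`, `to_operator`, and `TREE` are hypothetical.

```python
# Illustrative sketch only; level names and traversal order are assumptions.
SNIPPET, SENTENCE, ROLE = 0, 1, 2  # assumed hierarchy, top to bottom

# Each node is (label, children); a child sits exactly one level below its parent.
TREE = [
    ("snippet A", [
        ("sentence A1", [("role A1a", []), ("role A1b", [])]),
        ("sentence A2", [("role A2a", [])]),
    ]),
    ("snippet B", [
        ("sentence B1", [("role B1a", [])]),
    ]),
]

def serialize(tree, n_levels):
    """Walk the clusters in pre-order and build one bit map per level.

    bitmaps[k][i] is 1 when the i-th record in the sequence is at level k.
    """
    records = []
    bitmaps = [[] for _ in range(n_levels)]

    def visit(node, level):
        label, children = node
        records.append(label)
        for k in range(n_levels):
            bitmaps[k].append(1 if k == level else 0)
        for child in children:
            visit(child, level + 1)

    for root in tree:
        visit(root, 0)
    return records, bitmaps

def to_operator(positions, src_level, dst_level, bitmaps):
    """Convert record positions at src_level into positions at dst_level."""
    n = len(bitmaps[0])
    if dst_level == src_level:
        return sorted(set(positions))
    out = set()
    if dst_level < src_level:
        # Upward: in pre-order, the ancestor is the nearest preceding
        # record whose bit is set in the destination level's bit map.
        for p in positions:
            q = p
            while not bitmaps[dst_level][q]:
                q -= 1
            out.add(q)
    else:
        # Downward: the cluster under p runs until the next record that
        # is at src_level or higher; collect dst_level records inside it.
        for p in positions:
            q = p + 1
            while q < n and not any(bitmaps[k][q] for k in range(src_level + 1)):
                if bitmaps[dst_level][q]:
                    out.add(q)
                q += 1
    return sorted(out)

records, bitmaps = serialize(TREE, 3)

# Sentences A1 and B1 (positions 1 and 7) converted to their role values.
roles = to_operator([1, 7], SENTENCE, ROLE, bitmaps)
print([records[i] for i in roles])
```

Because each conversion only reads the per-level bit maps and the positions in the serialization sequence, the sketch never needs parent or child pointers, which is the property the abstract attributes to the bit maps. A production index would presumably use compressed bit sets rather than lists.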