The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent: Jan. 22, 2013
Filed: Sep. 18, 2009
Applicants:
Bing Bai, Plainsboro, NJ (US);
Jason Weston, New York, NY (US);
Ronan Collorbert, Princeton, NJ (US);
David Grangier, Princeton, NJ (US);
Inventors:
Bing Bai, Plainsboro, NJ (US);
Jason Weston, New York, NY (US);
Ronan Collorbert, Princeton, NJ (US);
David Grangier, Princeton, NJ (US);
Assignee:
NEC Laboratories America, Inc., Princeton, NJ (US);
Abstract
A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.
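The steps in the abstract can be illustrated with a small sketch. This is not the patented implementation: the function names, the document-frequency cutoff used to split the two dictionaries, the Jaccard co-occurrence measure standing in for the "first score", and cosine similarity standing in for the final similarity score are all assumptions chosen for illustration. The threshold is also applied to the correlation score itself, which simplifies the abstract's wording.

```python
from collections import Counter
from math import sqrt

def split_dictionaries(docs, min_doc_count):
    """Split the vocabulary by document frequency (the cutoff is an assumption)."""
    doc_freq = Counter(word for doc in docs for word in set(doc))
    frequent = {w for w, c in doc_freq.items() if c >= min_doc_count}
    return frequent, set(doc_freq) - frequent

def first_score(a, b, docs):
    """Hypothetical 'first score': Jaccard overlap of the documents containing a and b."""
    docs_a = {i for i, doc in enumerate(docs) if a in doc}
    docs_b = {i for i, doc in enumerate(docs) if b in doc}
    union = docs_a | docs_b
    return len(docs_a & docs_b) / len(union) if union else 0.0

def correlate(frequent, infrequent, docs, n, threshold):
    """For each infrequent word, keep up to n frequent words whose score meets the threshold."""
    mapping = {}
    for rare in infrequent:
        ranked = sorted(((first_score(rare, f, docs), f) for f in frequent),
                        reverse=True)[:n]
        mapping[rare] = [(f, s) for s, f in ranked if s >= threshold]
    return mapping

def to_vector(tokens, mapping):
    """Bag-of-words vector in which infrequent words are replaced by correlated frequent words."""
    vec = Counter()
    for word in tokens:
        replacements = mapping.get(word)
        if replacements:
            for frequent_word, score in replacements:
                vec[frequent_word] += score
        else:
            # Frequent words, and rare words with no qualifying match, are kept as-is.
            vec[word] += 1.0
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse vectors (stand-in for the similarity score)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy corpus: each document is a list of tokens.
corpus = [
    ["neural", "network", "perceptron"],
    ["neural", "network", "learning"],
    ["cooking", "recipe", "pasta"],
    ["cooking", "recipe", "sauce"],
]
frequent, infrequent = split_dictionaries(corpus, min_doc_count=2)
mapping = correlate(frequent, infrequent, corpus, n=2, threshold=0.4)
query_vec = to_vector(["perceptron"], mapping)
scores = [cosine(query_vec, to_vector(doc, mapping)) for doc in corpus]
```

In this toy corpus the query word "perceptron" appears in only one document, so a plain bag-of-words comparison with the second document finds no shared terms. After the rare word is replaced by its correlated frequent words, the query and that document overlap on "neural" and "network", while the cooking documents still score zero.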