The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Nov. 24, 2015
Filed:
Feb. 17, 2011
Applicants:
Srikanth Thirumalai, Clyde Hill, WA (US);
Aswath Manoharan, Bellevue, WA (US);
Mark J. Tomko, Seattle, WA (US);
Grant M. Emery, Seattle, WA (US);
Vijai Mohan, Bellevue, WA (US);
Inventors:
Srikanth Thirumalai, Clyde Hill, WA (US);
Aswath Manoharan, Bellevue, WA (US);
Mark J. Tomko, Seattle, WA (US);
Grant M. Emery, Seattle, WA (US);
Vijai Mohan, Bellevue, WA (US);
Assignee:
Amazon Technologies, Inc., Reno, NV (US);
Abstract
According to aspects of the disclosed subject matter, a method is provided for identifying a set of documents from a document corpus that are potential duplicates of a source document. A source document is obtained. A list of queries corresponding to the source document is identified. Each query in the identified list of queries is executed on the document corpus, wherein the execution of each query yields a corresponding results set identifying an ordered set of documents in the document corpus. For each document identified in each results set, a document score is generated for the identified document based on the identified document's ordinal position in its results set. A subset of the identified documents of the results sets is selected according to the generated document scores that satisfy predetermined selection criteria. The selected subset of identified documents is stored or displayed.
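The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract says only that a score depends on a document's ordinal position and that selection follows predetermined criteria, so the specifics here are assumptions made for illustration: the reciprocal-rank score, the summing of scores across results sets, and the minimum-score threshold. The names `score_candidates`, `select_candidates`, `toy_search`, and `TOY_RESULTS` are hypothetical, and the toy search function stands in for a real query engine over the corpus.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Sequence

# Hypothetical stand-in for a search index: each query maps to an
# ordered results set of document IDs, best match first.
TOY_RESULTS: Dict[str, List[str]] = {
    "blue cotton shirt": ["d1", "d2", "d3"],
    "acme shirt large": ["d2", "d1"],
    "sku 12345": ["d1"],
}


def toy_search(query: str) -> Sequence[str]:
    """Return the ordered results set for a query (empty if unknown)."""
    return TOY_RESULTS.get(query, [])


def score_candidates(
    queries: Iterable[str],
    search: Callable[[str], Sequence[str]],
) -> Dict[str, float]:
    """Execute every query and score documents by ordinal position.

    Assumption: a document at 1-based position p in a results set earns
    1/p, and scores from different results sets are summed.
    """
    scores: Dict[str, float] = defaultdict(float)
    for query in queries:
        for position, doc_id in enumerate(search(query), start=1):
            scores[doc_id] += 1.0 / position
    return dict(scores)


def select_candidates(scores: Dict[str, float], min_score: float) -> List[str]:
    """Keep documents whose score meets the threshold, highest score first.

    Assumption: the selection criterion is a simple minimum score.
    """
    selected = [doc_id for doc_id, score in scores.items() if score >= min_score]
    return sorted(selected, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Queries derived from the source document (here, the toy query list).
    candidate_scores = score_candidates(TOY_RESULTS.keys(), toy_search)
    # Display the selected subset of potential duplicates.
    print(select_candidates(candidate_scores, min_score=1.0))
```

With the toy data, `d1` scores 1 + 1/2 + 1 = 2.5, `d2` scores 1/2 + 1 = 1.5, and `d3` scores 1/3, so a threshold of 1.0 selects `d1` and `d2`. Documents that rank highly for several queries drawn from the same source document accumulate the largest scores, which is the intuition behind treating them as potential duplicates.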