The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 24, 2015

Filed:
Feb. 17, 2011
Applicants:

Srikanth Thirumalai, Clyde Hill, WA (US);

Aswath Manoharan, Bellevue, WA (US);

Mark J. Tomko, Seattle, WA (US);

Grant M. Emery, Seattle, WA (US);

Vijai Mohan, Bellevue, WA (US);

Inventors:

Srikanth Thirumalai, Clyde Hill, WA (US);

Aswath Manoharan, Bellevue, WA (US);

Mark J. Tomko, Seattle, WA (US);

Grant M. Emery, Seattle, WA (US);

Vijai Mohan, Bellevue, WA (US);

Assignee:

Amazon Technologies, Inc., Reno, NV (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/30 (2006.01);
U.S. Cl.
CPC ...
G06F 17/30528 (2013.01); G06F 17/30483 (2013.01);
Abstract

According to aspects of the disclosed subject matter, a method is provided for identifying a set of documents from a document corpus that are potential duplicates of a source document. A source document is obtained. A list of queries corresponding to the source document is identified. Each query in the identified list of queries is executed on the document corpus, wherein the execution of each query yields a corresponding results set identifying an ordered set of documents in the document corpus. For each document identified in each results set, a document score is generated for the identified document based on the identified document's ordinal position in its results set. A subset of the identified documents of the results set is selected according to the generated document scores that satisfy predetermined selection criteria. The selected subset of identified documents is stored or displayed.
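
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how a score is derived from ordinal position or what the selection criteria are, so the reciprocal-rank scoring, the summing of scores across results sets, and the `min_score` threshold are all assumptions made for illustration. The names `score_candidates`, `select_candidates`, `fake_search`, and `FAKE_INDEX` are hypothetical, and the toy index stands in for a real document corpus.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def score_candidates(
    queries: Sequence[str],
    run_query: Callable[[str], List[str]],
) -> Dict[str, float]:
    """Accumulate a position-based score for each document over all results sets."""
    scores: Dict[str, float] = defaultdict(float)
    for query in queries:
        results = run_query(query)  # ordered document ids, best match first
        for position, doc_id in enumerate(results, start=1):
            # Assumed scoring rule: reciprocal of the ordinal position,
            # summed across every results set the document appears in.
            scores[doc_id] += 1.0 / position
    return dict(scores)


def select_candidates(scores: Dict[str, float], min_score: float) -> List[str]:
    """Keep documents meeting the assumed threshold, highest score first."""
    kept = [doc_id for doc_id, score in scores.items() if score >= min_score]
    return sorted(kept, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical stand-in for a searchable corpus: query -> ordered results set.
FAKE_INDEX = {
    "title:blue widget": ["d1", "d2", "d3"],
    "brand:acme": ["d2", "d1"],
    "sku:123": ["d2"],
}


def fake_search(query: str) -> List[str]:
    """Return the ordered results set for a query from the toy index."""
    return FAKE_INDEX.get(query, [])


scores = score_candidates(list(FAKE_INDEX), fake_search)
candidates = select_candidates(scores, min_score=1.0)
print(candidates)  # potential duplicates of the source document
```

In this toy run, `d2` ranks high in all three results sets and `d1` in two, so both clear the threshold, while `d3` appears once at position three and is dropped. A real system would derive the query list from the source document's own fields rather than from a fixed dictionary.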

