The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jul. 01, 2025
Filed:
Sep. 11, 2024
Applicant:
Concentric Software, Inc, San Mateo, CA (US);
Inventors:
Madhusudana Shashanka, Austin, TX (US);
Bonnie Arogyam Varghese, Milpitas, CA (US);
Leomart Crisostomo, Sunnyvale, CA (US);
Shankar Subramaniam, Cupertino, CA (US);
Sumeet Khirwal, Jamshedpur, IN;
Assignee:
CONCENTRIC SOFTWARE, INC, San Mateo, CA (US);
Abstract
Methods and systems for clustering documents according to semantic similarity are disclosed. The method includes generating an embedding for each of a plurality of documents to form a plurality of embeddings. Each embedding is indicative of a semantic representation for the corresponding document. The method includes segregating the plurality of embeddings into a plurality of shards. The method includes clustering one or more embeddings within each shard of the plurality of shards into one or more first clusters. The one or more first clusters for each shard collectively constitute a plurality of first clusters. The method includes generating a plurality of second clusters across the plurality of shards, based, at least in part, on semantic similarity between the plurality of embeddings and the plurality of first clusters.
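The shard-then-merge flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not name a clustering algorithm, a sharding rule, or a similarity measure, so the sketch assumes precomputed embeddings, round-robin sharding, greedy "leader" clustering on cosine similarity, and a second stage that merges first-cluster centroids across shards. All function names and the `threshold` parameter are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Unweighted component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def leader_cluster(items: List[int], vec_of: Callable[[int], Vector],
                   threshold: float) -> List[List[int]]:
    """Assumed stand-in algorithm: each item joins the first cluster whose
    centroid is at least `threshold` similar, otherwise it starts a new one."""
    clusters: List[List[int]] = []
    for item in items:
        for members in clusters:
            center = centroid([vec_of(m) for m in members])
            if cosine(vec_of(item), center) >= threshold:
                members.append(item)
                break
        else:  # no existing cluster was similar enough
            clusters.append([item])
    return clusters


def two_stage_cluster(embeddings: List[Vector], num_shards: int,
                      threshold: float = 0.9) -> List[List[int]]:
    """Return second-stage clusters as sorted lists of document indices."""
    # Segregate embeddings into shards (round-robin is an assumption).
    shards = [list(range(s, len(embeddings), num_shards))
              for s in range(num_shards)]

    # Cluster within each shard; together these are the "first clusters".
    first_clusters: List[List[int]] = []
    for shard in shards:
        first_clusters.extend(
            leader_cluster(shard, lambda i: embeddings[i], threshold))

    # Summarize each first cluster by its centroid, then cluster those
    # centroids across all shards to form the "second clusters".
    centers = [centroid([embeddings[i] for i in c]) for c in first_clusters]
    groups = leader_cluster(list(range(len(first_clusters))),
                            lambda k: centers[k], threshold)
    return [sorted(i for k in group for i in first_clusters[k])
            for group in groups]


if __name__ == "__main__":
    # Toy 2-D "embeddings": documents 0 and 1 are near-duplicates, as are 2 and 3.
    docs = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
    print(two_stage_cluster(docs, num_shards=2))  # [[0, 1], [2, 3]]
```

In the toy run, documents 0 and 1 land in different shards, so neither shard can group them on its own; the cross-shard second stage is what brings them together. Clustering within shards first keeps each pass small, at the cost of relying on centroids as summaries of the first clusters.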