The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jun. 06, 2023

Filed:
May 20, 2021
Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Richard Obinna Osuala, Munich, DE;

Christopher M. Lohse, Stuttgart, DE;

Ben J. Schaper, Stuttgart, DE;

Marcell Streile, Knetzgau, DE;

Charles E. Beller, Baltimore, MD (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/279 (2020.01); G06N 3/08 (2023.01);
U.S. Cl.
CPC ...
G06F 40/279 (2020.01); G06N 3/08 (2013.01);
Abstract

A computer assigns a similarity value to a comparison document. The computer receives reference document contextual word embeddings in a first set of topic clusters, each with a representative embedding. The computer receives comparison document contextual word embeddings. The computer determines, using a trained neural network model classifier, for each comparison document contextual word embedding, topic correspondence values relative to the representative embeddings of said first set of clusters. The computer generates a second set of clusters by assigning comparison document embeddings to the best matching one of the first clusters, according to the topic correspondence values. The computer determines a second set of representative embeddings and uses a comparison method to determine a cluster similarity value for the second set clusters compared to the first set representative embeddings. The computer determines document similarity values based, at least in part, on at least one of the cluster similarity values.
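The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract specifies a trained neural network classifier for the topic correspondence values and leaves the comparison method open, so this sketch substitutes plain cosine similarity for both and averages the cluster similarities to get a document score. All names (`cosine`, `mean_vector`, `document_similarity`) and the toy two-dimensional embeddings are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean, used here as the representative embedding of a cluster."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def document_similarity(reference_reps, comparison_embeddings):
    """Return (document_similarity, per_cluster_similarity).

    reference_reps: dict mapping a topic label to the representative
        embedding of that reference-document cluster (the "first set").
    comparison_embeddings: list of word embeddings from the comparison document.
    """
    # Assign each comparison embedding to its best-matching reference cluster.
    # ASSUMPTION: cosine similarity stands in for the trained classifier's
    # topic correspondence values.
    second_clusters = defaultdict(list)
    for emb in comparison_embeddings:
        best_topic = max(reference_reps, key=lambda t: cosine(emb, reference_reps[t]))
        second_clusters[best_topic].append(emb)

    # Compute a representative embedding for each second-set cluster and
    # compare it with the corresponding first-set representative.
    # ASSUMPTION: cosine similarity is the "comparison method".
    cluster_sims = {
        topic: cosine(mean_vector(members), reference_reps[topic])
        for topic, members in second_clusters.items()
    }

    # ASSUMPTION: the document score is the mean of the cluster similarities;
    # reference topics that received no embeddings are ignored.
    if not cluster_sims:
        return 0.0, {}
    return sum(cluster_sims.values()) / len(cluster_sims), cluster_sims


if __name__ == "__main__":
    reference = {"topic_a": [1.0, 0.0], "topic_b": [0.0, 1.0]}
    comparison = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
    score, per_cluster = document_similarity(reference, comparison)
    print(score, per_cluster)
```

In practice the embeddings would come from a contextual language model and the assignment step from a trained classifier. Weighting clusters by size, or penalizing reference topics with no matches, are other ways to aggregate the score that the abstract does not rule out.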

