The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 14, 2023

Filed:
Jun. 22, 2021

Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:
Itzik Malkiel, Givaatayim, IL;
Dvir Ginzburg, Tel Aviv, IL;
Noam Koenigstein, Tel Aviv, IL;
Oren Barkan, Tel Aviv, IL;
Nir Nice, Salit, IL;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06V 30/418 (2022.01); G06K 9/62 (2022.01); G06V 10/75 (2022.01);
U.S. Cl.
CPC ...
G06V 30/418 (2022.01); G06K 9/623 (2013.01); G06K 9/6263 (2013.01); G06V 10/751 (2022.01);
Abstract

Examples provide a self-supervised language model for document-to-document similarity scoring and for ranking long documents of arbitrary length in the absence of similarity labels. In the first stage of a two-stage hierarchical scoring, a sentence similarity matrix is created for each paragraph in the candidate document. A sentence similarity score is calculated based on the sentence similarity matrix. In the second stage, a paragraph similarity matrix is constructed based on the aggregated sentence similarity scores associated with the candidate document. A total similarity score for the document is calculated based on the normalized paragraph similarity matrix for each candidate document in a collection of documents. The model is trained using a masked language model and intra- and inter-document sampling. The documents are ranked based on their similarity scores.
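
The two-stage scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how sentences are embedded, how sentence scores are aggregated, or how the paragraph matrix is normalized. The sketch assumes sentence embeddings are already available as plain lists of floats, uses cosine similarity, aggregates with the mean of row-wise maxima, and normalizes with a z-score over all candidates. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev

# A document is a list of paragraphs; a paragraph is a non-empty list of
# sentence embeddings; an embedding is a list of floats.
# ASSUMPTION: embeddings are supplied by some language model (not shown here).

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def sentence_similarity_matrix(src_par, cand_par):
    """Stage 1: pairwise sentence similarities between two paragraphs."""
    return [[cosine(s, t) for t in cand_par] for s in src_par]

def aggregate(matrix):
    """ASSUMPTION: collapse a matrix to one score via the mean of row-wise maxima."""
    return mean(max(row) for row in matrix)

def paragraph_similarity_matrix(src_doc, cand_doc):
    """Stage 2: one aggregated sentence score per (source, candidate) paragraph pair."""
    return [
        [aggregate(sentence_similarity_matrix(p, q)) for q in cand_doc]
        for p in src_doc
    ]

def rank_candidates(src_doc, candidates):
    """Score each candidate against src_doc and return (name, score), best first.

    ASSUMPTION: normalization is a z-score over the entries of all candidates'
    paragraph matrices; the abstract only says the matrix is normalized.
    """
    matrices = {
        name: paragraph_similarity_matrix(src_doc, doc)
        for name, doc in candidates.items()
    }
    entries = [x for m in matrices.values() for row in m for x in row]
    mu = mean(entries)
    sigma = pstdev(entries) or 1.0  # avoid division by zero when all entries match
    scores = {
        name: aggregate([[(x - mu) / sigma for x in row] for row in m])
        for name, m in matrices.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_candidates(source, {"doc_a": doc_a, "doc_b": doc_b})` returns the candidate names with their total scores, highest first. Normalizing over the whole collection rather than per document keeps scores comparable across candidates, which is one plausible reading of "for each candidate document in a collection of documents".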

