The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jul. 01, 2025

Filed:
Dec. 30, 2022

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Zhong Fang Yuan, Xi'an, CN;

Tong Liu, Xi'an, CN;

Si Tong Zhao, Beijing, CN;

Xiang Yu Yang, Xi'an, CN;

Ziqiumin Wang, Shanghai, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/00 (2019.01); G06F 18/214 (2023.01); G06F 18/2413 (2023.01); G06F 40/30 (2020.01); G06V 30/19 (2022.01);
U.S. Cl.
CPC ...
G06F 40/30 (2020.01); G06F 18/214 (2023.01); G06F 18/2413 (2023.01); G06V 30/19093 (2022.01);
Abstract

Information extraction and image restructuring includes generating semantic vectors to encode portions of text extracted from a document. For each semantic vector, a semantic similarity between a schema key and other text encoded therein is determined based on their respective positions within the document. An enhanced NLP model is created using the semantic vectors, each labeled according to the semantic similarity. The text, including the schema key, is re-encoded as a key vector and candidate vectors. Key-value pairs are generated by matching the key vector with a predetermined number of candidate vectors. The enhanced NLP model, using prompt learning, is repurposed to perform a next-sentence prediction that predicts which of the candidate vectors is logically related to the schema key. Based on the next-sentence prediction, the discrete portion of text identified as the schema key and the portion of text determined to be logically related thereto are output.
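
The matching step in the abstract (pairing a key vector with a predetermined number of candidate vectors) can be illustrated with a minimal sketch. This is not the patented implementation: the use of cosine similarity, the function names `cosine` and `top_k_candidates`, and the hand-made toy vectors and texts are all assumptions made for illustration. The next-sentence prediction that selects the final value is not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_candidates(key_vec, candidates, k):
    """Return the k (text, score) pairs most similar to key_vec.

    `candidates` maps candidate text to its vector (hypothetical toy data).
    """
    scored = [(text, cosine(key_vec, vec)) for text, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-made vectors; a real system would obtain them from an encoder.
key_vector = [1.0, 0.0, 0.5]  # stands in for a schema key such as "Invoice No."
candidate_vectors = {
    "INV-0042": [0.9, 0.1, 0.4],
    "2022-12-30": [0.1, 0.9, 0.0],
    "Armonk, NY": [0.0, 0.2, 0.9],
}

# Keep a predetermined number (k=2) of candidates for the schema key.
pairs = top_k_candidates(key_vector, candidate_vectors, k=2)
```

In the abstract, the shortlisted candidates would then be passed to the enhanced NLP model, which decides which candidate is logically related to the schema key; the sketch stops at the shortlist.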

