The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Sep. 27, 2022

Filed:
Mar. 13, 2020

Applicant:
International Business Machines Corporation, Armonk, NY (US);

Inventors:
Zhong Fang Yuan, Xi'an, CN;
Guang Qing Zhong, Beijing, CN;
Tong Liu, Xi'an, CN;
De Shuo Kong, Beijing, CN;
Yi Ming Wang, Xi'an, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06V 30/414 (2022.01); G06K 9/62 (2022.01); G06N 20/00 (2019.01);
U.S. Cl.
CPC ...
G06V 30/414 (2022.01); G06K 9/6218 (2013.01); G06N 20/00 (2019.01);
Abstract

An approach for extracting non-textual data from an electronic document is disclosed. The approach includes receiving a request to extract a file and converting the file into pixels. The approach creates a pixel map of the converted file and determines one or more density clusters of the pixel map based on an image clustering method. Furthermore, the approach determines one or more coordinates of the one or more density clusters and determines one or more candidate information regions based on the one or more coordinates and the density of the one or more density clusters. Finally, the approach extracts one or more textual data based on the one or more candidate information regions and outputs the extracted one or more textual data.
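
The clustering and region-selection steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not name the image clustering method, so this example assumes a simple neighborhood-based grouping of "on" pixels as a stand-in, and it assumes a candidate region is a cluster's bounding box filtered by pixel count and fill density. The function names, the `radius`, `min_density`, and `min_pixels` parameters, and the toy pixel map are all hypothetical. File conversion and the final data extraction step are left out.

```python
from collections import deque


def density_clusters(pixel_map, radius=1):
    """Group 'on' pixels (value 1) that lie within `radius` of each other.

    Assumed stand-in for the clustering step: two pixels are linked when
    their Chebyshev distance is <= radius, and each connected group of
    linked pixels becomes one cluster.
    """
    rows, cols = len(pixel_map), len(pixel_map[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if pixel_map[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, members = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                members.append((y, x))
                # Visit every pixel in the (2*radius+1)-wide window around (y, x).
                for ny in range(max(0, y - radius), min(rows, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(cols, x + radius + 1)):
                        if pixel_map[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            clusters.append(members)
    return clusters


def candidate_regions(pixel_map, radius=1, min_density=0.5, min_pixels=4):
    """Return bounding boxes (top, left, bottom, right) of sufficiently dense clusters."""
    regions = []
    for members in density_clusters(pixel_map, radius):
        ys = [y for y, _ in members]
        xs = [x for _, x in members]
        top, bottom, left, right = min(ys), max(ys), min(xs), max(xs)
        area = (bottom - top + 1) * (right - left + 1)
        density = len(members) / area  # fraction of the box covered by 'on' pixels
        if len(members) >= min_pixels and density >= min_density:
            regions.append({"box": (top, left, bottom, right), "density": density})
    return regions


# Toy 6x8 pixel map (hypothetical): a solid 3x3 block and one stray pixel.
PIXEL_MAP = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

regions = candidate_regions(PIXEL_MAP)
```

On the toy map, the solid block is kept as a candidate region while the stray pixel is discarded for falling below `min_pixels`. A real pipeline would likely swap in an established density-based clustering algorithm and tune the thresholds against actual rendered pages.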

