The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

The badge covers the following fields: patent number, date the patent was issued, date the patent was filed, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:

Apr. 26, 2022

Filed:

Jun. 29, 2021

Applicant:

Alipay (Hangzhou) Information Technology Co., Ltd., Zhejiang, CN;

Inventors:

Jiawei Liu, Zhejiang, CN;

Desheng Wang, Zhejiang, CN;

Peng Zhang, Zhejiang, CN;

Qian Zhang, Zhejiang, CN;

Xi Jia, Zhejiang, CN;

Yang Liu, Zhejiang, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2006.01); G06F 21/62 (2013.01); G06F 17/16 (2006.01); G06N 7/00 (2006.01); G06K 9/62 (2022.01); G06F 7/76 (2006.01); G06F 7/02 (2006.01);
U.S. Cl.
CPC ...
G06F 21/6254 (2013.01); G06F 7/023 (2013.01); G06F 7/76 (2013.01); G06F 17/16 (2013.01); G06K 9/6215 (2013.01); G06N 7/005 (2013.01);
Abstract

Implementations of the present specification disclose a data identification method, apparatus, device, and a computer-readable medium. A solution includes: obtaining a first data set, data samples in the first data set being at least a part of data of a to-be-identified field; obtaining a state transition matrix set generated based on statistics of data samples in a second data set, a data type of the data samples in the second data set being known; determining sample state transition probabilities corresponding to the data samples in the first data set based on the state transition matrix set; determining a ratio between a number of data samples in the first data set whose sample state transition probabilities are greater than a first threshold and a total number of the data samples in the first data set; and determining data corresponding to the to-be-identified field as being of a same data type as the data samples in the second data set in response to that the ratio is greater than a second threshold.
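The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the states are coarse character classes (digit, letter, other) with hypothetical start and end markers, a single first-order transition matrix stands in for the "state transition matrix set", and a sample's probability is taken as the geometric mean of its transition probabilities. All function names are hypothetical.

```python
import math
from collections import defaultdict


def char_state(ch):
    """Map a character to a coarse state (an assumed state definition)."""
    if ch.isdigit():
        return "D"
    if ch.isalpha():
        return "A"
    return "O"


def states_of(sample):
    # "^" and "$" are hypothetical start and end markers.
    return ["^"] + [char_state(c) for c in sample] + ["$"]


def build_transition_matrix(known_samples):
    """Estimate transition probabilities from samples of a known data type."""
    counts = defaultdict(lambda: defaultdict(int))
    for sample in known_samples:
        seq = states_of(sample)
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = {}
    for a, row in counts.items():
        total = sum(row.values())
        matrix[a] = {b: n / total for b, n in row.items()}
    return matrix


def sample_probability(sample, matrix):
    """Geometric mean of the transition probabilities along one sample."""
    seq = states_of(sample)
    log_p = 0.0
    for a, b in zip(seq, seq[1:]):
        p = matrix.get(a, {}).get(b, 0.0)
        if p == 0.0:
            return 0.0  # transition never seen in the known data
        log_p += math.log(p)
    return math.exp(log_p / (len(seq) - 1))


def field_matches(field_samples, matrix, first_threshold, second_threshold):
    """True if enough samples of the field score above the first threshold."""
    if not field_samples:
        return False
    hits = sum(
        1 for s in field_samples
        if sample_probability(s, matrix) > first_threshold
    )
    return hits / len(field_samples) > second_threshold
```

For example, a matrix built from a few all-digit strings of known type can be applied to samples drawn from an unlabeled column, and `field_matches` reports whether the share of high-scoring samples exceeds the second threshold. Both threshold values are left to the caller, since the abstract does not specify them.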

