The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Feb. 07, 2023

Filed:

Oct. 31, 2019

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Ilyas Mohamed Iyoob, Pflugerville, TX (US);

Krishna Teja Rekapalli, Austin, TX (US);

Aly Megahed, San Jose, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/08 (2006.01); G06N 3/04 (2006.01); G06F 21/62 (2013.01); G06K 9/62 (2022.01); G06F 40/205 (2020.01);
U.S. Cl.
CPC ...
G06N 3/08 (2013.01); G06F 21/6245 (2013.01); G06F 40/205 (2020.01); G06K 9/6267 (2013.01); G06N 3/0445 (2013.01);
Abstract

Computer systems, methods and program products for automating pseudonymization of personal identifying information (PII) using machine learning, metadata, and crowdsourcing patterns to identify and replace PII. Machine learning models are trained, using metadata, to classify known column names or key names for processing. Column or key names are classified to be left unprocessed, anonymized, or pseudonymized by a pseudonymizer without revealing PII or scrubbing data into a useless format. A library of crowdsourced patterns is utilized for matching PII to data values within column or key names, and PII is mapped to replacement methods. Feedback from user annotations retrains the algorithms to improve classification accuracy, and Deep Learning algorithms automate the identification of PII using regular expression generation to concisely articulate how pseudonymizers search for PII patterns within a data set. PII replacement is mapped consistently across entire data packages, and the crowdsourced pattern library is updated with generated regular expressions.
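
The pattern-matching and consistent-replacement steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the pattern library here is a small hard-coded dictionary rather than a crowdsourced one, the per-column actions are supplied by hand rather than by a trained classifier, and the names `PATTERN_LIBRARY`, `pseudonym`, `pseudonymize_text`, `pseudonymize_records`, and the salted SHA-256 token scheme are all assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical stand-in for the crowdsourced pattern library: a PII kind
# mapped to a regular expression that finds values of that kind.
PATTERN_LIBRARY = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def pseudonym(kind, value, salt="demo-salt"):
    """Derive a stable token so the same PII value always gets the same replacement."""
    digest = hashlib.sha256(f"{salt}:{kind}:{value}".encode("utf-8")).hexdigest()
    return f"<{kind}:{digest[:10]}>"


def pseudonymize_text(text, salt="demo-salt"):
    """Replace every match of every library pattern with its pseudonym."""
    for kind, pattern in PATTERN_LIBRARY.items():
        text = pattern.sub(lambda m, k=kind: pseudonym(k, m.group(0), salt), text)
    return text


def pseudonymize_records(records, actions, salt="demo-salt"):
    """Apply a per-column action: 'keep', 'anonymize', or 'pseudonymize'.

    `actions` is written by hand here; in the abstract this decision comes
    from a classifier over column or key names. Columns without an entry
    are scanned for PII patterns (an assumption of this sketch).
    """
    cleaned_records = []
    for record in records:
        cleaned = {}
        for column, value in record.items():
            action = actions.get(column, "pseudonymize")
            if action == "keep":
                cleaned[column] = value
            elif action == "anonymize":
                cleaned[column] = "REDACTED"
            else:
                cleaned[column] = pseudonymize_text(str(value), salt)
        cleaned_records.append(cleaned)
    return cleaned_records


if __name__ == "__main__":
    rows = [
        {"id": 1, "note": "Contact alice@example.com", "ssn": "123-45-6789"},
        {"id": 2, "note": "alice@example.com called again", "ssn": "123-45-6789"},
    ]
    column_actions = {"id": "keep", "ssn": "anonymize", "note": "pseudonymize"}
    for row in pseudonymize_records(rows, column_actions):
        print(row)
```

Because the token is derived from the value itself, the same email address receives the same replacement in both rows, which is one simple way to keep a mapping consistent across a data package. A production system would more likely keep a protected lookup table or use a keyed scheme, since an unkeyed hash of low-entropy values can be reversed by guessing.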

