The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jan. 28, 2025

Filed:
Jun. 01, 2022

Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Sekhar Poornananda Chintalapati, Redmond, WA (US);

Vinod Kumar Yelahanka Srinivas, Bellevue, WA (US);

Dattatraya Baban Rajpure, Sammamish, WA (US);

Pieter Kristian Brouwer, Redmond, WA (US);

Gaurav Anil Yeole, Vancouver, CA;

Mihai Silviu Peicu, Redmond, WA (US);

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 21/62 (2013.01); G06F 16/174 (2019.01); G06F 40/284 (2020.01); G06F 40/295 (2020.01);
U.S. Cl.
CPC ...
G06F 21/6245 (2013.01); G06F 16/174 (2019.01); G06F 40/284 (2020.01); G06F 40/295 (2020.01);
Abstract

Methods and systems for detecting personally identifiable information in data associated with a cloud computing system are described. An example method includes ingesting the data associated with the cloud computing system to generate source data. The method includes processing the source data by: performing cell-based de-duplication to generate cell-based de-duplicated data, subjecting the cell-based de-duplicated data to regular expression classification to generate a first subset of initial results, tokenizing the cell-based de-duplicated data to generate tokenized data, and de-duplicating the tokenized data and subjecting de-duplicated tokenized data to a first named entity recognition classification to generate a second subset of the initial results. The method includes cross-referencing the cell-based de-duplicated data and the initial results and subjecting output of the cross-referencing to a second named entity recognition classification to generate final results. The method includes processing the final results to detect any personally identifiable information in the final results.
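The stages named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the regex patterns, the `KNOWN_NAMES` gazetteer standing in for a trained named entity recognition model, the substring-based cross-referencing, and every function name are assumptions made for illustration only.

```python
import re

# Hypothetical regex classifiers; the abstract does not list its patterns.
REGEX_CLASSIFIERS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Assumption: a tiny gazetteer stands in for a trained NER model.
KNOWN_NAMES = {"alice", "bob"}


def dedupe(items):
    """Drop repeated values while preserving first-seen order."""
    return list(dict.fromkeys(items))


def regex_classify(cells):
    """Return (cell, label) pairs for cells matching any regex classifier."""
    return [
        (cell, label)
        for cell in cells
        for label, pattern in REGEX_CLASSIFIERS.items()
        if pattern.search(cell)
    ]


def tokenize(cells):
    """Split cells on whitespace and strip surrounding punctuation."""
    tokens = []
    for cell in cells:
        for raw in cell.split():
            token = raw.strip(".,;:()")
            if token:
                tokens.append(token)
    return tokens


def toy_ner(tokens):
    """Toy first-pass NER: flag tokens found in the gazetteer."""
    return [(tok, "person_name") for tok in tokens if tok.lower() in KNOWN_NAMES]


def detect_pii(rows):
    """Run the sketched pipeline over rows of string cells."""
    # Ingest: flatten rows into a list of cells (the "source data").
    cells = [cell for row in rows for cell in row]
    # Cell-based de-duplication.
    unique_cells = dedupe(cells)
    # First subset of initial results: regular expression classification.
    regex_hits = regex_classify(unique_cells)
    # Tokenize, de-duplicate tokens, then first NER classification.
    ner_hits = toy_ner(dedupe(tokenize(unique_cells)))
    initial = regex_hits + ner_hits
    # Cross-reference: keep cells that contain any flagged value.
    flagged = {value for value, _ in initial}
    candidates = [c for c in unique_cells if any(v in c for v in flagged)]
    # Stand-in for the second NER classification: collect labels per cell.
    return {
        cell: sorted({label for value, label in initial if value in cell})
        for cell in candidates
    }


rows = [
    ["alice@example.com", "hello"],
    ["Call Bob at 555-123-4567", "hello"],
    ["alice@example.com", "nothing here"],
]
results = detect_pii(rows)
for cell, labels in results.items():
    print(cell, labels)
```

In this sketch, de-duplicating cells and tokens before classification means each distinct value is classified once, which is presumably the motivation for those steps when the same values recur across many rows.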

