The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent: Apr. 24, 2012
Filed: Jun. 27, 2007
Applicants:
Jacques Joseph Labrie, Sunnyvale, CA (US);
David Thomas Meeks, Ashland, MA (US);
Mary Ann Roth, San Jose, CA (US);
Yannick Saillet, Stuttgart, DE;
Inventors:
Jacques Joseph Labrie, Sunnyvale, CA (US);
David Thomas Meeks, Ashland, MA (US);
Mary Ann Roth, San Jose, CA (US);
Yannick Saillet, Stuttgart, DE;
Assignee:
International Business Machines Corporation, Armonk, NY (US);
Abstract
Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column and storing the format mask in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.
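The flow described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract does not define the mask alphabet or name the data mining algorithm, so the mask symbols (9 for a digit, A and a for letters), the function names format_mask and derive_format_rules, the min_confidence threshold, and the sample records are all assumptions made for illustration. A simple "most frequent mask per predictor value" count stands in for the mining step.

```python
from collections import Counter, defaultdict


def format_mask(value):
    """Map each character to a class symbol (assumed alphabet, not from the patent)."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # punctuation and spaces are kept literally
    return "".join(out)


def derive_format_rules(records, format_col, predictor_col, min_confidence=0.6):
    """Return {predictor value: (mask, confidence)} for the dominant mask of each value.

    A frequency count is used here as a stand-in for a real data mining algorithm.
    """
    mask_col = format_col + "_mask"
    masks_by_predictor = defaultdict(Counter)
    for rec in records:
        # Store the generated mask in a format mask column of the same record.
        rec[mask_col] = format_mask(rec[format_col])
        masks_by_predictor[rec[predictor_col]][rec[mask_col]] += 1

    rules = {}
    for predictor_value, counts in masks_by_predictor.items():
        mask, hits = counts.most_common(1)[0]
        confidence = hits / sum(counts.values())
        if confidence >= min_confidence:
            rules[predictor_value] = (mask, confidence)
    return rules


# Hypothetical data set: "country" is the predictor column,
# "postal_code" is the format column.
records = [
    {"country": "US", "postal_code": "95014"},
    {"country": "US", "postal_code": "01721"},
    {"country": "US", "postal_code": "9501"},
    {"country": "CA", "postal_code": "K1A 0B1"},
    {"country": "CA", "postal_code": "M5V 2T6"},
]
rules = derive_format_rules(records, "postal_code", "country")
for country, (mask, confidence) in rules.items():
    print(f"IF country = {country} THEN postal_code matches {mask} ({confidence:.0%})")
```

Each resulting rule pairs a condition on the predictor column with a format mask, so a record such as the US entry with postal code "9501" can later be flagged because its mask differs from the one the rule expects.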