The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent: Jun. 17, 2008
Filed: Jul. 20, 2004
Applicants:
Hinrich H. Schuetze, San Francisco, CA (US);
Chia-hao Yu, Davis, CA (US);
Omer Emre Velipasaoglu, San Francisco, CA (US);
Stan Stukov, Hillsborough, CA (US);
Inventors:
Hinrich H. Schuetze, San Francisco, CA (US);
Chia-Hao Yu, Davis, CA (US);
Omer Emre Velipasaoglu, San Francisco, CA (US);
Stan Stukov, Hillsborough, CA (US);
Assignee:
ENKATA Technologies, Inc., San Mateo, CA (US);
Abstract
A method for processing semi-structured data. The method includes receiving semi-structured data into a first format from a real business process. Preferably, the semi-structured data are machine generated. The method includes tokenizing the semi-structured data into a second format and storing the semi-structured data in the second format into one or more memories and clustering the tokenized data to form a plurality of clusters. The method also includes identifying a selected low frequency term in each of the clusters, and processing at least two of the clusters and the associated selected low frequency terms to form a single template for the at least two of the clusters. In a preferred embodiment, the method replaces the selected low frequency term with a wild card character.
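As a rough illustration of the steps the abstract names (tokenize, cluster, pick a low frequency term per cluster, merge clusters into one template with a wild card), here is a minimal Python sketch. It is not the patented method: the abstract does not specify the clustering or the term selection, so the sketch assumes whitespace tokenization, treats each distinct token sequence as its own cluster, and picks the term with the lowest corpus-wide count. The names `tokenize`, `build_templates`, and `WILDCARD` are hypothetical.

```python
from collections import Counter, defaultdict

WILDCARD = "*"  # assumed wild card character; the abstract does not name one


def tokenize(line):
    """Assumed 'second format': a tuple of whitespace-separated tokens."""
    return tuple(line.split())


def build_templates(lines):
    """Return a dict mapping each template string to the number of lines it covers."""
    tokenized = [tokenize(line) for line in lines]

    # Stand-in clustering: every distinct token sequence is one cluster.
    clusters = Counter(tokenized)

    # Corpus-wide term counts, used to find the low frequency term.
    term_freq = Counter(tok for toks in tokenized for tok in toks)

    templates = defaultdict(int)
    for toks, count in clusters.items():
        if not toks:
            continue  # skip blank lines
        # Position of the rarest term in this cluster (first one on ties).
        idx = min(range(len(toks)), key=lambda i: term_freq[toks[i]])
        masked = toks[:idx] + (WILDCARD,) + toks[idx + 1:]
        # Clusters that agree after masking collapse into a single template.
        templates[masked] += count

    return {" ".join(toks): n for toks, n in templates.items()}


if __name__ == "__main__":
    logs = [
        "user alice logged in",
        "user bob logged in",
        "user carol logged in",
        "disk sda1 full",
        "disk sdb1 full",
    ]
    for template, n in build_templates(logs).items():
        print(n, template)
```

With these sample lines, the three login messages collapse into `user * logged in` and the two disk messages into `disk * full`. One limitation of this simplification: it always masks exactly one token per line, so a line whose terms are all common still loses one of them to the wild card.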