The patent badge is an abbreviated version of the USPTO patent document. It contains a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document is provided in Adobe Acrobat (PDF) format.

Patent No.:

US 10956672 B1

Date of Patent:

Mar. 23, 2021

Filed:

Dec. 19, 2018

High volume message classification and distribution

Applicants:

Ron Ben-Natan, Lexington, MA (US);

Derek Difilippo, Lexington, MA (US);

Uri Hershenhorn, Lexington, MA (US);

Roman Krashanitsa, Lexington, MA (US);

Luigi Labigalini, Lexington, MA (US);

Ury Segal, Vancouver, CA;

Inventors:

Ron Ben-Natan, Lexington, MA (US);

Derek Difilippo, Lexington, MA (US);

Uri Hershenhorn, Lexington, MA (US);

Roman Krashanitsa, Lexington, MA (US);

Luigi Labigalini, Lexington, MA (US);

Ury Segal, Vancouver, CA;

Assignee:

Imperva, Inc., San Mateo, CA (US);

Attorney:

Armis IP Law, LLC

Primary Examiner:

Azam M Cheema

Int. Cl.

CPC ...

G06F 7/00 (2006.01); G06F 40/284 (2020.01); G06F 16/28 (2019.01); G06N 20/20 (2019.01); G06F 40/216 (2020.01); G06F 40/242 (2020.01);

U.S. Cl.

CPC ...

G06F 40/284 (2020.01); G06F 16/285 (2019.01); G06F 40/216 (2020.01); G06F 40/242 (2020.01); G06N 20/20 (2019.01);

Abstract

A log message classifier employs machine learning for identifying a corresponding parser for interpreting the incoming log message and for retraining a classification logic model processing the incoming log messages. Voluminous log messages generate a large amount of data, typically in a text form. Data fields are parseable from the message by a parser that knows a format of the message. The classification logic is trained by a set of messages having a known format for defining groups of messages recognizable by a corresponding parser. The classification logic is defined by a random forest that outputs a corresponding group and confidence value for each incoming message. Groups may be split to define new groups based on a recurring matching tail (latter portion) of the incoming messages. A trend of decreased confidence scores triggers a periodic retraining of the random forest, and may also generate an alert to operators.
