The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 16, 2021

Filed:
Aug. 22, 2016

Applicant:
Intel Corporation, Santa Clara, CA (US);

Inventor:
Annapurna Dasari, Fremont, CA (US);

Assignee:
Intel Corporation, Santa Clara, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 11/00 (2006.01); G06F 11/07 (2006.01); G06F 11/30 (2006.01);
U.S. Cl.
CPC ...
G06F 11/079 (2013.01); G06F 11/008 (2013.01); G06F 11/0709 (2013.01); G06F 11/0721 (2013.01); G06F 11/0748 (2013.01); G06F 11/0751 (2013.01); G06F 11/0772 (2013.01); G06F 11/0781 (2013.01); G06F 11/0793 (2013.01); G06F 11/3006 (2013.01);
Abstract

Systems, apparatuses, and/or methods may manage a fault condition in a computer system. An apparatus may dynamically publish a message over a publisher-subscriber system and dynamically subscribe to a message over the publisher-subscriber system, wherein at least one message may be used to address a fault condition in the computing system. The apparatus may predict a fault condition in a high performance computing (HPC) system, communicate fault information to a user, monitor health of the HPC system, respond to the fault condition in the HPC system, recover from the fault condition in the HPC system, maintain a rule for a fault management component, and/or communicate the fault information over the publisher-subscriber system in real-time. Messages may also be aggregated to minimize fault information traffic. The publisher-subscriber system may facilitate dynamic and/or real-time coordinated, integrated (e.g., system-wide), and/or scalable fault management.
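The publisher-subscriber flow described in the abstract can be illustrated with a minimal in-process sketch. This is not the patented implementation; the names `FaultMessage`, `Broker`, and `aggregate`, the topic strings, and the node identifiers are all hypothetical and chosen only for illustration. The sketch shows components subscribing and unsubscribing at run time, a publisher delivering fault messages by topic, and repeated reports being collapsed into counts to reduce fault information traffic.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class FaultMessage:
    """Hypothetical fault report; field names are assumptions, not from the patent."""
    topic: str   # e.g. "fault/memory" (illustrative topic name)
    node: str    # e.g. "node-1" (illustrative node identifier)
    detail: str  # free-text description of the fault


Handler = Callable[[FaultMessage], None]


class Broker:
    """Illustrative in-process publisher-subscriber hub."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Components may register interest in a topic at any time.
        self._subscribers[topic].append(handler)

    def unsubscribe(self, topic: str, handler: Handler) -> None:
        # Components may also withdraw interest at any time.
        self._subscribers[topic].remove(handler)

    def publish(self, message: FaultMessage) -> int:
        # Deliver the message to every current subscriber of its topic
        # and return how many handlers received it.
        handlers = list(self._subscribers.get(message.topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


def aggregate(messages: List[FaultMessage]) -> List[Tuple[FaultMessage, int]]:
    """Collapse identical fault reports into (message, count) pairs,
    preserving the order in which each distinct report first appeared."""
    counts: Dict[FaultMessage, int] = {}
    for message in messages:
        counts[message] = counts.get(message, 0) + 1
    return list(counts.items())
```

In this sketch a monitoring component could buffer incoming reports, call `aggregate` on the buffer, and publish one summary per distinct fault rather than forwarding every duplicate.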

