The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Dec. 25, 2007
Filed:
Feb. 25, 2002
Applicants:
Gyan Bhanot, Princeton, NJ (US);
Matthias A. Blumrich, Ridgefield, CT (US);
Dong Chen, Croton On Hudson, NY (US);
Alan G. Gara, Mount Kisco, NY (US);
Mark E. Giampapa, Irvington, NY (US);
Philip Heidelberger, Cortlandt Manor, NY (US);
Burkhard D. Steinmacher-Burow, Mount Kisco, NY (US);
Pavlos M. Vranas, Bedford Hills, NY (US);
Inventors:
Gyan Bhanot, Princeton, NJ (US);
Matthias A. Blumrich, Ridgefield, CT (US);
Dong Chen, Croton On Hudson, NY (US);
Alan G. Gara, Mount Kisco, NY (US);
Mark E. Giampapa, Irvington, NY (US);
Philip Heidelberger, Cortlandt Manor, NY (US);
Burkhard D. Steinmacher-Burow, Mount Kisco, NY (US);
Pavlos M. Vranas, Bedford Hills, NY (US);
Assignee:
International Business Machines Corporation, Armonk, NY (US);
Abstract
Methods and systems for performing arithmetic functions. In accordance with a first aspect of the invention, methods and apparatus are provided that combine software algorithms with a hardware implementation of class network routing to achieve a very significant reduction in the time required for global arithmetic operations on the torus. This leads to greater scalability of applications running on large parallel machines. The invention involves three steps in improving the efficiency and accuracy of global operations: (1) ensuring, when necessary, that all the nodes perform the global operation on the data in the same order and so obtain a unique answer, independent of roundoff error; (2) using the topology of the torus to minimize the number of hops, and the bidirectional capabilities of the network to reduce the number of time steps in the data transfer operation to an absolute minimum; and (3) using class function routing to reduce latency in the data transfer. With the method of this invention, every element is injected into the network only once and is then stored and forwarded without any further software overhead. In accordance with a second aspect of the invention, methods and systems are provided to efficiently implement global arithmetic operations on a network that supports global combining operations. The latency of such global operations is greatly reduced by using these methods.
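Steps (1) and (2) of the abstract can be illustrated with a small Python sketch. It is not the patented method or any IBM implementation. The function names (`ordered_global_sum`, `ring_hops`, `torus_hops`) and the sample values are hypothetical and chosen only for illustration. The first function shows why summing floating-point contributions in one fixed order yields the same answer on every node. The other two count the minimum hops on a torus whose links can be traversed in both directions.

```python
from typing import Sequence


def ordered_global_sum(contributions: Sequence[float]) -> float:
    """Sum contributions strictly in the order given (e.g. by node rank).

    Floating-point addition is not associative, so every node must use
    the same order to arrive at a bit-identical result.
    """
    total = 0.0
    for value in contributions:
        total += value
    return total


def ring_hops(src: int, dst: int, size: int) -> int:
    """Minimum hops between two positions on a bidirectional ring."""
    forward = (dst - src) % size
    # A bidirectional link lets the data take the shorter direction.
    return min(forward, size - forward)


def torus_hops(src: Sequence[int], dst: Sequence[int], dims: Sequence[int]) -> int:
    """Minimum hops on a torus: sum of the per-dimension ring distances."""
    return sum(ring_hops(s, d, n) for s, d, n in zip(src, dst, dims))


if __name__ == "__main__":
    # Same four values, two different orders, two different sums.
    print(ordered_global_sum([1e16, 1.0, -1e16, 1.0]))   # 1.0
    print(ordered_global_sum([1e16, -1e16, 1.0, 1.0]))   # 2.0

    # On an 8 x 8 x 8 torus, wrapping around shortens the first dimension.
    print(torus_hops((0, 0, 0), (7, 4, 1), (8, 8, 8)))   # 6
```

The two sums differ because adding 1.0 to 1e16 is lost to rounding in double precision, which is the roundoff dependence that a fixed operation order removes.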