The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Jul. 08, 2008

Filed:

Apr. 27, 2005
Applicant:

Erik S. Ruf, Kirkland, WA (US);

Inventor:

Erik S. Ruf, Kirkland, WA (US);

Assignee:

Microsoft Corporation, Redmond, WA (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 17/50 (2006.01); G06F 12/00 (2006.01); G06F 13/00 (2006.01); G06F 13/28 (2006.01);
U.S. Cl.
CPC ...
Abstract

A schedule can be generated for physically transposing an array such that when the array is transferred from a first memory type to a second memory type, the number of block transfers performed is minimized. The array can be rearranged to ensure that most or all data elements in any block read into internal memory are used before that block's internal storage is reused. The algorithm can compute an offline schedule and then execute that schedule. The method can assemble the schedule during one or more passes with an algorithm. Scheduling passes can apply a permutation to a representation of the array's structure and then tile the representation to ensure efficient use of internal memory. Tiling may alter the permutation, so the algorithm can be reinvoked and run on the tiled representation. The algorithm can be run on successively retiled representations until no additional tiling is required. Optimizations for schedule generation and/or execution include: simplification of the data structure representing the array; alteration of the order in which tiles are accessed; inversion of permutations; processing non-canonical input; and the production of stream output. The method can also be modified for specialized architectures, including: block I/O architecture; fully associative caches; set-associative caches; and multi-level memory hierarchies.


Find Patent Forward Citations

Loading…