The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Patent No.:

US 12112198 B1

Date of Patent:

Oct. 08, 2024

Filed:

Dec. 15, 2022

Title:

Asynchronous distributed data flow for machine learning workloads

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Jeffrey Adgate Dean, Palo Alto, CA (US);

Sudip Roy, San Jose, CA (US);

Michael Acheson Isard, San Francisco, CA (US);

Aakanksha Chowdhery, Mountain View, CA (US);

Brennan Saeta, Kirkland, WA (US);

Chandramohan Amyangot Thekkath, Palo Alto, CA (US);

Daniel William Hurt, Westminster, CO (US);

Hyeontaek Lim, Palo Alto, CA (US);

Laurent El Shafey, Mountain View, CA (US);

Parker Edward Schuh, Mountain View, CA (US);

Paul Ronald Barham, San Francisco, CA (US);

Ruoming Pang, New York, NY (US);

Ryan Sepassi, Palo Alto, CA (US);

Sanjay Ghemawat, Mountain View, CA (US);

Yonghui Wu, Fremont, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorney:

Fish & Richardson P.C.

Primary Examiner:

Cheng Yuan Tseng

Int. Cl.

CPC ...

G06F 17/10 (2006.01); G06F 9/48 (2006.01); G06N 3/063 (2023.01); G06N 3/08 (2023.01);

U.S. Cl.

CPC ...

G06F 9/4881 (2013.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01);

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.
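
The arrangement described in the abstract (accelerator islands, one scheduler per island, and a system that hands each island a portion of the workload) can be illustrated with a minimal sketch. This is not the patented implementation: the class names, the contiguous split of the workload, and the round-robin placement inside each island are all assumptions chosen only to make the structure concrete.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Accelerator:
    name: str
    host: str  # the corresponding host for this accelerator


@dataclass
class IslandScheduler:
    """Hypothetical per-island scheduler (round-robin is an assumption)."""

    accelerators: list[Accelerator]

    def schedule(self, tasks: list[str]) -> dict[str, list[str]]:
        # Place each task on the next accelerator in turn.
        placement: dict[str, list[str]] = {a.name: [] for a in self.accelerators}
        for i, task in enumerate(tasks):
            target = self.accelerators[i % len(self.accelerators)]
            placement[target.name].append(task)
        return placement


@dataclass
class AcceleratorIsland:
    name: str
    accelerators: list[Accelerator]
    scheduler: IslandScheduler = field(init=False)

    def __post_init__(self) -> None:
        # Each island owns its own scheduler, as in the abstract.
        self.scheduler = IslandScheduler(self.accelerators)


def assign_workload(
    workload: list[str], islands: list[AcceleratorIsland]
) -> dict[str, dict[str, list[str]]]:
    """Give each island a contiguous portion of the workload (an assumed
    splitting policy), then let that island's scheduler place the portion."""
    base, extra = divmod(len(workload), len(islands))
    result: dict[str, dict[str, list[str]]] = {}
    start = 0
    for i, island in enumerate(islands):
        size = base + (1 if i < extra else 0)
        portion = workload[start:start + size]
        start += size
        result[island.name] = island.scheduler.schedule(portion)
    return result


islands = [
    AcceleratorIsland("island_a", [Accelerator("a0", "host_a0"), Accelerator("a1", "host_a1")]),
    AcceleratorIsland("island_b", [Accelerator("b0", "host_b0"), Accelerator("b1", "host_b1")]),
]
plan = assign_workload(["t0", "t1", "t2", "t3", "t4"], islands)
```

Here `plan` maps each island name to the per-accelerator placement chosen by that island's own scheduler. The top-level function never touches individual accelerators, which mirrors the separation between workload assignment and per-island scheduling stated in the abstract.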