The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (PDF).

Date of Patent:
Nov. 12, 2024

Filed:
Jun. 25, 2020

Applicant:
Deepmind Technologies Limited, London, GB;

Inventors:
David Silver, Hitchin, GB;
Tom Schaul, London, GB;
Matteo Hessel, London, GB;
Hado Philip van Hasselt, London, GB;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/045 (2023.01); G05B 13/02 (2006.01); G06N 3/006 (2023.01); G06N 3/044 (2023.01); G06N 3/047 (2023.01); G06N 3/08 (2023.01); G06N 3/10 (2006.01); G06T 1/20 (2006.01); G06N 3/084 (2023.01);
U.S. Cl.
CPC ...
G06N 3/045 (2023.01); G05B 13/027 (2013.01); G06N 3/006 (2013.01); G06N 3/044 (2023.01); G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06N 3/10 (2013.01); G06T 1/20 (2013.01); G06N 3/084 (2013.01);
Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for prediction of an outcome related to an environment. In one aspect, a system comprises a state representation neural network that is configured to: receive an observation characterizing a state of an environment being interacted with by an agent and process the observation to generate an internal state representation of the environment state; a prediction neural network that is configured to receive a current internal state representation of a current environment state and process the current internal state representation to generate a predicted subsequent state representation of a subsequent state of the environment and a predicted reward for the subsequent state; and a value prediction neural network that is configured to receive a current internal state representation of a current environment state and process the current internal state representation to generate a value prediction.
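The three components named in the abstract can be pictured with a toy sketch. This is only an illustration under stated assumptions, not the patented implementation: the class name `ToyPredictor`, the single linear layers, the ReLU activation, the layer sizes, and the random initialization are all hypothetical choices made here for brevity. The abstract specifies only what each network receives and produces.

```python
import random


def make_linear(n_in, n_out, rng):
    """Create a toy linear layer as a (weights, bias) pair (hypothetical helper)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a linear layer to a vector given as a list of floats."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise ReLU; the choice of activation is an assumption."""
    return [max(0.0, x) for x in v]


class ToyPredictor:
    """Illustrative stand-in for the three networks listed in the abstract."""

    def __init__(self, obs_dim, state_dim, seed=0):
        rng = random.Random(seed)
        # State representation network: observation -> internal state.
        self.state_repr = make_linear(obs_dim, state_dim, rng)
        # Prediction network: internal state -> (next internal state, reward).
        self.next_state = make_linear(state_dim, state_dim, rng)
        self.reward = make_linear(state_dim, 1, rng)
        # Value prediction network: internal state -> scalar value.
        self.value = make_linear(state_dim, 1, rng)

    def represent(self, observation):
        """Map an observation to an internal state representation."""
        return relu(linear(self.state_repr, observation))

    def predict(self, state):
        """Return a predicted subsequent state representation and a predicted reward."""
        next_state = relu(linear(self.next_state, state))
        reward = linear(self.reward, state)[0]
        return next_state, reward

    def value_of(self, state):
        """Return a value prediction for an internal state representation."""
        return linear(self.value, state)[0]


# Hypothetical usage with a made-up 4-dimensional observation.
model = ToyPredictor(obs_dim=4, state_dim=3)
state = model.represent([1.0, 0.5, -0.5, 2.0])
next_state, reward = model.predict(state)
value = model.value_of(state)
```

Note that the prediction and value functions operate on the internal state representation rather than on raw observations, which is the division of labor the abstract describes.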
