The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Dec. 15, 2020
Filed:
Sep. 29, 2016
Applicant:
Deepmind Technologies Limited, London, GB;
Inventors:
Thore Kurt Hartwig Graepel, Cambridge, GB;
Shih-Chieh Huang, London, GB;
David Silver, Hitchin, GB;
Arthur Clement Guez, London, GB;
Laurent Sifre, Paris, FR;
Ilya Sutskever, San Francisco, CA (US);
Christopher Maddison, Toronto, CA;
Assignee:
DeepMind Technologies Limited, Mountain View, CA (US);
Abstract
Methods, systems and apparatus, including computer programs encoded on computer storage media, for training a value neural network that is configured to receive an observation characterizing a state of an environment being interacted with by an agent and to process the observation in accordance with parameters of the value neural network to generate a value score. One of the systems performs operations that include training a supervised learning policy neural network; initializing initial values of parameters of a reinforcement learning policy neural network having a same architecture as the supervised learning policy network to the trained values of the parameters of the supervised learning policy neural network; training the reinforcement learning policy neural network on second training data; and training the value neural network to generate a value score for the state of the environment that represents a predicted long-term reward resulting from the environment being in the state.
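The order of the training stages named in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the five-state chain environment, the tabular "networks" standing in for neural networks, the REINFORCE-style policy update, the Monte Carlo value regression, and all function names and hyperparameters.

```python
import copy
import math
import random

N_STATES = 5        # hypothetical chain environment: states 0..4, state 4 is the goal
ACTIONS = (-1, 1)   # action 0 moves left, action 1 moves right
MAX_STEPS = 20


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def new_policy():
    # Tabular stand-in for a policy network: one logit per (state, action).
    return [[0.0, 0.0] for _ in range(N_STATES)]


def train_supervised_policy(policy, labeled_pairs, lr=0.5, epochs=50):
    # Stage 1: gradient ascent on log-likelihood of labeled (state, action) pairs.
    for _ in range(epochs):
        for s, a in labeled_pairs:
            probs = softmax(policy[s])
            for i in range(2):
                policy[s][i] += lr * ((1.0 if i == a else 0.0) - probs[i])
    return policy


def rollout(policy, rng, start=0):
    # Sample one episode; the reward is 1.0 if the goal is reached in time, else 0.0.
    s, trajectory = start, []
    for _ in range(MAX_STEPS):
        probs = softmax(policy[s])
        a = 0 if rng.random() < probs[0] else 1
        trajectory.append((s, a))
        s = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        if s == N_STATES - 1:
            return trajectory, 1.0
    return trajectory, 0.0


def train_rl_policy(policy, rng, episodes=200, lr=0.1):
    # Stage 3: REINFORCE-style update on self-generated episodes (an assumed algorithm).
    for _ in range(episodes):
        trajectory, reward = rollout(policy, rng)
        for s, a in trajectory:
            probs = softmax(policy[s])
            for i in range(2):
                policy[s][i] += lr * reward * ((1.0 if i == a else 0.0) - probs[i])
    return policy


def train_value(policy, rng, episodes=500, lr=0.05):
    # Stage 4: move each visited state's value toward the observed episode outcome.
    values = [0.0] * N_STATES
    for _ in range(episodes):
        start = rng.randrange(N_STATES - 1)
        trajectory, reward = rollout(policy, rng, start)
        for s, _ in trajectory:
            values[s] += lr * (reward - values[s])
    return values


rng = random.Random(0)
expert_pairs = [(s, 1) for s in range(N_STATES - 1)]  # made-up labels: always move right
sl_policy = train_supervised_policy(new_policy(), expert_pairs)
rl_policy = copy.deepcopy(sl_policy)   # Stage 2: same architecture, initialized from SL values
rl_init = copy.deepcopy(rl_policy)     # snapshot kept only to inspect the initialization
train_rl_policy(rl_policy, rng)
values = train_value(rl_policy, rng)
```

The deep copy in stage 2 matters: the reinforcement learning policy starts from the supervised parameters but is then updated separately, so the supervised policy stays unchanged. Each entry of `values` is a running estimate of the eventual reward from that state under the trained policy, a rough analogue of the "predicted long-term reward" in the abstract.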