The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jan. 11, 2022

Filed:

May 30, 2017
Applicant:

Xerox Corporation, Norwalk, CT (US);

Inventors:

Julien Perez, Grenoble, FR;

Tomi Silander, Grenoble, FR;

Assignee:

Xerox Corporation, Norwalk, CT (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/08 (2006.01); G05B 13/02 (2006.01); G06N 3/04 (2006.01); G06N 3/00 (2006.01); G06N 7/00 (2006.01); G06N 20/00 (2019.01);
U.S. Cl.
CPC ...
G06N 3/08 (2013.01); G05B 13/027 (2013.01); G06N 3/006 (2013.01); G06N 3/0445 (2013.01); G06N 3/0454 (2013.01); G06N 7/005 (2013.01); G06N 20/00 (2019.01);
Abstract

A system and method for predicting a sequence of actions employ a Gated End-to-End Memory Policy Network (GMemN2NP), which includes a sequence of hop(s). Supporting memories of the hops include memory cells generated from observations made at different times. A sequence of actions is predicted, based on input agent-specific variables. For each action, the model, at each hop, outputs an updated controller state which is used as input to the next hop or, for the terminal hop, for computing the respective action. Each hop includes a transform gate mechanism which is used to control the influence of output of the supporting memories on the updated controller state. For the second and subsequent hops, respective actions are predicted, after using any intervening observations to update the supporting memories. The model is learned, on a training set of observations, to optimize the cumulative reward of a sequence of two or more actions.
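The gating idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the update formula, so the form used here is an assumption: a highway-style transform gate t = sigmoid(W u + b) that mixes the memory output o with the previous controller state u as u' = o * t + u * (1 - t). All names (`gated_hop`, `in_memory`, `out_memory`, `W_gate`, `b_gate`) are hypothetical, and the computation of the action from the final controller state is left out because the abstract does not specify it.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_hop(u, in_memory, out_memory, W_gate, b_gate):
    """One hop (assumed form): attend over memory cells, then gate the result.

    u          -- controller state (list of floats)
    in_memory  -- memory cells used to score attention, one list per cell
    out_memory -- memory cells used to build the memory output
    W_gate, b_gate -- hypothetical parameters of the transform gate
    """
    # Attention weights: how well each memory cell matches the state.
    p = softmax([dot(cell, u) for cell in in_memory])
    # Memory output: attention-weighted sum of the output cells.
    o = [sum(p_i * cell[j] for p_i, cell in zip(p, out_memory))
         for j in range(len(u))]
    # Transform gate: values near 1 let the memory output through,
    # values near 0 keep the previous controller state.
    t = [sigmoid(dot(row, u) + b) for row, b in zip(W_gate, b_gate)]
    return [o_j * t_j + u_j * (1.0 - t_j) for o_j, t_j, u_j in zip(o, t, u)]


def run_hops(u, hops):
    """Feed the updated state of each hop into the next one."""
    for hop in hops:
        u = gated_hop(u, **hop)
    return u
```

In this sketch a strongly negative gate bias leaves the controller state almost unchanged, while a strongly positive bias replaces it with the memory output, which is one way to read "control the influence of output of the supporting memories on the updated controller state".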

