The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Oct. 11, 2022

Filed:
Jun. 19, 2018

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Subhajit Chaudhury, Kawasaki, JP;

Daiki Kimura, Tokyo, JP;

Tadanobu Inoue, Yokohama, JP;

Ryuki Tachibana, Yokohama, JP;

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/08 (2006.01); G06N 3/04 (2006.01);
U.S. Cl.
CPC ...
G06N 3/084 (2013.01); G06N 3/049 (2013.01); G06N 3/0472 (2013.01); G06N 3/0481 (2013.01);
Abstract

A computer-implemented method is provided for learning an action policy. The method includes obtaining, by a processor, environment dynamics including triplets of a state, an action, and a next state. The state in each of the triplets is an expert state. The method further includes training, by the processor using the environment dynamics as training data, a dynamics model which obtains a pair of the state and the action as an input and outputs, for each next state, state-transition probabilities. The method also includes learning, by the processor, the action policy using trajectories of expert states according to a supervised learning technique by back-propagating error gradients through the trained dynamics model.
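The two stages described in the abstract can be illustrated with a small sketch. The abstract does not specify the model architecture, loss, or optimizer, so everything below is an assumption made for illustration: a five-state chain environment (the hypothetical `step` function), tabular softmax logits for both the dynamics model and the policy, a negative log-likelihood loss, and hand-written gradient updates. It is a toy under those assumptions, not the patented implementation.

```python
import math

N_STATES, N_ACTIONS = 5, 2  # assumed toy chain; action 0 = left, 1 = right


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Hypothetical ground-truth environment, used only to generate triplets."""
    move = 1 if action == 1 else -1
    return min(N_STATES - 1, max(0, state + move))


def train_dynamics(triplets, epochs=200, lr=0.5):
    """Fit tabular logits W[s][a][s'] by cross-entropy on (s, a, s') triplets."""
    W = [[[0.0] * N_STATES for _ in range(N_ACTIONS)] for _ in range(N_STATES)]
    for _ in range(epochs):
        for s, a, s_next in triplets:
            probs = softmax(W[s][a])
            for k in range(N_STATES):
                target = 1.0 if k == s_next else 0.0
                W[s][a][k] -= lr * (probs[k] - target)
    return W


def learn_policy(W, expert_states, epochs=200, lr=0.5):
    """Learn policy logits theta[s][a] from consecutive expert states only."""
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    pairs = list(zip(expert_states[:-1], expert_states[1:]))
    for _ in range(epochs):
        for s, s_next in pairs:
            pi = softmax(theta[s])
            # Model probability of reaching the expert next state per action.
            t = [softmax(W[s][a])[s_next] for a in range(N_ACTIONS)]
            # Predicted probability of the expert next state under the policy.
            p = sum(pi[a] * t[a] for a in range(N_ACTIONS))
            # Gradient of -log(p) w.r.t. theta[s][b]; the error signal flows
            # through the frozen dynamics model (the t terms) into the policy.
            for b in range(N_ACTIONS):
                grad = -pi[b] * (t[b] - p) / p
                theta[s][b] -= lr * grad
    return theta


# Expert trajectory of states only (no expert actions are observed).
expert_states = [0, 1, 2, 3, 4]

# Triplets whose starting state is always an expert state.
triplets = [(s, a, step(s, a)) for s in expert_states for a in range(N_ACTIONS)]

W = train_dynamics(triplets)
theta = learn_policy(W, expert_states)

# Greedy action at each expert state that has a successor in the trajectory.
greedy = [max(range(N_ACTIONS), key=lambda a: theta[s][a])
          for s in expert_states[:-1]]
print(greedy)
```

In this toy setting the learned policy should prefer action 1 (move right) at states 0 through 3, since that is the only action the dynamics model associates with the expert's next state. The dynamics logits `W` are never updated during policy learning, which mirrors the abstract's use of an already trained dynamics model.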

