The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Nov. 15, 2022

Filed:

Jul. 30, 2018
Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Tu-Hoa Pham, Tokyo, JP;

Don Joven Ravoy Agravante, Tokyo, JP;

Giovanni De Magistris, Tokyo, JP;

Ryuki Tachibana, Yokohama, JP;

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/08 (2006.01); G06N 3/04 (2006.01);
U.S. Cl.
CPC ...
G06N 3/08 (2013.01); G06N 3/04 (2013.01);
Abstract

A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.


Find Patent Forward Citations

Loading…