The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Dec. 05, 2017

Filed:

Oct. 06, 2014

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Alexey Tsitkin, Petach Tikva, IL;

Segev E Wasserkrug, Haifa, IL;

Alexander Zadorojniy, Haifa, IL;

Sergey Zeltyn, Haifa, IL;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 7/60 (2006.01); G06F 17/10 (2006.01); G06N 7/00 (2006.01);
U.S. Cl.
CPC ...
G06N 7/005 (2013.01);
Abstract

A method for determining a variable near-optimal policy for a problem formulated as Markov Decision Process, the problem comprising at least one limited action entry, the limited action entry being an entry of an action of a finite set of actions limited in the number of times its value may be changed, the method comprising using at least one hardware processor for: receiving data elements with respect to the problem, the data elements comprising: (a) a finite set of states, (b) the finite set of actions, (c) a transition probabilities matrix determining transition probabilities between states of the finite set of states, once actions of the set of actions are performed; (d) an immediate cost function, wherein the value of the immediate cost function is determined for a pair of a state of the finite set of states and an action of the finite set of actions, and (e) a discount factor; updating one or more data elements of the received data elements relating to the at least one limited action entry, wherein the one or more data elements are selected from the group consisting of: the transition probabilities matrix, the immediate cost function and the discount factor, and wherein the updating is triggered by a change of a value of a limited action entry of the at least one limited action entry; and following the updating of the one or more data elements, calculating a current near-optimal policy for the problem based on the updated one or more data elements.
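The flow described in the abstract (receive the MDP data elements, update some of them when a limited action entry changes value, then recompute the policy) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not name a solver, so plain value iteration is swapped in; the two-state, two-action problem and its numbers are invented; and the helper `penalize_action` is a hypothetical way of modelling the update, here by raising the immediate cost of an action after its limited entry has changed.

```python
def value_iteration(P, cost, gamma, tol=1e-9, max_iter=10_000):
    """Solve a discounted, cost-minimizing MDP by value iteration.

    P[a][s][t] is the probability of moving from state s to state t under
    action a; cost[s][a] is the immediate cost; gamma is the discount factor.
    """
    n_states, n_actions = len(cost), len(P)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Expected discounted cost of each (state, action) pair.
        Q = [[cost[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [min(row) for row in Q]
        delta = max(abs(new - old) for new, old in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy: the cheapest action in each state.
    policy = [row.index(min(row)) for row in Q]
    return policy, V


def penalize_action(cost, action, penalty):
    """Hypothetical update step: return a new cost table in which `action`
    costs `penalty` more in every state."""
    return [[c + penalty if a == action else c for a, c in enumerate(row)]
            for row in cost]


# Invented toy problem. States: 0 = "good", 1 = "bad".
# Actions: 0 = "wait", 1 = "repair".
P = [
    [[0.9, 0.1], [0.0, 1.0]],  # transitions under "wait"
    [[1.0, 0.0], [0.8, 0.2]],  # transitions under "repair"
]
cost = [[0.0, 2.0], [5.0, 3.0]]  # cost[state][action]
gamma = 0.9

policy_before, value_before = value_iteration(P, cost, gamma)

# Assume the limited entry behind "repair" has changed value; the update
# is modelled here as a large extra cost on that action, then we re-solve.
updated_cost = penalize_action(cost, action=1, penalty=100.0)
policy_after, value_after = value_iteration(P, updated_cost, gamma)

print(policy_before, policy_after)
```

In this toy setting the policy repairs in the bad state before the update and waits in both states afterwards. The same re-solve pattern would apply if the update touched the transition probabilities matrix or the discount factor instead of the cost function.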

