The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Patent No.:

US 12124230 B1

Date of Patent:

Oct. 22, 2024

Filed:

Dec. 10, 2021

System and method for polytopic policy optimization for robust feedback control during learning

Applicant:

Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US);

Inventors:

Devesh Jha, Cambridge, MA (US);

Ankush Chakrabarty, Cambridge, MA (US);

Assignee:

Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US);

Attorney:

Gene Vinokur

Primary Examiner:

Christopher E Everett

Int. Cl.

G05B 13/04 (2006.01); G05B 13/02 (2006.01); G06F 30/27 (2020.01); G06N 7/01 (2023.01);

U.S. Cl.

CPC: G05B 13/04 (2013.01); G05B 13/0265 (2013.01); G06F 30/27 (2020.01); G06N 7/01 (2023.01);

Abstract

A controller is provided for generating a policy controlling a system by learning the dynamics of the system. The controller is configured to perform steps of acquiring measurement data from sensors arranged on the system, providing, to the memory, a non-linear system model represented by a known part of the dynamics of the system and an unknown part of the dynamics of the system, collecting states of the system by measuring the dynamics of the system using the sensors of the system based on a nominal policy and a noise term with respect to the states, estimating a sequence of sets of states of the system and sets of control inputs by collecting data of the system, wherein the data includes a collection of system states, applied control inputs and changes in system states, wherein each of the control inputs is computed by the nominal policy and the additional noise term, learning a polytopic system by use of the collected data of the system for approximating the unknown part of the dynamics of the system using a linear probabilistic regression model, estimating an attractor basin of a terminal controller by sampling initial states in a neighborhood of a terminal state and estimating the attractor basin by supervised learning, and generating a polytopic policy using the estimated polytopic system to drive the system to the attractor basin of the terminal controller from an initial state.
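The data-collection and regression steps named in the abstract can be illustrated with a small sketch. This is not the patented method: the abstract does not give the model details, so everything below is an assumption chosen for illustration. The toy two-state system, the matrices `A`, `B`, `delta`, and `K`, the Bayesian linear regression with prior precision `alpha` and noise precision `beta`, and the element-wise "mean plus or minus `k` standard deviations" bounds standing in for a polytope are all hypothetical.

```python
import numpy as np


def collect_data(A, B, delta, K, n_steps, noise_std, rng):
    """Roll out the nominal policy u = -K x plus exploration noise.

    Returns features Z = [x, u] and residuals R = x_next - (A x + B u),
    i.e. the part of the state change not explained by the known model.
    The "true" system here is a toy stand-in (assumption).
    """
    x = np.array([1.0, 0.0])
    Z, R = [], []
    for _ in range(n_steps):
        u = -K @ x + noise_std * rng.standard_normal(B.shape[1])
        x_next = A @ x + B @ u + delta @ x  # known part + unknown part
        Z.append(np.concatenate([x, u]))
        R.append(x_next - (A @ x + B @ u))
        x = x_next
    return np.array(Z), np.array(R)


def fit_bayesian_linear(Z, R, alpha=1.0, beta=1e4):
    """Bayesian linear regression R ~ Z @ W with a Gaussian prior on W.

    alpha: prior precision, beta: assumed noise precision (both assumptions).
    Returns the posterior mean M and a per-feature posterior std.
    """
    n_feat = Z.shape[1]
    S = np.linalg.inv(alpha * np.eye(n_feat) + beta * Z.T @ Z)
    M = beta * S @ Z.T @ R
    std = np.sqrt(np.diag(S))
    return M, std


def polytope_bounds(M, std, k=3.0):
    """Element-wise lower/upper bounds on W; a crude stand-in for the
    vertices of a polytopic model of the unknown dynamics."""
    spread = k * std[:, None]
    return M - spread, M + spread
```

A usage example under the same assumptions: build the toy system, collect 200 steps of data, and compute the bounds.

```python
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])         # known dynamics (assumed)
B = np.array([[0.0], [0.1]])
delta = np.array([[0.0, 0.0], [-0.05, -0.02]])  # unknown part (assumed)
K = np.array([[1.0, 1.5]])                      # nominal policy gain (assumed)

Z, R = collect_data(A, B, delta, K, n_steps=200, noise_std=0.1, rng=rng)
M, std = fit_bayesian_linear(Z, R)
lower, upper = polytope_bounds(M, std)
```

In a fuller treatment the bounds would feed a robust controller synthesis step, which this sketch leaves out.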
