The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Aug. 17, 2021

Filed:

Apr. 29, 2019

Applicant:

Honda Motor Co., Ltd., Tokyo, JP;

Inventors:

Yeping Hu, Albany, CA (US);

Alireza Nakhaei Sarvedani, Sunnyvale, CA (US);

Masayoshi Tomizuka, Berkeley, CA (US);

Kikuo Fujimura, Palo Alto, CA (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/08 (2006.01); G05D 1/00 (2006.01); B60W 10/04 (2006.01); B60W 10/18 (2012.01); B60W 10/20 (2006.01); B60W 30/18 (2012.01); B60W 50/00 (2006.01);
U.S. Cl.
CPC ...
G06N 3/08 (2013.01); B60W 10/04 (2013.01); B60W 10/18 (2013.01); B60W 10/20 (2013.01); B60W 30/18109 (2013.01); B60W 30/18163 (2013.01); B60W 50/00 (2013.01); G05D 1/0088 (2013.01); B60W 2050/0014 (2013.01); B60W 2556/00 (2020.02); B60W 2710/18 (2013.01); B60W 2710/20 (2013.01); B60W 2720/106 (2013.01);
Abstract

Interaction-aware decision making may include training a first agent based on a first policy gradient, training a first critic based on a first loss function to learn goals in a single-agent environment using a Markov decision process, training a number N of agents based on the first policy gradient, training a second policy gradient and a second critic based on the first loss function and a second loss function to learn goals in a multi-agent environment using a Markov game to instantiate a second agent neural network, and generating an interaction-aware decision making network policy based on the first agent neural network and the second agent neural network. The N number of agents may be associated with a driver type indicative of a level of cooperation. When a collision occurs, a negative reward or penalty may be assigned to each agent involved based on a lane priority level of respective agents.
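The collision penalty mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not state how lane priority maps to the size of the penalty, so everything below is an assumption made for illustration: the `Agent` class, the `collision_penalties` function, the `base_penalty` and `yield_factor` parameters, and the rule that agents without the highest lane priority are penalized more heavily. It is not the patented method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent record; not taken from the patent."""
    name: str
    lane_priority: int  # assumption: larger value = higher right of way


def collision_penalties(agents, base_penalty=-10.0, yield_factor=2.0):
    """Return a negative reward for each agent involved in one collision.

    Assumed rule (illustrative only): agents holding the highest lane
    priority among those involved receive base_penalty; every other
    agent receives base_penalty * yield_factor, a larger penalty.
    """
    if not agents:
        return {}
    top_priority = max(a.lane_priority for a in agents)
    return {
        a.name: base_penalty
        if a.lane_priority == top_priority
        else base_penalty * yield_factor
        for a in agents
    }


if __name__ == "__main__":
    involved = [Agent("merging", 0), Agent("main_lane", 1)]
    print(collision_penalties(involved))
```

In this sketch every agent involved in the collision receives a negative reward, matching the abstract's wording, and only the magnitude depends on lane priority. A different mapping (for example, one proportional to the priority gap) would fit the abstract equally well.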

