The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Apr. 14, 2025

Filed:
Sep. 28, 2020

Applicants:

Sony Corporation, Tokyo, JP;

Sony Corporation of America, New York, NY (US);

Inventors:

Varun Kompella, Kanata, CA;

James MacGlashan, Riverside, RI (US);

Peter Wurman, Acton, MA (US);

Peter Stone, Austin, TX (US);

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06N 7/00 (2022.12); G06F 18/21 (2022.12); G06F 18/214 (2022.12); G06N 20/00 (2018.12);
U.S. Cl.
CPC ...
G06F 18/2178 (2022.12); G06F 18/214 (2022.12); G06N 20/00 (2018.12);
Abstract

A task prioritized experience replay (TaPER) algorithm enables simultaneous, off-policy learning of multiple reinforcement learning (RL) tasks. The algorithm can prioritize samples that were part of fixed-length episodes that led to the achievement of tasks. This enables the agent to quickly learn task policies by bootstrapping over its early successes. TaPER can also improve performance on all tasks simultaneously, which is a desirable characteristic for multi-task RL. Conventional experience replay (ER) algorithms are applied to single-task RL settings, require rewards to be binary or abundant, or require goals to be provided as a parameterized specification; TaPER poses no such restrictions and supports arbitrary reward and task specifications.
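The prioritization idea described in the abstract can be illustrated with a small Python sketch. This is not the patented method: the class name `TaskPrioritizedBuffer`, the parameters `success_weight` and `base_weight`, and the rule that an episode is prioritized when it achieved at least one task are all assumptions made for illustration. The sketch only shows how transitions from task-achieving episodes might be sampled more often than others.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    state: object
    action: object
    rewards: dict       # task name -> reward; arbitrary per-task values
    next_state: object


class TaskPrioritizedBuffer:
    """Toy replay buffer (hypothetical): successful episodes get more weight."""

    def __init__(self, success_weight=5.0, base_weight=1.0, seed=None):
        # Both weights are illustrative assumptions, not values from the patent.
        self.success_weight = success_weight
        self.base_weight = base_weight
        self.transitions = []
        self.weights = []
        self.rng = random.Random(seed)

    def add_episode(self, episode, achieved_tasks):
        # Assumed rule: every transition of an episode that achieved at
        # least one task receives the higher sampling weight.
        weight = self.success_weight if achieved_tasks else self.base_weight
        for transition in episode:
            self.transitions.append(transition)
            self.weights.append(weight)

    def sample(self, batch_size):
        # Weighted sampling with replacement over all stored transitions.
        return self.rng.choices(
            self.transitions, weights=self.weights, k=batch_size
        )
```

In use, one would call `add_episode(episode, achieved_tasks)` once per fixed-length episode, passing the set of tasks that episode achieved, and then draw training batches with `sample(batch_size)` that are shared by the off-policy learners for every task.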

