The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 21, 2023

Filed:

May 07, 2018
Applicant:

Telefonaktiebolaget LM Ericsson (publ), Stockholm, SE;

Inventors:

Wei Huang, Sollentuna, SE;

Wenfeng Hu, Täby, SE;

Tobias Ley, Täby, SE;

Martha Vlachou-Konchylaki, Stockholm, SE;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 3/00 (2006.01); G06N 3/08 (2023.01); G06N 3/04 (2023.01); G06N 3/10 (2006.01); G06N 3/084 (2023.01);
U.S. Cl.
CPC ...
G06N 3/08 (2013.01); G06N 3/0454 (2013.01); G06N 3/0472 (2013.01); G06N 3/084 (2013.01); G06N 3/105 (2013.01);
Abstract

A pre-training apparatus and method for reinforcement learning based on a Generative Adversarial Network (GAN) are provided. The GAN includes a generator and a discriminator. The method comprises: receiving training data from a real environment, where the training data includes a data slice corresponding to a first state-reward pair and a first state-action pair; training the GAN using the training data; training a relations network to extract a latent relationship of the first state-action pair with the first state-reward pair in a reinforcement learning context; causing the generator trained with the training data to generate first synthetic data; processing a portion of the first synthetic data in the relations network to generate a resulting data slice; and merging the second state-action pair portion of the first synthetic data with the second state-reward pair from the relations network to generate second synthetic data, which is used to update a policy for interaction with the real environment.
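
The data flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline under several assumptions and is not the patented implementation: the slice layout (state, action, next state, reward), the toy environment, and every class and function name are hypothetical. The trained GAN generator is replaced by a placeholder that jitters real slices, and the relations network is replaced by a nearest-neighbour lookup. The final policy update is not shown.

```python
import random

# Assumed (hypothetical) slice layout: (state, action, next_state, reward).
# State-action pair = (state, action); state-reward pair = (next_state, reward).

def real_slice(rng):
    """Toy stand-in for the real environment: returns one data slice."""
    s = rng.uniform(-1.0, 1.0)
    a = rng.choice([-1.0, 1.0])
    s_next = s + 0.1 * a
    return (s, a, s_next, -abs(s_next))

class PlaceholderGenerator:
    """Stand-in for a trained GAN generator (NOT a GAN): jitters real slices."""
    def __init__(self, real_slices):
        self.real_slices = list(real_slices)

    def generate(self, n, rng):
        out = []
        for _ in range(n):
            s, a, s_next, r = rng.choice(self.real_slices)
            out.append((s + rng.gauss(0, 0.05), a,
                        s_next + rng.gauss(0, 0.05), r + rng.gauss(0, 0.05)))
        return out

class PlaceholderRelations:
    """Stand-in for the relations network: 1-nearest-neighbour lookup that maps
    a state-action pair to the state-reward pair of the closest real slice."""
    def __init__(self, real_slices):
        self.real_slices = list(real_slices)

    def predict(self, s, a):
        nearest = min(self.real_slices,
                      key=lambda sl: (sl[0] - s) ** 2 + (sl[1] - a) ** 2)
        return nearest[2], nearest[3]

def make_second_synthetic(real_slices, n, rng):
    generator = PlaceholderGenerator(real_slices)   # "train the GAN" (placeholder)
    relations = PlaceholderRelations(real_slices)   # "train the relations network" (placeholder)
    first_synthetic = generator.generate(n, rng)    # first synthetic data
    second_synthetic = []
    for s, a, _, _ in first_synthetic:
        # Only the state-action portion is passed to the relations stand-in.
        s_next, r = relations.predict(s, a)
        # Merge the generated state-action pair with the predicted state-reward pair.
        second_synthetic.append((s, a, s_next, r))
    return second_synthetic

rng = random.Random(0)
real = [real_slice(rng) for _ in range(200)]
synthetic = make_second_synthetic(real, 50, rng)
```

In this sketch, `synthetic` plays the role of the second synthetic data that would then be used to pre-train or update a policy; how that update is carried out is left open here.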

