The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 12153642 B1

Date of Patent:

Nov. 26, 2024

Filed:

Aug. 16, 2023

Automatic navigation of interactive web documents

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Aleksandra Faust, Palo Alto, CA (US);

Dilek Hakkani-Tur, Los Altos, CA (US);

Izzeddin Gur, Goleta, CA (US);

Ulrich Rueckert, San Francisco, CA (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:

Gray Ice Higdon

Primary Examiner:

Mark E Hershley

Int. Cl.

CPC ...

G06F 16/954 (2019.01); G06F 16/953 (2019.01); G06N 3/04 (2023.01);

U.S. Cl.

CPC ...

G06F 16/954 (2019.01); G06F 16/953 (2019.01); G06N 3/04 (2013.01);

Abstract

The present disclosure is generally directed to methods, apparatus, and computer-readable media (transitory and non-transitory) for learning to automatically navigate interactive web documents and/or websites. More particularly, various approaches are presented for training various deep Q network (DQN) agents to perform various tasks associated with reinforcement learning, including hierarchical reinforcement learning, in challenging web navigation environments with sparse rewards and large state and action spaces. These agents include a web navigation agent that can use learned value function(s) to automatically navigate through interactive web documents, as well as a training agent, referred to herein as a 'meta-trainer,' that can be trained to generate synthetic training examples. Some approaches described herein may be implemented when expert demonstrations are available. Other approaches described herein may be implemented when expert demonstrations are not available. In either case, dense, potential-based rewards may be used to augment the training.
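
The abstract's "dense, potential-based rewards" can be illustrated with the standard potential-based shaping form, in which a term gamma * phi(s') - phi(s) is added to the sparse environment reward. The sketch below is not taken from the patent: the potential function (the fraction of target form fields already filled correctly), the field names, and the discount value are all hypothetical, chosen only to show how a sparse web-navigation reward might be made dense.

```python
from typing import Mapping


def potential(filled: Mapping[str, str], target: Mapping[str, str]) -> float:
    """Hypothetical potential: fraction of target fields already filled correctly."""
    if not target:
        return 0.0
    matches = sum(1 for key, value in target.items() if filled.get(key) == value)
    return matches / len(target)


def shaped_reward(
    env_reward: float,
    prev_state: Mapping[str, str],
    next_state: Mapping[str, str],
    target: Mapping[str, str],
    gamma: float = 0.99,  # assumed discount factor
) -> float:
    """Add the potential-based shaping term gamma * phi(s') - phi(s) to the reward."""
    shaping = gamma * potential(next_state, target) - potential(prev_state, target)
    return env_reward + shaping


# Illustrative form-filling task with made-up field names.
target = {"from": "SFO", "to": "JFK"}
step_reward = shaped_reward(0.0, {}, {"from": "SFO"}, target)
```

Under these assumptions, an agent that fills one of two fields correctly receives a positive shaped reward even though the environment reward for that step is zero, while a step that makes no progress receives a small negative one.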
