
The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract, and it links to the full patent document in PDF format.

Date of Patent:

Nov. 25, 2025

Filed:

Jun. 17, 2024

Applicant:

Amazon Technologies, Inc., Seattle, WA (US);

Inventors:

Ahmet Emre Barut, Boston, MA (US);

Chengwei Su, Belmont, MA (US);

Weitong Ruan, Revere, MA (US);

Wael Hamza, Yorktown Heights, NY (US);

Assignee:

AMAZON TECHNOLOGIES, INC., Seattle, WA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/00 (2019.01); G06F 16/532 (2019.01); G06F 16/583 (2019.01); G06F 16/9032 (2019.01); G06V 20/20 (2022.01); G06N 20/00 (2019.01);
U.S. Cl.
CPC ...
G06F 16/90332 (2019.01); G06F 16/532 (2019.01); G06F 16/583 (2019.01); G06V 20/20 (2022.01); G06N 20/00 (2019.01);
Abstract

Devices and techniques are generally described for selection of objects in image data using natural language input. In various examples, first image data representing at least a first object and first natural language data may be received. In some examples, first embedding data representing the first natural language data may be generated. Second embedding data representing the first image data may be generated. Relative location data indicating a location of the first object in the first image data relative to at least one other object may be generated. The first embedding data, the second embedding data, and the relative location data may be input into a multi-modal transformer model. The multi-modal transformer model may determine that the first natural language data relates to the first object.
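The abstract does not say how the relative location data is encoded. As a rough illustration only, the sketch below assumes each object is described by an axis-aligned bounding box and encodes the target object's position as normalized center offsets to every other object. The `Box` class, the `relative_location` function, and the offset encoding are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (assumed input format)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self) -> Tuple[float, float]:
        # Midpoint of the box along each axis.
        return ((self.x_min + self.x_max) / 2.0,
                (self.y_min + self.y_max) / 2.0)


def relative_location(target: Box, others: List[Box],
                      width: float, height: float) -> List[Tuple[float, float]]:
    """Return one (dx, dy) offset per other object, scaled by image size.

    dx < 0: the target's center is left of the other object's center.
    dy < 0: the target's center is above it (image y grows downward).
    """
    tx, ty = target.center
    features = []
    for other in others:
        ox, oy = other.center
        features.append(((tx - ox) / width, (ty - oy) / height))
    return features


if __name__ == "__main__":
    # Two made-up objects in a 100x100 image.
    cup = Box("cup", 0, 0, 20, 20)
    lamp = Box("lamp", 80, 0, 100, 20)
    # The cup is left of the lamp and at the same height,
    # so dx is negative and dy is zero.
    print(relative_location(cup, [lamp], width=100, height=100))
```

In a pipeline like the one the abstract describes, features of this kind would be passed to the multi-modal transformer model together with the text and image embeddings. How those inputs are combined is not specified here.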

