The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

G06T 11/60 (2006.01); G06F 16/432 (2019.01); G06F 16/438 (2019.01); G06F 40/284 (2020.01); G06F 40/40 (2020.01); G06N 3/0475 (2023.01);

U.S. Cl.

CPC ...

G06T 11/60 (2013.01); G06F 16/432 (2019.01); G06F 16/438 (2019.01); G06F 40/284 (2020.01); G06F 40/40 (2020.01); G06N 3/0475 (2023.01);

Abstract

A method includes receiving, at a cross-attention layer of a model, first text data describing a first object and second text data describing a first scene, wherein the first text data includes a description of a location of the first object; utilizing the model with cross-attention layers, concatenating the first text data and the second text data to generate a prompt; generating a broadcasted location mask constructed from at least the location; generating a broadcasted all-one matrix associated with the second text data describing the first scene; computing a key matrix and a value matrix utilizing separate linear projections of the prompt; computing a query matrix utilizing linear projections; generating a broadcasted location matrix in response to concatenating the broadcasted location mask and the broadcasted all-one matrix; generating a cross-attention map utilizing the query matrix, the key matrix, and the broadcasted location matrix; and outputting a final image.
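
The steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the abstract does not say what the query is projected from, how the location is encoded, or how the broadcasted location matrix enters the attention computation. The sketch assumes that the query comes from flattened image features, that the location is a binary H x W region, and that the location matrix masks the attention logits before the softmax. The function name `location_cross_attention` and all parameter names are hypothetical.

```python
import numpy as np

def location_cross_attention(image_feats, obj_emb, scene_emb, region, Wq, Wk, Wv):
    """Sketch of location-masked cross-attention (all names are hypothetical).

    image_feats: (H*W, d_img) flattened image features (assumed query source)
    obj_emb:     (n_obj, d_txt) token embeddings of the object description
    scene_emb:   (n_scene, d_txt) token embeddings of the scene description
    region:      (H, W) binary array, 1 where the object should appear (assumed encoding)
    Wq, Wk, Wv:  projection matrices of shape (d_img, d), (d_txt, d), (d_txt, d)
    """
    # Concatenate object text and scene text into one prompt.
    prompt = np.concatenate([obj_emb, scene_emb], axis=0)

    # Key and value use separate linear projections of the prompt.
    K = prompt @ Wk
    V = prompt @ Wv
    # Query uses a linear projection (of image features, by assumption).
    Q = image_feats @ Wq

    n_pix = Q.shape[0]
    # Broadcast the location mask across every object token.
    loc_mask = np.broadcast_to(region.reshape(n_pix, 1), (n_pix, obj_emb.shape[0]))
    # All-one matrix for scene tokens: the scene may be attended to everywhere.
    all_one = np.ones((n_pix, scene_emb.shape[0]))
    # Broadcasted location matrix = concatenation of the two.
    loc_matrix = np.concatenate([loc_mask, all_one], axis=1)

    # Assumption: the location matrix masks the logits before the softmax.
    logits = Q @ K.T / np.sqrt(K.shape[1])
    logits = np.where(loc_matrix > 0, logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    attn = weights / weights.sum(axis=1, keepdims=True)

    # Return the cross-attention map and the attended values.
    return attn, attn @ V
```

Under this assumption, pixels outside the region assign zero attention to the object tokens and attend only to the scene tokens, while pixels inside the region can attend to both. Because the scene columns are all ones, every row keeps at least one unmasked entry, so the softmax stays well defined as long as the scene description has at least one token.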
