The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent: Aug. 12, 2025
Filed: Jun. 09, 2023
Applicants:
Robert Bosch GmbH, Stuttgart, DE;
Carnegie Mellon University, Pittsburgh, PA (US);
Inventors:
Yutong He, Pittsburgh, PA (US);
Ruslan Salakhutdinov, Pittsburgh, PA (US);
Jeremy Kolter, Pittsburgh, PA (US);
Marcus Pereira, Pittsburgh, PA (US);
João D. Semedo, Pittsburgh, PA (US);
Bahare Azari, San Jose, CA (US);
Filipe J. Cabrita Condessa, Pittsburgh, PA (US);
Assignees:
Robert Bosch GmbH, DE;
Carnegie Mellon University, Pittsburgh, PA (US);
Abstract
A method includes receiving, at a cross-attention layer of a model, first text data describing a first object and second text data describing a first scene, wherein the first text data includes a description of a location of the first object; concatenating, utilizing the model with cross-attention layers, the first text data and the second text data to generate a prompt; generating a broadcasted location mask constructed from at least the location; generating a broadcasted all-one matrix associated with the second text data describing the first scene; computing a key matrix and a value matrix utilizing separate linear projections of the prompt; computing a query matrix utilizing linear projections; generating a broadcasted location matrix in response to concatenating the broadcasted location mask and the broadcasted all-one matrix; generating a cross-attention map utilizing the query matrix, the key matrix, and the broadcasted location matrix; and outputting a final image.
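
The steps in the abstract can be illustrated with a minimal NumPy sketch of a single cross-attention layer. This is not the patented implementation. The abstract does not say how the broadcasted location matrix enters the cross-attention map, so the element-wise multiplication after the softmax is an assumption, as is the choice to compute the query from image features. The function name `location_cross_attention`, all parameter names, and the tensor shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def location_cross_attention(image_feats, object_emb, scene_emb,
                             location_mask, w_q, w_k, w_v):
    """Sketch of one location-aware cross-attention layer (hypothetical).

    image_feats:   (n_pixels, d_img) flattened image features
    object_emb:    (n_obj, d_txt) token embeddings of the object text
    scene_emb:     (n_scene, d_txt) token embeddings of the scene text
    location_mask: (h, w) binary mask of the object location, h * w == n_pixels
    w_q, w_k, w_v: projection matrices of shape (d_img, d), (d_txt, d), (d_txt, d)
    """
    n_pixels = image_feats.shape[0]
    n_obj, n_scene = object_emb.shape[0], scene_emb.shape[0]

    # Concatenate the object text and the scene text into one prompt.
    prompt = np.concatenate([object_emb, scene_emb], axis=0)

    # Key and value come from separate linear projections of the prompt.
    k = prompt @ w_k
    v = prompt @ w_v
    # Assumption: the query is a linear projection of the image features.
    q = image_feats @ w_q

    # Broadcast the spatial mask across every object token.
    mask_col = location_mask.reshape(n_pixels, 1).astype(q.dtype)
    object_block = np.broadcast_to(mask_col, (n_pixels, n_obj))
    # Scene tokens are unconstrained, so they get an all-one block.
    scene_block = np.ones((n_pixels, n_scene), dtype=q.dtype)
    # Concatenating both blocks gives the broadcasted location matrix.
    location_matrix = np.concatenate([object_block, scene_block], axis=1)

    # Assumption: the location matrix gates the attention map element-wise.
    scores = q @ k.T / np.sqrt(k.shape[1])
    attn_map = softmax(scores, axis=-1) * location_matrix

    return attn_map @ v, attn_map, location_matrix
```

Under these assumptions, object tokens receive zero attention weight at every pixel outside the mask, while scene tokens can attend anywhere in the image. A real text-to-image model would apply such a layer many times during generation before decoding the final image.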