The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 19, 2024

Filed:

Jan. 14, 2022

Applicant:

Adobe Inc., San Jose, CA (US);

Inventors:

Ruiyi Zhang, San Jose, CA (US);

Yufan Zhou, Buffalo, NY (US);

Christopher Tensmeyer, Columbia, MD (US);

Jiuxiang Gu, College Park, MD (US);

Tong Yu, San Jose, CA (US);

Tong Sun, San Jose, CA (US);

Assignee:

Adobe Inc., San Jose, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06T 5/00 (2024.01); G06N 3/04 (2023.01); G06T 3/10 (2024.01); G06T 5/60 (2024.01); G06T 11/00 (2006.01); G06T 11/80 (2006.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01);
U.S. Cl.
CPC ...
G06T 3/10 (2024.01); G06N 3/04 (2013.01); G06T 11/00 (2013.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 2015/223 (2013.01);
Abstract

The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.
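
The multi-round workflow described in the abstract can be illustrated with a toy sketch. This is not the patented network: the hashed bag-of-words encoder, the blending rule, the `keep` parameter, and every function name below are hypothetical stand-ins chosen only to show the control flow, in which each round of natural language feedback is mapped into a shared vector space, combined with the previous style vector so that earlier features persist, and then used to condition a generator.

```python
import hashlib
import math

DIM = 8  # hypothetical size of the toy joint embedding space


def embed_text(text, dim=DIM):
    """Toy text encoder (assumption): hash words into buckets, then L2-normalize."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def update_style(prev_style, feedback, keep=0.5):
    """Blend the previous style vector with the new feedback embedding.

    Retaining a share of the previous vector is a simple stand-in for
    persisting earlier features across refinement rounds.
    """
    text_vec = embed_text(feedback, dim=len(prev_style))
    return [keep * p + (1.0 - keep) * t for p, t in zip(prev_style, text_vec)]


def generate_image(style):
    """Placeholder generator: records the style vector it was conditioned on."""
    return {"conditioned_on": list(style)}


def run_session(feedback_rounds, dim=DIM):
    """Run a multi-round session and return one placeholder image per round."""
    style = [0.0] * dim
    images = []
    for feedback in feedback_rounds:
        style = update_style(style, feedback)
        images.append(generate_image(style))
    return images


images = run_session(["a red bird on a branch", "make the bird blue"])
```

In this sketch the second round's style vector still carries a share of the first round's text embedding, which is the simplest way to mimic features persisting across generated images; a real system would replace each placeholder with trained components.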

