The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Nov. 11, 2025

Filed:

Apr. 29, 2025

Applicant:

GDM Holding LLC, Mountain View, CA (US);

Inventors:

Mostafa Dehghani, Amsterdam, NL;

Phillip Lippe, Amsterdam, NL;

Emiel Hoogeboom, Amsterdam, NL;

Jonathan Heek, Hilversum, NL;

Assignee:

GDM Holding LLC, Mountain View, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 20/00 (2019.01); G06F 40/284 (2020.01); G06F 40/40 (2020.01); G06T 11/00 (2006.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 30/19 (2022.01); G10L 25/18 (2013.01); G10L 25/30 (2013.01);
U.S. Cl.
CPC ...
G06T 11/00 (2013.01); G06F 40/284 (2020.01); G06F 40/40 (2020.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 30/19147 (2022.01); G10L 25/18 (2013.01); G10L 25/30 (2013.01);
Abstract

A computer-implemented method of generating multimodal data. The method comprises using a token generation neural network to generate, autoregressively, an output sequence of multimodal tokens, and in response to a next multimodal token being a start-of-image token, generating an image using an image generation subsystem conditioned on features representing the current sequence of multimodal tokens obtained from the token generation neural network. The method further comprises processing the image to convert pixels of the image into a sequence of image tokens, each image token comprising a block encoding of values of the pixels in a different region of the image that maps a set of values of the pixels to a respective image token, and appending the sequence of image tokens to the current output sequence of multimodal tokens as the next multimodal tokens in the output sequence of multimodal tokens.
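The control flow described in the abstract can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the token names `<soi>` and `<eos>`, the functions `toy_next_token` and `toy_make_image`, and the choice to use each block's raw pixel values as its token are all hypothetical stand-ins. Keeping the start-of-image token in the output sequence is likewise an assumption; the abstract does not say either way.

```python
from typing import Callable, List, Tuple

START_OF_IMAGE = "<soi>"    # hypothetical token name, not from the patent
END_OF_SEQUENCE = "<eos>"   # hypothetical stop token, not from the patent

Image = List[List[int]]     # grayscale image as rows of pixel values


def block_tokens(image: Image, block: int) -> List[Tuple[int, ...]]:
    """Turn pixels into image tokens, one per non-overlapping block x block region."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    tokens = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Assumption: the block's pixel values themselves act as the token.
            tokens.append(tuple(
                image[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            ))
    return tokens


def generate(next_token: Callable, make_image: Callable,
             block: int = 2, max_steps: int = 16) -> list:
    """Autoregressive loop: on a start-of-image token, generate an image,
    convert it to block tokens, and append them to the output sequence."""
    sequence: list = []
    for _ in range(max_steps):
        token, features = next_token(sequence)
        if token == END_OF_SEQUENCE:
            break
        sequence.append(token)
        if token == START_OF_IMAGE:
            image = make_image(features)  # conditioned on sequence features
            sequence.extend(block_tokens(image, block))
    return sequence


def toy_next_token(sequence: list):
    """Toy stand-in for the token generation network: a fixed script of tokens."""
    script = ["a", "cat", START_OF_IMAGE]
    text_so_far = [t for t in sequence if isinstance(t, str)]
    done = len(text_so_far) >= len(script)
    token = END_OF_SEQUENCE if done else script[len(text_so_far)]
    features = len(sequence)  # placeholder "features": the current length
    return token, features


def toy_make_image(features: int) -> Image:
    """Toy stand-in for the image generation subsystem: a 4x4 image."""
    return [[features + r * 4 + c for c in range(4)] for r in range(4)]


if __name__ == "__main__":
    print(generate(toy_next_token, toy_make_image))
```

In this toy run the output sequence holds the two text tokens, the start-of-image token, and then four image tokens (one per 2x2 block of the 4x4 image). Generation then resumes and stops at the hypothetical end-of-sequence token.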

