The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:

Jan. 20, 2026

Filed:

Jun. 22, 2022

Applicant:

Amazon Technologies, Inc., Seattle, WA (US);

Inventors:

Zijian Wang, San Jose, CA (US);

Yuchen Tian, Santa Clara, CA (US);

Mingyue Shang, Jersey City, NJ (US);

Praphruetpong Athiwaratkun, Jersey City, NJ (US);

Ming Tan, Jersey City, NJ (US);

Parminder Bhatia, Kearny, NJ (US);

Andrew Oliver Arnold, New York, NY (US);

Ramesh M Nallapati, Fremont, CA (US);

Sudipta Sengupta, Sammamish, WA (US);

Bing Xiang, Mount Kisco, NY (US);

Atul Deo, Kirkland, WA (US);

Ankur Deepak Desai, Redmond, WA (US);

Assignee:

Amazon Technologies, Inc., Seattle, WA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 9/44 (2018.01); G06F 8/30 (2018.01); G06F 8/41 (2018.01); G06F 9/445 (2018.01); G06F 9/45 (2006.01); G06F 9/455 (2018.01); G06F 40/284 (2020.01); G06N 20/00 (2019.01);
U.S. Cl.
CPC ...
G06F 40/284 (2020.01); G06F 8/30 (2013.01); G06F 8/427 (2013.01); G06N 20/00 (2019.01);
Abstract

Random token segmentation may be implemented for next token prediction. Text data may be received for training a machine learning model to predict a next token given input text tokens. Multiple tokens may be determined from the text data. Different ones of the multiple tokens may be randomly segmented into sub-tokens. The machine learning model may then be trained using the multiple tokens, including the respective sub-tokens, as a training data set.
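
The idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the whitespace tokenizer, the single random cut point per token, and the names `randomly_segment`, `next_token_pairs`, and `split_prob` are all assumptions made for illustration only.

```python
import random


def randomly_segment(tokens, split_prob=0.3, rng=None):
    """Randomly split some tokens into two sub-tokens.

    Hypothetical helper: the abstract does not say how tokens are chosen
    or where they are cut, so a per-token coin flip and a uniformly
    random cut point are assumed here.
    """
    rng = rng or random.Random()
    segmented = []
    for token in tokens:
        # Only tokens with at least two characters can be split.
        if len(token) > 1 and rng.random() < split_prob:
            cut = rng.randint(1, len(token) - 1)  # both pieces are non-empty
            segmented.extend([token[:cut], token[cut:]])
        else:
            segmented.append(token)
    return segmented


def next_token_pairs(tokens):
    """Build (context, next_token) training examples from a token sequence."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


if __name__ == "__main__":
    # Whitespace splitting stands in for a real tokenizer (an assumption).
    tokens = "def compute_total ( items ) :".split()
    segmented = randomly_segment(tokens, split_prob=0.5, rng=random.Random(0))
    for context, target in next_token_pairs(segmented):
        print(context, "->", target)
```

In this sketch a model would see a split identifier as two prediction targets, which is one plausible way a training set could expose the model to input that ends partway through a token.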

