The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The badge also links to the full patent document (PDF).

Patent No.:

US 11876986 B1

Date of Patent:

Jan. 16, 2024

Filed:

Nov. 29, 2022

Hierarchical video encoders

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Vihan Jain, San Francisco, CA (US);

Joonseok Lee, Fremont, CA (US);

Ming Zhao, Sunnyvale, CA (US);

Sheide Chammas, San Francisco, CA (US);

Hexiang Hu, Los Angeles, CA (US);

Bowen Zhang, Los Angeles, CA (US);

Fei Sha, Los Angeles, CA (US);

Tze Way Eugene Ie, Los Altos, CA (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:

Dority & Manning, P.A.

Primary Examiner:

Samuel D Fereja

Int. Cl.

CPC ...

H04N 19/30 (2014.01); H04N 19/00 (2014.01); H04N 19/172 (2014.01); G06N 20/00 (2019.01);

U.S. Cl.

CPC ...

H04N 19/30 (2014.11); G06N 20/00 (2019.01); H04N 19/172 (2014.11);

Abstract

A computer-implemented method for generating video representations utilizing a hierarchical video encoder includes: obtaining a video, wherein the video includes a plurality of frames; processing each of the plurality of frames with a machine-learned frame-level encoder model to respectively generate a plurality of frame representations for the plurality of frames, the plurality of frame representations respective to the plurality of frames; determining a plurality of segment representations representative of a plurality of video segments including one or more of the plurality of frames, the plurality of segment representations based at least in part on the plurality of frame representations; processing the plurality of segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the plurality of contextualized segment representations; and providing the video representation as an output.
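The data flow in the abstract (frames, then segments, then one video representation) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `encode_video`, `toy_frame_encoder`, `toy_segment_encoder`, and `mean_pool` are hypothetical, the two "encoders" are hand-written stand-ins for the machine-learned models, and the fixed-size segmentation and mean pooling are assumptions chosen only to keep the example self-contained.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def toy_frame_encoder(frame: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the machine-learned frame-level encoder.

    Summarizes a frame (here a flat list of pixel values) as three statistics.
    """
    return [sum(frame) / len(frame), max(frame), min(frame)]


def toy_segment_encoder(segments: Sequence[Vector]) -> List[Vector]:
    """Hypothetical stand-in for the machine-learned segment-level encoder.

    Applies parameter-free dot-product self-attention, so every output
    vector mixes information from all segments (the "contextualized" part).
    """
    scale = math.sqrt(len(segments[0]))
    contextualized = []
    for query in segments:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in segments
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        contextualized.append([
            sum((w / total) * seg[i] for w, seg in zip(weights, segments))
            for i in range(len(query))
        ])
    return contextualized


def encode_video(
    frames: Sequence[Sequence[float]],
    frames_per_segment: int = 2,  # assumption: fixed-size, non-overlapping segments
    frame_encoder: Callable[[Sequence[float]], Vector] = toy_frame_encoder,
    segment_encoder: Callable[[Sequence[Vector]], List[Vector]] = toy_segment_encoder,
) -> Vector:
    if not frames:
        raise ValueError("video must contain at least one frame")
    # Step 1: one representation per frame.
    frame_reps = [frame_encoder(frame) for frame in frames]
    # Step 2: one representation per segment, pooled from its frames (assumption: mean).
    segment_reps = [
        mean_pool(frame_reps[i:i + frames_per_segment])
        for i in range(0, len(frame_reps), frames_per_segment)
    ]
    # Step 3: contextualize each segment against the other segments.
    contextualized = segment_encoder(segment_reps)
    # Step 4: a single video representation (assumption: mean over segments).
    return mean_pool(contextualized)


# Four tiny "frames" of two pixel values each, grouped into two segments.
video = [[0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
video_representation = encode_video(video)
```

Because both encoders are passed in as callables, trained models could be substituted for the stand-ins without changing the surrounding pipeline.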
