The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 14, 2023

Filed:
Feb. 07, 2022

Applicant:
Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US);

Inventors:

Anoop Cherian, Cambridge, MA (US);

Chiori Hori, Lexington, MA (US);

Jonathan Le Roux, Arlington, MA (US);

Tim Marks, Middleton, MA (US);

Alan Sullivan, Middleton, MA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
H04N 19/597 (2014.01); H04N 19/179 (2014.01); H04N 19/107 (2014.01); G06F 16/532 (2019.01); G06T 7/11 (2017.01); G06V 30/194 (2022.01); H04N 21/25 (2011.01);
U.S. Cl.
CPC ...
H04N 19/597 (2014.11); G06F 16/532 (2019.01); G06T 7/11 (2017.01); G06V 30/194 (2022.01); H04N 19/107 (2014.11); H04N 19/179 (2014.11); H04N 21/251 (2013.01);
Abstract

Embodiments of the present disclosure disclose a scene-aware video encoder system. The scene-aware video encoder system transforms a sequence of video frames of a video of a scene into a spatio-temporal scene graph. The spatio-temporal scene graph includes nodes representing one or multiple static and dynamic objects in the scene. Each node of the spatio-temporal scene graph describes an appearance, a location, and/or a motion of one of the objects (static or dynamic) at different time instances. The nodes of the spatio-temporal scene graph are embedded into a latent space using a spatio-temporal transformer that encodes different combinations of nodes of the spatio-temporal scene graph, corresponding to different spatio-temporal volumes of the scene. Each node encoded in each of the combinations is weighted with an attention score determined as a function of similarities of spatio-temporal locations of the different nodes in the combination.
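
The weighting step described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the abstract does not say how location similarity is computed, so the Gaussian kernel, the `bandwidth` parameter, the `Node` class, and the (x, y, t) location layout below are all assumptions made for illustration. The sketch also treats all nodes as a single combination, whereas the abstract describes several combinations covering different spatio-temporal volumes.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Node:
    """One scene-graph node: an object observed at one time instance (hypothetical layout)."""
    label: str
    location: Sequence[float]  # assumed (x, y, t) spatio-temporal position
    feature: Sequence[float]   # assumed appearance/motion descriptor


def location_similarity(a: Node, b: Node, bandwidth: float = 1.0) -> float:
    """Gaussian kernel on spatio-temporal locations (an assumed similarity function)."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a.location, b.location))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def attention_scores(nodes: List[Node], bandwidth: float = 1.0) -> List[List[float]]:
    """Row i holds node i's normalized attention weights over every node in the combination."""
    scores = []
    for a in nodes:
        row = [location_similarity(a, b, bandwidth) for b in nodes]
        total = sum(row)  # never zero: a node's similarity to itself is 1
        scores.append([s / total for s in row])
    return scores


def embed(nodes: List[Node], bandwidth: float = 1.0) -> List[List[float]]:
    """Replace each node's feature with the attention-weighted mean of all node features."""
    weights = attention_scores(nodes, bandwidth)
    dim = len(nodes[0].feature)
    return [
        [sum(w * n.feature[d] for w, n in zip(row, nodes)) for d in range(dim)]
        for row in weights
    ]


# Toy scene: a cup seen at two nearby time instances and a distant person.
nodes = [
    Node("cup", (0.0, 0.0, 0.0), (1.0, 0.0)),
    Node("cup", (0.1, 0.0, 1.0), (1.0, 0.0)),
    Node("person", (5.0, 5.0, 0.0), (0.0, 1.0)),
]
embedded = embed(nodes)
```

In this toy scene the two cup observations are close in space and time, so they attend mostly to each other, while the distant person node contributes almost nothing to their embeddings. A learned transformer would replace the fixed kernel with trained projections; the fixed kernel is used here only to keep the example self-contained.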

