The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Aug. 29, 2023

Filed:
May 24, 2021

Applicant:

Tencent Technology (Shenzhen) Company Limited, Shenzhen, CN;

Inventors:

Wenjie Pei, Shenzhen, CN;

Jiyuan Zhang, Shenzhen, CN;

Lei Ke, Shenzhen, CN;

Yuwing Tai, Shenzhen, CN;

Xiaoyong Shen, Shenzhen, CN;

Jiaya Jia, Shenzhen, CN;

Xiangrong Wang, Shenzhen, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06K 9/36 (2006.01); G06K 9/46 (2006.01); H04N 21/488 (2011.01); H04N 5/278 (2006.01); G06V 20/40 (2022.01); G06F 18/22 (2023.01); G06F 18/28 (2023.01); G06F 18/25 (2023.01); G06V 10/75 (2022.01); G06V 10/772 (2022.01); G06V 20/62 (2022.01); H04N 21/234 (2011.01); H04N 21/235 (2011.01); H04N 21/435 (2011.01); H04N 21/8549 (2011.01);
U.S. Cl.
CPC ...
H04N 21/4884 (2013.01); G06F 18/22 (2023.01); G06F 18/253 (2023.01); G06F 18/28 (2023.01); G06V 10/75 (2022.01); G06V 10/772 (2022.01); G06V 20/41 (2022.01); G06V 20/47 (2022.01); G06V 20/635 (2022.01); H04N 5/278 (2013.01); H04N 21/235 (2013.01); H04N 21/23418 (2013.01); H04N 21/435 (2013.01); H04N 21/488 (2013.01); H04N 21/8549 (2013.01);
Abstract

A video caption generating method is provided to a computer device. The method includes: encoding a target video by using an encoder of a video caption generating model, to obtain a target visual feature of the target video; decoding the target visual feature by using a basic decoder of the video caption generating model, to obtain a first selection probability corresponding to a candidate word; decoding the target visual feature by using an auxiliary decoder of the video caption generating model, to obtain a second selection probability corresponding to the candidate word, a memory structure of the auxiliary decoder including reference visual context information corresponding to the candidate word; determining a decoded word in the candidate word according to the first selection probability and the second selection probability; and generating a video caption according to the decoded word.
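
The word-selection step in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the decoded word is determined "according to" the first and second selection probabilities; the weighted sum used here, the `aux_weight` parameter, and the function names `select_word` and `generate_caption` are assumptions made for illustration and are not taken from the patent. The encoder and the two decoders are not modelled; their per-step outputs are passed in as plain dictionaries.

```python
from typing import Dict, List, Tuple

Probs = Dict[str, float]


def select_word(first_probs: Probs, second_probs: Probs,
                aux_weight: float = 0.5) -> str:
    """Return the candidate word with the highest combined score.

    Assumption: the two selection probabilities are fused by a weighted
    sum. The abstract does not specify the fusion rule.
    """
    if not first_probs:
        raise ValueError("no candidate words")
    combined = {
        # Candidates missing from the auxiliary decoder's output score 0.0 there.
        word: (1.0 - aux_weight) * p1 + aux_weight * second_probs.get(word, 0.0)
        for word, p1 in first_probs.items()
    }
    return max(combined, key=combined.get)


def generate_caption(steps: List[Tuple[Probs, Probs]],
                     aux_weight: float = 0.5) -> str:
    """Join the word selected at each decoding step into a caption."""
    return " ".join(select_word(p1, p2, aux_weight) for p1, p2 in steps)
```

Each element of `steps` pairs the basic decoder's probabilities with the auxiliary decoder's probabilities for one decoding step. Setting `aux_weight` to 0 reduces the sketch to using the basic decoder alone.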

