The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Mar. 03, 2020

Filed:

Feb. 07, 2018

Applicant:

Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong Province, CN;

Inventors:

Siqi Bao, Shenzhen, CN;

Zeyu Chen, Shenzhen, CN;

Di Jiang, Shenzhen, CN;

Jingzhou He, Shenzhen, CN;

Assignee:

Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong Province, CN;

Attorney:

Primary Examiner:

Int. Cl.

CPC ...

G06F 16/00 (2019.01); G06F 16/33 (2019.01); G06F 17/18 (2006.01); G06F 16/31 (2019.01); G06N 20/00 (2019.01); G06N 5/00 (2006.01); G06F 40/10 (2020.01); G06F 40/20 (2020.01); G06F 40/30 (2020.01); G06N 7/00 (2006.01);

U.S. Cl.

CPC ...

G06F 16/3334 (2019.01); G06F 16/313 (2019.01); G06F 17/18 (2013.01); G06F 40/10 (2020.01); G06F 40/20 (2020.01); G06F 40/30 (2020.01); G06N 5/00 (2013.01); G06N 20/00 (2019.01); G06N 7/005 (2013.01);

Abstract

A method comprises: acquiring a to-be-compressed topic model, wherein each line of the topic model represents a distribution of a word among respective topics; performing a format conversion on the topic model to obtain a first topic model, wherein each line of the first topic model represents a distribution of a topic among respective words; selecting any two topics from the first topic model to form a topic pair, forming a topic pair set using at least one topic pair, and determining a similarity between the two topics in each topic pair in the topic pair set; merging topic pairs having a similarity greater than a similarity threshold to generate a second topic model; and performing a format conversion on the second topic model to obtain a compressed topic model, so that each line of the compressed topic model represents a distribution of a word among the respective topics.
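
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions that the abstract does not specify: the model is represented as nested dictionaries of counts, the similarity measure is a weighted Jaccard over normalized word distributions, merging sums the word counts of the merged topics, and similar pairs are merged transitively with a small union-find. The names `transpose`, `weighted_jaccard`, and `compress` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def transpose(model):
    """Format conversion: {outer: {inner: count}} -> {inner: {outer: count}}."""
    out = defaultdict(dict)
    for outer, row in model.items():
        for inner, count in row.items():
            out[inner][outer] = count
    return dict(out)


def weighted_jaccard(a, b):
    """Assumed similarity measure: weighted Jaccard of two word distributions."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if not total_a or not total_b:
        return 0.0
    words = set(a) | set(b)
    num = sum(min(a.get(w, 0) / total_a, b.get(w, 0) / total_b) for w in words)
    den = sum(max(a.get(w, 0) / total_a, b.get(w, 0) / total_b) for w in words)
    return num / den if den else 0.0


def compress(word_topic, threshold):
    """Merge topics whose pairwise similarity exceeds `threshold`."""
    # Word-major -> topic-major (the "first topic model" in the abstract).
    topic_word = transpose(word_topic)

    # Union-find so that chains of similar topics end up in one group
    # (transitive merging is an assumption of this sketch).
    parent = {t: t for t in topic_word}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    # Form every topic pair and link the ones above the threshold.
    for t1, t2 in combinations(sorted(topic_word), 2):
        if weighted_jaccard(topic_word[t1], topic_word[t2]) > threshold:
            r1, r2 = find(t1), find(t2)
            if r1 != r2:
                parent[max(r1, r2)] = min(r1, r2)

    # Sum word counts within each group (the "second topic model").
    merged = defaultdict(lambda: defaultdict(int))
    for topic, row in topic_word.items():
        root = find(topic)
        for word, count in row.items():
            merged[root][word] += count

    # Topic-major -> word-major (the compressed topic model).
    return transpose(merged)


# Toy model: topics 0 and 1 share nearly the same words, topic 2 does not.
toy_model = {
    "cat": {0: 5, 1: 4},
    "dog": {0: 5, 1: 6},
    "stock": {2: 10},
}
compressed = compress(toy_model, threshold=0.5)
```

In this toy model, topics 0 and 1 have a weighted Jaccard similarity of about 0.82, so they are merged under topic 0, while topic 2 shares no words with either and is left unchanged.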

