The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jul. 05, 2022

Filed:
Jun. 14, 2019

Applicant:
Tencent Technology (Shenzhen) Company Limited, Shenzhen, CN;

Inventors:
Wei Xu, Shenzhen, CN;
Li Zhong, Shenzhen, CN;
Li Wang, Shenzhen, CN;
Lichun Liu, Shenzhen, CN;

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 16/174 (2019.01); G06F 16/31 (2019.01); G06F 40/30 (2020.01); G06F 40/211 (2020.01); G06F 40/289 (2020.01);
U.S. Cl.
CPC ...
G06F 16/1748 (2019.01); G06F 16/319 (2019.01); G06F 40/211 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01);
Abstract

A text deduplication method and apparatus, and a storage medium are provided. The method includes: obtaining a text set, the text set including a plurality of to-be-deduplicated texts; capturing, for each to-be-deduplicated text, a corresponding subtext string from the to-be-deduplicated text; and determining, in the text set, to-be-deduplicated texts having a same subtext string, to obtain text subsets. Each subtext string corresponds to a text subset, and each text subset includes one or more to-be-deduplicated texts that have the corresponding subtext string. The method also includes performing text deduplication processing on the text subset corresponding to each subtext string, to obtain a deduplicated text set corresponding to each subtext string; and obtaining, according to the deduplicated text set corresponding to each subtext string, a result text set of the text set after the deduplication.
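
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the subtext string is captured or how duplicates are detected inside each subset, so both rules below are assumptions made only for illustration: `capture_subtext` (a hypothetical helper) takes the first few characters of the normalized text, and `deduplicate_subset` (also hypothetical) treats texts as duplicates when they match after lowercasing and whitespace normalization. The patent's actual capture and comparison rules may differ.

```python
from collections import defaultdict


def normalize(text):
    # Assumption: lowercase and collapse whitespace before comparing texts.
    return " ".join(text.lower().split())


def capture_subtext(text, length=4):
    # Hypothetical capture rule: the first `length` characters of the
    # normalized text. The abstract does not specify the real rule.
    return normalize(text)[:length]


def deduplicate_subset(texts):
    # Hypothetical per-subset deduplication: keep the first occurrence of
    # each normalized text. A similarity measure could be used instead.
    seen = set()
    kept = []
    for text in texts:
        key = normalize(text)
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


def deduplicate(text_set, length=4):
    # Group texts that share the same subtext string into subsets.
    subsets = defaultdict(list)
    for text in text_set:
        subsets[capture_subtext(text, length)].append(text)

    # Deduplicate each subset independently, then merge the results.
    result = []
    for subset in subsets.values():
        result.extend(deduplicate_subset(subset))
    return result


texts = ["Hello world", "hello   world", "Help me", "Goodbye"]
result = deduplicate(texts)
print(result)
```

Grouping by subtext string first means each text is compared only with texts in its own subset rather than with the whole text set, which is presumably the point of splitting the set before deduplication.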

