The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
May 30, 2023

Filed:
Dec. 24, 2020

Applicant:

Beijing Baidu Netcom Science and Technology CO Ltd, Beijing, CN;

Inventors:

Zhe Hu, Beijing, CN;

Cheng Peng, Beijing, CN;

Xuefeng Luo, Beijing, CN;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/35 (2019.01); G06F 16/242 (2019.01); G06F 16/22 (2019.01); G06F 16/2455 (2019.01); G06V 30/414 (2022.01); G06F 18/214 (2023.01);
U.S. Cl.
CPC ...
G06F 16/35 (2019.01); G06F 16/2237 (2019.01); G06F 16/243 (2019.01); G06F 16/24556 (2019.01); G06F 18/2148 (2023.01); G06V 30/414 (2022.01);
Abstract

The present disclosure discloses a method and apparatus for processing a dataset. The method includes: obtaining a first text set meeting a preset similarity matching condition with a target text from multiple text blocks provided by a target user; obtaining a second text set from the first text set, in which each text in the second text set does not belong to a same text block as the target text; generating a negative sample set of the target text based on content of a candidate text block to which each text in the second text set belongs; generating a positive sample set of the target text based on content of a target text block to which the target text belongs; and generating a dataset of the target user based on the negative sample set and the positive sample set, and training a matching model based on the dataset.
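The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say what the "preset similarity matching condition" is, so word-level Jaccard similarity with a hypothetical `threshold` stands in for it, and the way samples are paired (target text with each text of a block) is likewise an assumption. The names `jaccard`, `build_dataset`, and `Sample` are invented for the example, and the final step of training the matching model is left out.

```python
from typing import List, Tuple

# (target text, other text, label); label 1 = positive, 0 = negative.
Sample = Tuple[str, str, int]


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity (assumed stand-in for the matching condition)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def build_dataset(blocks: List[List[str]], target_block: int, target_text: str,
                  threshold: float = 0.3) -> List[Sample]:
    """Build labeled pairs for one target text from a user's text blocks."""
    # First text set: (block index, text) pairs similar enough to the target text.
    first_set = [(i, t) for i, block in enumerate(blocks) for t in block
                 if t != target_text and jaccard(t, target_text) >= threshold]
    # Second text set: drop texts that share a block with the target text.
    second_set = [(i, t) for i, t in first_set if i != target_block]
    # Negative samples: pair the target with every text of each candidate block
    # (assumed pairing; the abstract only says "based on content" of the block).
    candidate_blocks = sorted({i for i, _ in second_set})
    negatives = [(target_text, t, 0) for i in candidate_blocks for t in blocks[i]]
    # Positive samples: pair the target with the other texts of its own block.
    positives = [(target_text, t, 1) for t in blocks[target_block] if t != target_text]
    return negatives + positives
```

The point of the second filtering step is that texts which look similar to the target but live in a different block make hard negatives, while texts in the target's own block are treated as sharing its meaning.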

