The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:

Oct. 03, 2023

Filed:

Jan. 05, 2022

Applicant:

Splunk Inc., San Francisco, CA (US);

Inventors:

R. David Carasso, San Rafael, CA (US);

Micah James Delfino, San Francisco, CA (US);

Assignee:

SPLUNK INC., San Francisco, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/25 (2019.01); G06F 16/35 (2019.01); G06F 16/28 (2019.01); G06F 16/904 (2019.01); G06F 7/24 (2006.01); G06F 3/0482 (2013.01); G06F 3/04842 (2022.01); G06F 3/0488 (2022.01);
U.S. Cl.
CPC ...
G06F 16/254 (2019.01); G06F 3/0482 (2013.01); G06F 3/04842 (2013.01); G06F 7/24 (2013.01); G06F 16/287 (2019.01); G06F 16/35 (2019.01); G06F 16/904 (2019.01); G06F 3/0488 (2013.01);
Abstract

Embodiments are directed towards generating a representative sampling as a subset from a larger dataset that includes unstructured data. A graphical user interface enables a user to provide various data selection parameters, including specifying a data source and one or more subset types desired, including one or more of latest records, earliest records, diverse records, outlier records, and/or random records. Diverse and/or outlier subset types may be obtained by generating clusters from an initial selection of records obtained from the larger dataset. An iteration analysis is performed to determine whether a sufficient number of clusters and/or cluster types have been generated that exceed at least one threshold and when not exceeded, additional clustering is performed on additional records. From the resultant clusters, and/or other subtype results, a subset of records is obtained as the representative sampling subset.
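
The iterative clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the token-based Jaccard similarity, the greedy single-pass clustering, the treatment of singleton clusters as outliers, and every name and parameter (`tokens`, `add_to_clusters`, `representative_sample`, `batch_size`, `min_clusters`, `outlier_max_size`) are assumptions chosen for illustration. Only the diverse and outlier subset types are covered.

```python
import random
import re


def tokens(record):
    """Lowercased whitespace tokens with digit runs masked as '#'."""
    return frozenset(re.sub(r"\d+", "#", record.lower()).split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def add_to_clusters(clusters, records, threshold=0.5):
    """Greedy single-pass clustering (an assumed technique): a record joins
    the first cluster whose representative is similar enough, otherwise it
    starts a new cluster."""
    for record in records:
        toks = tokens(record)
        for cluster in clusters:
            if jaccard(toks, cluster["rep"]) >= threshold:
                cluster["members"].append(record)
                break
        else:
            clusters.append({"rep": toks, "members": [record]})
    return clusters


def representative_sample(dataset, batch_size=100, min_clusters=5,
                          outlier_max_size=1, seed=0):
    """Cluster batches of records until the cluster count exceeds
    min_clusters or the data runs out, then pick records per cluster."""
    rng = random.Random(seed)
    order = list(dataset)
    rng.shuffle(order)  # initial selection is a random draw (assumption)

    clusters = []
    pos = 0
    while pos < len(order):
        add_to_clusters(clusters, order[pos:pos + batch_size])
        pos += batch_size
        if len(clusters) > min_clusters:  # threshold exceeded: stop iterating
            break

    # Diverse subset: one record per cluster.
    diverse = [c["members"][0] for c in clusters]
    # Outlier subset: records from very small clusters (assumed definition).
    outliers = [c["members"][0] for c in clusters
                if len(c["members"]) <= outlier_max_size]
    return {"diverse": diverse, "outliers": outliers, "clusters": len(clusters)}
```

Calling `representative_sample(lines, batch_size=10, min_clusters=3)` on a list of raw log lines returns a dictionary with one record per discovered cluster under `"diverse"` and the records from singleton clusters under `"outliers"`. If the loop stops early, a cluster that would grow with more data may still look like an outlier, which is one reason a real implementation would likely use more than one threshold.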

